NAG Library Function Document

nag_regsn_mult_linear_addrem_obs (g02dcc)


1
Purpose

nag_regsn_mult_linear_addrem_obs (g02dcc) adds or deletes an observation from a general regression model fitted by nag_regsn_mult_linear (g02dac).

2
Specification

#include <nag.h>
#include <nagg02.h>
void  nag_regsn_mult_linear_addrem_obs (Nag_UpdateObserv update, Nag_IncludeMean mean, Integer m, const Integer sx[], double q[], Integer tdq, Integer ip, const double x[], Integer nr, Integer tdx, Integer ix, double y, const double wt[], double *rss, NagError *fail)

3
Description

nag_regsn_mult_linear (g02dac) fits a general linear regression model to a dataset. You may wish to change the model by adding an observation to the dataset or deleting one from it. nag_regsn_mult_linear_addrem_obs (g02dcc) takes the results from nag_regsn_mult_linear (g02dac) and makes the required changes to the vector c and the upper triangular matrix R produced by nag_regsn_mult_linear (g02dac). The regression coefficients, standard errors and the variance-covariance matrix of the regression coefficients can be obtained from nag_regsn_mult_linear_upd_model (g02ddc) after all required changes to the dataset have been made.
nag_regsn_mult_linear (g02dac) performs a QR decomposition on the (weighted) X matrix of independent variables. To add a new observation to a model with p parameters, the upper triangular matrix R and the vector c_1, the first p elements of c, are augmented by the new observation on the independent variables in x^T and the dependent variable y. Givens rotations are then used to restore the upper triangular form.
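\[ \begin{pmatrix} R & c_1 \\ x^T & y \end{pmatrix} \rightarrow \begin{pmatrix} R^* & c_1^* \\ 0 & y^* \end{pmatrix} \]
To delete an observation, Givens rotations are applied to give:
\[ \begin{pmatrix} R & c_1 \end{pmatrix} \rightarrow \begin{pmatrix} R^* & c_1^* \\ x^T & y \end{pmatrix} \]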
Note: only R and the upper part of c are updated; the remainder of the Q matrix is unchanged.

4
References

Golub G H and Van Loan C F (1996) Matrix Computations (3rd Edition) Johns Hopkins University Press, Baltimore
Hammarling S (1985) The singular value decomposition in multivariate statistics SIGNUM Newsl. 20(3) 2–25

5
Arguments

1:     update Nag_UpdateObserv Input
On entry: indicates if an observation is to be added or deleted.
update=Nag_ObservAdd
The observation is added.
update=Nag_ObservDel
The observation is deleted.
Constraint: update=Nag_ObservAdd or Nag_ObservDel.
2:     mean Nag_IncludeMean Input
On entry: indicates if a mean has been used in the model.
mean=Nag_MeanInclude
A mean term or intercept will have been included in the model by nag_regsn_mult_linear (g02dac).
mean=Nag_MeanZero
A model with no mean term or intercept will have been fitted by nag_regsn_mult_linear (g02dac).
Constraint: mean=Nag_MeanInclude or Nag_MeanZero.
3:     m Integer Input
On entry: the total number of independent variables in the dataset.
Constraint: m ≥ 1.
4:     sx[m] const Integer Input
On entry: if sx[j] is greater than 0, then the value contained in x[(ix-1)×tdx + j] is to be included as a value of x^T, an observation on an independent variable, for j = 0,1,…,m-1.
Constraint: if mean=Nag_MeanInclude, then exactly ip-1 elements of sx must be > 0, and if mean=Nag_MeanZero, then exactly ip elements of sx must be > 0.
5:     q[ip×tdq] double Input/Output
Note: the (i,j)th element of the matrix Q is stored in q[(i-1)×tdq + j-1].
On entry: q must be array q as output by nag_regsn_mult_linear (g02dac), nag_regsn_mult_linear_add_var (g02dec), nag_regsn_mult_linear_delete_var (g02dfc), or a previous call to nag_regsn_mult_linear_addrem_obs (g02dcc).
On exit: the first ip elements of the first column of q will contain c_1^*, the upper triangular part of columns 2 to ip+1 will contain R^*, and the remainder is unchanged.
6:     tdq Integer Input
On entry: the stride separating matrix column elements in the array q.
Constraint: tdq ≥ ip + 1.
7:     ip Integer Input
On entry: the number of linear terms in the general linear regression model (including the mean if there is one).
Constraint: ip ≥ 1.
8:     x[nr×tdx] const double Input
On entry: the ip values for the independent variables of the observation to be added or deleted, x^T. The positions of the values extracted from x depend on ix and tdx.
9:     nr Integer Input
On entry: the number of rows of the notional two-dimensional array x.
Constraint: nr ≥ 1.
10:   tdx Integer Input
On entry: the stride separating matrix column elements in the array x.
Constraint: tdx ≥ m.
11:   ix Integer Input
On entry: the row of the notional two-dimensional array x that contains the values for the independent variables of the observation to be added or deleted.
Constraint: 1 ≤ ix ≤ nr.
12:   y double Input
On entry: the value of the dependent variable for the observation to be added or deleted, y .
13:   wt[1] const double Input
On entry: if the new observation is to be weighted, then wt must contain the weight to be used with the new observation. If wt[0]=0.0, then the observation is not included in the model. If the new observation is to be unweighted, then wt must be supplied as NULL.
Constraint: if the new observation is to be weighted, wt[0] ≥ 0.0.
14:   rss double * Input/Output
On entry: the value of the residual sum of squares for the original set of observations.
Constraint: rss ≥ 0.0.
On exit: the updated value of the residual sum of squares.
Note: this will only be valid if the model is of full rank.
15:   fail NagError * Input/Output
The NAG error argument (see Section 3.7 in How to Use the NAG Library and its Documentation).
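The indexing conventions for x, sx and q given above can be illustrated with two small helpers. These are hypothetical functions written for this sketch, not NAG routines; plain int and size_t stand in for the NAG Integer type, and the assertions simply mirror the documented constraints on m, nr, tdx and ix.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: copy the selected independent-variable values of
 * row ix (1-based) of x into xt, following the sx[j] > 0 rule.
 * Returns the number of values copied. */
static size_t gather_row(size_t m, const int sx[], const double x[],
                         size_t nr, size_t tdx, size_t ix, double xt[])
{
    assert(m >= 1 && nr >= 1 && tdx >= m);
    assert(ix >= 1 && ix <= nr);
    size_t count = 0;
    for (size_t j = 0; j < m; ++j)
        if (sx[j] > 0)
            xt[count++] = x[(ix - 1) * tdx + j];
    return count;
}

/* Hypothetical helper: read the (i,j)th element (1-based) of the matrix
 * stored in q with stride tdq. */
static double q_elem(const double q[], size_t tdq, size_t i, size_t j)
{
    assert(i >= 1 && j >= 1 && j <= tdq);
    return q[(i - 1) * tdq + (j - 1)];
}
```

For a model fitted without a mean term, the count returned by gather_row would be expected to equal ip; with a mean term it would be ip-1.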

6
Error Indicators and Warnings

NE_2_INT_ARG_GT
On entry, ix=value while nr=value. These arguments must satisfy ix ≤ nr.
NE_2_INT_ARG_LT
On entry, tdq=value while ip + 1 = value. These arguments must satisfy tdq ≥ ip + 1.
On entry, tdx=value while m=value. These arguments must satisfy tdx ≥ m.
NE_ALLOC_FAIL
Dynamic memory allocation failed.
NE_BAD_PARAM
On entry, mean had an illegal value.
On entry, update had an illegal value.
NE_INT_ARG_LT
On entry, ip=value.
Constraint: ip ≥ 1.
On entry, ix=value.
Constraint: ix ≥ 1.
On entry, m=value.
Constraint: m ≥ 1.
On entry, nr=value.
Constraint: nr ≥ 1.
NE_IP_INCOMP_WITH_SX
On entry, for mean=Nag_MeanInclude, number of nonzero values of sx must be equal to ip-1: number of nonzero values of sx=value, ip - 1 = value.
On entry, for mean=Nag_MeanZero, number of nonzero values of sx must be equal to ip: number of nonzero values of sx=value, ip=value.
NE_MAT_NOT_UPD
The R matrix could not be updated: either an attempt was made to delete a nonexistent observation, or to add an observation to an R matrix with a zero diagonal element.
NE_REAL_ARG_LT
On entry, rss=value.
Constraint: rss ≥ 0.0.
On entry, wt[0]=value.
Constraint: wt[0] ≥ 0.0.
NE_RSS_NOT_UPD
The rss could not be updated because the input rss was less than the calculated decrease in rss when the new observation was deleted.

7
Accuracy

Higher accuracy is achieved by updating the R matrix than by the traditional method of updating X'X.

8
Parallelism and Performance

nag_regsn_mult_linear_addrem_obs (g02dcc) is not threaded in any implementation.

9
Further Comments

Care should be taken with the use of this function.
(a) It is possible to delete observations which were not included in the original model.
(b) If several additions/deletions have been performed you are advised to recompute the regression using nag_regsn_mult_linear (g02dac).
(c) Adding or deleting observations can alter the rank of the model. Such changes will only be detected when a call to nag_regsn_mult_linear_upd_model (g02ddc) has been made. nag_regsn_mult_linear_upd_model (g02ddc) should also be used to compute the new residual sum of squares when the model is not of full rank.
nag_regsn_mult_linear_addrem_obs (g02dcc) may also be used after nag_regsn_mult_linear_add_var (g02dec) and nag_regsn_mult_linear_delete_var (g02dfc).

10
Example

A dataset consisting of 12 observations with four independent variables is read in and a general linear regression model fitted by nag_regsn_mult_linear (g02dac) and parameter estimates printed. The last observation is then dropped and the parameter estimates recalculated, using nag_regsn_mult_linear_upd_model (g02ddc), and printed.

10.1
Program Text

Program Text (g02dcce.c)

10.2
Program Data

Program Data (g02dcce.d)

10.3
Program Results

Program Results (g02dcce.r)

© The Numerical Algorithms Group Ltd, Oxford, UK. 2017