Integer isigma, Integer n, const double x[], double beta, double *theta, double *sigma, Integer maxit, double tol, double rs[], Integer *nit, Nag_Comm *comm, NagError *fail)

3 Description

The data consists of a sample of size $n$, denoted by $x_{1}, x_{2}, \dots, x_{n}$, drawn from a random variable $X$.

The $x_{i}$ are assumed to be independent with an unknown distribution function of the form

$$F ((x_{i} - θ) / σ)$$

where $θ$ is a location argument, and $σ$ is a scale argument.

$M$-estimators of $θ$ and $σ$ are given by the solution to the following system of equations:

$$\begin{array}{lcl} \sum_{i = 1}^{n} ψ ((x_{i} - \hat{θ}) / \hat{σ}) & = & 0 \\ \sum_{i = 1}^{n} χ ((x_{i} - \hat{θ}) / \hat{σ}) & = & (n - 1) β \end{array}$$

where $ψ$ and $χ$ are user-supplied weight functions, and $β$ is a constant. Optionally the second equation can be omitted and the first equation is solved for $\hat{θ}$ using an assigned value of $σ = σ_{c}$.

The constant $β$ should be chosen so that $\hat{σ}$ is an unbiased estimator when $x_{i}$, for $i = 1, 2, \dots, n$, has a Normal distribution. To achieve this the value of $β$ is calculated as:

$$β = E (χ) = \int_{- \infty}^{\infty} χ (z) \frac{1}{\sqrt{2 π}} \exp \left\{ \frac{- z^{2}}{2} \right\} d z$$

The values of $ψ (\frac{x_{i} - \hat{θ}}{\hat{σ}}) \hat{σ}$ are known as the Winsorized residuals.

The equations are solved by a simple iterative procedure, suggested by Huber:

$${\hat{σ}}_{k} = \sqrt{\frac{1}{β (n - 1)} \left( \sum_{i = 1}^{n} χ \left( \frac{x_{i} - {\hat{θ}}_{k - 1}}{{\hat{σ}}_{k - 1}} \right) \right) {\hat{σ}}_{k - 1}^{2}}$$

and

$${\hat{θ}}_{k} = {\hat{θ}}_{k - 1} + \frac{1}{n} \sum_{i = 1}^{n} ψ \left( \frac{x_{i} - {\hat{θ}}_{k - 1}}{{\hat{σ}}_{k}} \right) {\hat{σ}}_{k}$$

or, in place of the first update,

$${\hat{σ}}_{k} = σ_{c}$$

if $σ$ is fixed.

The initial values for $\hat{θ}$ and $\hat{σ}$ may be user-supplied or calculated within nag_robust_m_estim_1var (g07dbc) as the sample median and an estimate of $σ$ based on the median absolute deviation respectively.
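A sketch of how such starting values might be computed by hand is given below. The helper names are hypothetical, and the scaling of the median absolute deviation by $Φ^{-1} (0.75) \approx 0.6745$ (which makes it consistent for $σ$ under Normality) is an assumption; this section does not state the exact adjustment the library applies.

```c
#include <assert.h>
#include <math.h>
#include <stdlib.h>
#include <string.h>

static int cmp_double(const void *a, const void *b)
{
    const double u = *(const double *)a, v = *(const double *)b;
    return (u > v) - (u < v);
}

/* Median of v[0..n-1]; sorts v in place. */
static double median_inplace(double v[], int n)
{
    qsort(v, (size_t)n, sizeof v[0], cmp_double);
    return n % 2 ? v[n / 2] : 0.5 * (v[n / 2 - 1] + v[n / 2]);
}

/* Starting values: theta0 = sample median, sigma0 = MAD / 0.6745.
   The divisor is an assumed Normal-consistency constant.
   Returns 0 on success, -1 if memory could not be allocated. */
int robust_start_sketch(int n, const double x[], double *theta0, double *sigma0)
{
    assert(n > 1);

    double *w = malloc((size_t)n * sizeof *w);
    if (w == NULL)
        return -1;

    memcpy(w, x, (size_t)n * sizeof *w);
    *theta0 = median_inplace(w, n);

    for (int i = 0; i < n; ++i)
        w[i] = fabs(x[i] - *theta0);
    *sigma0 = median_inplace(w, n) / 0.6744897501960817;

    free(w);
    return 0;
}
```

If more than half of the observations coincide with the median, the value of `sigma0` produced this way is zero, so a caller would need to check it before using it as a starting scale.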

nag_robust_m_estim_1var_usr (g07dcc) is based upon function LYHALG within the ROBETH library, see Marazzi (1987).

4 References

Hampel F R, Ronchetti E M, Rousseeuw P J and Stahel W A (1986) Robust Statistics. The Approach Based on Influence Functions Wiley

Huber P J (1981) Robust Statistics Wiley

Marazzi A (1987) Subroutines for robust estimation of location and scale in ROBETH Cah. Rech. Doc. IUMSP, No. 3 ROB 1 Institut Universitaire de Médecine Sociale et Préventive, Lausanne

5 Arguments

1: $chi$ – function, supplied by the user (External Function)

chi must return the value of the weight function $χ$ for a given value of its argument. The value of $χ$ must be non-negative.

The specification of chi is:

double chi (double t, Nag_Comm *comm)

1: $t$ – double (Input)

On entry: the argument for which chi must be evaluated.

2: $comm$ – Nag_Comm *

Pointer to structure of type Nag_Comm; the following members are relevant to chi.

user – double *
iuser – Integer *
p – Pointer

The type Pointer will be void *. Before calling nag_robust_m_estim_1var_usr (g07dcc) you may allocate memory and initialize these pointers with various quantities for use by chi when called from nag_robust_m_estim_1var_usr (g07dcc) (see Section 3.3.1.1 in How to Use the NAG Library and its Documentation).

Note: chi should not return floating-point NaN (Not a Number) or infinity values, since these are not handled by nag_robust_m_estim_1var_usr (g07dcc). If your code inadvertently does return any NaNs or infinities, nag_robust_m_estim_1var_usr (g07dcc) is likely to produce unexpected results.
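As a hedged illustration of this specification, the sketch below implements Huber's $χ$ and reads its cut-off from `comm->user[0]`. The two typedefs are stand-ins that reproduce only the members named above so that the fragment compiles without the NAG headers; in a real program they would be replaced by the library's own declarations. Storing the cut-off in `user[0]` is a convention chosen for this example.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Stand-ins so the sketch compiles without the NAG headers. In real code,
   include the NAG headers instead and delete these two typedefs. Only the
   members named in the text above are reproduced. */
typedef long Integer;
typedef struct {
    double  *user;
    Integer *iuser;
    void    *p;
} Nag_Comm;

/* Huber's chi: t^2/2 inside the cut-off d, d^2/2 outside. The cut-off is
   read from comm->user[0], which the caller must set before the call. */
double chi(double t, Nag_Comm *comm)
{
    assert(comm != NULL && comm->user != NULL);

    const double d = comm->user[0];
    const double a = fabs(t) < d ? fabs(t) : d;
    return 0.5 * a * a;   /* non-negative for any finite t and d */
}
```

Before the call, the caller would point `comm.user` at an array whose first element holds the cut-off, for example `double tuning[1] = {1.5}; comm.user = tuning;`.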

2: $psi$ – function, supplied by the user (External Function)

psi must return the value of the weight function $ψ$ for a given value of its argument.

The specification of psi is:

double psi (double t, Nag_Comm *comm)

1: $t$ – double (Input)

On entry: the argument for which psi must be evaluated.

2: $comm$ – Nag_Comm *

Pointer to structure of type Nag_Comm; the following members are relevant to psi.

user – double *
iuser – Integer *
p – Pointer

The type Pointer will be void *. Before calling nag_robust_m_estim_1var_usr (g07dcc) you may allocate memory and initialize these pointers with various quantities for use by psi when called from nag_robust_m_estim_1var_usr (g07dcc) (see Section 3.3.1.1 in How to Use the NAG Library and its Documentation).

Note: psi should not return floating-point NaN (Not a Number) or infinity values, since these are not handled by nag_robust_m_estim_1var_usr (g07dcc). If your code inadvertently does return any NaNs or infinities, nag_robust_m_estim_1var_usr (g07dcc) is likely to produce unexpected results.

3: $isigma$ – Integer (Input)

On entry: the value assigned to isigma determines whether $\hat{σ}$ is to be simultaneously estimated.

$isigma = 0$: The estimation of $\hat{σ}$ is bypassed and sigma is set equal to $σ_{c}$.
$isigma = 1$: $\hat{σ}$ is estimated simultaneously.

4: $n$ – Integer (Input)

On entry: $n$, the number of observations.

Constraint: $n > 1$.

5: $x [n]$ – const double (Input)

On entry: the vector of observations, $x_{1}, x_{2}, \dots, x_{n}$.

6: $beta$ – double (Input)

On entry: the value of the constant $β$ of the chosen chi function.

Constraint: $beta > 0.0$.

7: $theta$ – double * (Input/Output)

On entry: if $sigma > 0$, theta must be set to the required starting value of the estimate of the location argument $\hat{θ}$. A reasonable initial value for $\hat{θ}$ will often be the sample mean or median.

On exit: the $M$-estimate of the location argument $\hat{θ}$.

8: $sigma$ – double * (Input/Output)

On entry: the role of sigma depends on the value assigned to isigma as follows.

If $isigma = 1$, sigma must be assigned a value which determines the values of the starting points for the calculation of $\hat{θ}$ and $\hat{σ}$. If $sigma \leq 0.0$, nag_robust_m_estim_1var_usr (g07dcc) will determine the starting points of $\hat{θ}$ and $\hat{σ}$. Otherwise, the value assigned to sigma will be taken as the starting point for $\hat{σ}$, and theta must be assigned a relevant value before entry, see above.

If $isigma = 0$, sigma must be assigned a value which determines the value of $σ_{c}$, which is held fixed during the iterations, and the starting value for the calculation of $\hat{θ}$. If $sigma \leq 0$, nag_robust_m_estim_1var_usr (g07dcc) will determine the value of $σ_{c}$ as the median absolute deviation adjusted to reduce bias (see nag_median_1var (g07dac)) and the starting point for $θ$. Otherwise, the value assigned to sigma will be taken as the value of $σ_{c}$ and theta must be assigned a relevant value before entry, see above.

On exit: the $M$-estimate of the scale argument $\hat{σ}$, if isigma was assigned the value $1$ on entry, otherwise sigma will contain the initial fixed value $σ_{c}$.

9: $maxit$ – Integer (Input)

On entry: the maximum number of iterations that should be used during the estimation.

Suggested value: $maxit = 50$.

Constraint: $maxit > 0$.

10: $tol$ – double (Input)

On entry: the relative precision for the final estimates. Convergence is assumed when the increments for theta and sigma are less than $tol \times \max (1.0, σ_{k - 1})$.

Constraint: $tol > 0.0$.

11: $rs [n]$ – double (Output)

On exit: the Winsorized residuals.

12: $nit$ – Integer * (Output)

On exit: the number of iterations that were used during the estimation.

13: $comm$ – Nag_Comm *

The NAG communication argument (see Section 3.3.1.1 in How to Use the NAG Library and its Documentation).

14: $fail$ – NagError * (Input/Output)

The NAG error argument (see Section 3.7 in How to Use the NAG Library and its Documentation).

6 Error Indicators and Warnings

NE_ALLOC_FAIL: Dynamic memory allocation failed.
See Section 2.3.1.2 in How to Use the NAG Library and its Documentation for further information.
NE_BAD_PARAM: On entry, argument $〈value〉$ had an illegal value.
NE_FUN_RET_VAL: The chi function returned a negative value: $chi = 〈value〉$ .
NE_INT: On entry, $isigma = 〈value〉$ .
Constraint: $isigma = 0$ or $1$ .

On entry, $maxit = 〈value〉$ .
Constraint: $maxit > 0$ .

On entry, $n = 〈value〉$ .
Constraint: $n > 1$ .
NE_INTERNAL_ERROR: An internal error has occurred in this function. Check the function call and any array sizes. If the call is correct then please contact NAG for assistance.
See Section 2.7.6 in How to Use the NAG Library and its Documentation for further information.
NE_NO_LICENCE: Your licence key may have expired or may not have been installed correctly.
See Section 2.7.5 in How to Use the NAG Library and its Documentation for further information.
NE_REAL: On entry, $beta = 〈value〉$ .
Constraint: $beta > 0.0$ .

On entry, $tol = 〈value〉$ .
Constraint: $tol > 0.0$ .
NE_REAL_ARRAY_ELEM_CONS: All elements of x are equal.
NE_SIGMA_NEGATIVE: Current estimate of sigma is zero or negative: $sigma = 〈value〉$ .
NE_TOO_MANY_ITER: Number of iterations required exceeds maxit: $maxit = 〈value〉$ .
NE_ZERO_RESID: All Winsorized residuals are zero.

7 Accuracy

On successful exit the accuracy of the results is related to the value of tol, see Section 5.

8 Parallelism and Performance

nag_robust_m_estim_1var_usr (g07dcc) is threaded by NAG for parallel execution in multithreaded implementations of the NAG Library.

nag_robust_m_estim_1var_usr (g07dcc) makes calls to BLAS and/or LAPACK routines, which may be threaded within the vendor library used by this implementation. Consult the documentation for the vendor library for further information.

Please consult the x06 Chapter Introduction for information on how to control and interrogate the OpenMP environment used within this function. Please also consult the Users' Note for your implementation for any additional implementation-specific information.

9 Further Comments

Standard forms of the functions $ψ$ and $χ$ are given in Hampel et al. (1986), Huber (1981) and Marazzi (1987). nag_robust_m_estim_1var (g07dbc) calculates $M$-estimates using some standard forms for $ψ$ and $χ$.

When you supply the initial values, care has to be taken over the choice of the initial value of $σ$. If too small a value is chosen then initial values of the standardized residuals $\frac{x_{i} - {\hat{θ}}_{k}}{σ}$ will be large. If the redescending $ψ$ functions are used, i.e., $ψ = 0$ if $|t| > τ$, for some positive constant $τ$, then these large values are Winsorized as zero. If a sufficient number of the residuals fall into this category then a false solution may be returned, see page 152 of Hampel et al. (1986).
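The effect can be reproduced with a short sketch. It uses Hampel's piecewise linear $ψ$ as an example of a redescending function; the function names and the break points used in the usage note are assumptions made for this illustration.

```c
#include <assert.h>
#include <math.h>

/* Hampel's piecewise linear psi with break points 0 < h1 <= h2 < h3:
   linear up to h1, constant up to h2, descending to zero at h3,
   and exactly zero beyond h3 (so tau = h3 in the text above). */
double hampel_psi_sketch(double t, double h1, double h2, double h3)
{
    const double a = fabs(t);
    double v;

    assert(0.0 < h1 && h1 <= h2 && h2 < h3);
    if (a < h1)
        v = a;
    else if (a < h2)
        v = h1;
    else if (a < h3)
        v = h1 * (h3 - a) / (h3 - h2);
    else
        v = 0.0;
    return t < 0.0 ? -v : v;
}

/* Left-hand side of the location equation: sum of psi((x_i - theta)/sigma). */
double psi_sum_sketch(int n, const double x[], double theta, double sigma,
                      double h1, double h2, double h3)
{
    double s = 0.0;

    assert(n > 1 && sigma > 0.0);
    for (int i = 0; i < n; ++i)
        s += hampel_psi_sketch((x[i] - theta) / sigma, h1, h2, h3);
    return s;
}
```

For the sample 3, 4, 5, 6, 7 with break points 1.5, 3.0 and 4.5, the poor starting location $θ = 0$ combined with $σ = 0.1$ gives standardized residuals between 30 and 70, so every term is zero and the location equation appears to be satisfied even though 0 is far from the data. With $σ = 2$ the same starting location gives a positive sum, and the iteration would move $θ$ towards the data.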

10 Example

The following program reads in a set of data consisting of eleven observations of a variable $X$.

The psi and chi functions used are Hampel's Piecewise Linear Function and Huber's chi function respectively.

Using the following starting values various estimates of $θ$ and $σ$ are calculated and printed along with the number of iterations used:

(a)	nag_robust_m_estim_1var_usr (g07dcc) determines the starting values; $σ$ is estimated simultaneously.
(b)	You supply the starting values; $σ$ is estimated simultaneously.
(c)	nag_robust_m_estim_1var_usr (g07dcc) determines the starting values; $σ$ is fixed.
(d)	You supply the starting values; $σ$ is fixed.

NAG Library Function Document

nag_robust_m_estim_1var_usr (g07dcc)

Contents

1 Purpose

2 Specification

3 Description

4 References

5 Arguments

6 Error Indicators and Warnings

7 Accuracy

8 Parallelism and Performance

9 Further Comments

10 Example

10.1 Program Text

10.2 Program Data

10.3 Program Results
