2.1.2 Algorithm for Tolerance Intervals

A tolerance interval is a statistical interval that, with a specified confidence level, is expected to contain at least a specified proportion of a population. Given a minimum proportion of the population (P) and a confidence level (1-\alpha), the tolerance interval provides limits such that at least the proportion P of the population falls between them with confidence (1-\alpha).

A tolerance interval or bound can be Two-Sided or One-Sided (Lower Bound or Upper Bound).

  • Two-Sided: A tolerance interval with both a lower and an upper bound.
  • One-Sided: Only a lower bound or only an upper bound. The lower bound is the value that at least the minimum percentage of the population is expected to exceed; the upper bound is the value that at least the minimum percentage of the population is expected to fall below.

Both parametric and nonparametric tolerance intervals are computed. The nonparametric method is distribution free: it assumes only that the parent distribution is continuous and does not otherwise depend on the parent population of the sample. Parametric tolerance intervals are calculated assuming the parent distribution is one of the following:

  • Normal
  • Lognormal
  • Gamma
  • Exponential
  • Smallest Extreme Value
  • Weibull
  • Largest Extreme Value
  • Logistic
  • Loglogistic

Normal

For a confidence level of (1-\alpha) and a minimum percentage of the population (P) in the interval (P is also called the coverage of the tolerance interval), the exact lower limit L and upper limit U of the tolerance interval are calculated by the following equations:

L=\bar{X}-kS
U=\bar{X}+kS

where

\bar{X}: the sample mean.
k: the tolerance factor (k-factor).
S: the sample standard deviation.

Tolerance Factor for One-Sided Intervals

The exact tolerance factor for a one-sided interval is computed by:

k=\frac{t_{n-1,1-\alpha}(\delta)}{\sqrt{n}}, and \delta=Z_P\sqrt{n}

where

t_{n-1,1-\alpha}(\delta): the (1-\alpha)^{th} percentile of the noncentral t-distribution with n-1 degrees of freedom and noncentrality parameter \delta.
Z_P: the P^{th} percentile of the standard normal distribution.
n: the number of observations.
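The one-sided formula above can be evaluated directly with a noncentral t quantile function. The following is a minimal sketch, assuming SciPy is available; the function name one_sided_k_factor is illustrative and not part of any product API.

```python
import math

from scipy.stats import nct, norm


def one_sided_k_factor(n, coverage, confidence):
    """Sketch of k = t_{n-1,1-alpha}(delta) / sqrt(n), delta = Z_P * sqrt(n)."""
    # Z_P: the P-th percentile of the standard normal distribution.
    z_p = norm.ppf(coverage)
    # delta: noncentrality parameter of the noncentral t-distribution.
    delta = z_p * math.sqrt(n)
    # (1 - alpha) percentile of the noncentral t with n - 1 degrees of freedom.
    t_quantile = nct.ppf(confidence, n - 1, delta)
    return t_quantile / math.sqrt(n)


k = one_sided_k_factor(n=10, coverage=0.95, confidence=0.95)
print(round(k, 3))
```

For n = 10 with 95% coverage and 95% confidence, published one-sided k-factor tables list a value of roughly 2.911, which is a convenient check on the sketch.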

Tolerance Factor for Two-Sided Intervals

The exact tolerance factor for a two-sided interval is the root k of the following equation:

\sqrt{\frac{2n}{\pi}}\int_{0}^{\infty}\left(1-F_{n-1}\left(\frac{(n-1)\chi_{1,P}^2(z^2)}{k^2}\right)\right)e^{-\frac{1}{2}nz^2}dz=1-\alpha

where

n: the number of observations.
F_{n-1}: the cumulative distribution function for a chi-square distribution with n-1 degrees of freedom.
\chi_{1,P}^2(z^2): the P^{th} percentile of the noncentral chi-square distribution with 1 degree of freedom, and z^2 is the noncentrality parameter.

Lognormal

The tolerance interval for the lognormal distribution is calculated following the process below:

  1. Take the natural logarithm of the original data.
  2. Compute the tolerance intervals for the transformed data by the same procedure for the Normal distribution.
  3. Exponentiate the limits of the tolerance intervals obtained in the previous step to obtain the tolerance intervals of the original data.
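The three steps above can be sketched as follows. This is an illustration only: the tolerance factor k is assumed to have been computed already by the Normal procedure, and the function name lognormal_tolerance_interval is hypothetical.

```python
import math
import statistics


def lognormal_tolerance_interval(data, k):
    """Sketch: log-transform, apply L = mean - k*S and U = mean + k*S, exponentiate."""
    # Step 1: natural logarithm of the original data (all values must be > 0).
    logs = [math.log(x) for x in data]
    # Step 2: Normal-theory limits on the log scale, using the sample mean and
    # the sample standard deviation (n - 1 denominator).
    mean_log = statistics.mean(logs)
    sd_log = statistics.stdev(logs)
    lower_log = mean_log - k * sd_log
    upper_log = mean_log + k * sd_log
    # Step 3: back-transform the limits to the original scale.
    return math.exp(lower_log), math.exp(upper_log)


lower, upper = lognormal_tolerance_interval([1.2, 0.8, 2.5, 1.9, 1.1], k=2.0)
print(lower, upper)
```

Because the exponential function is increasing, the back-transformed lower limit remains below the upper limit.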

Gamma

The tolerance interval for the gamma distribution is calculated following the process below:

  1. Take the cube root of the original data.
  2. Compute the tolerance intervals for the transformed data by the same procedure for the Normal distribution.
  3. Cube the limits of the tolerance intervals obtained in the previous step to obtain the tolerance intervals of the original data.

Exponential

For a confidence level of (1-\alpha) and a minimum percentage of the population (P) in the interval (P is also called the coverage of the tolerance interval), the lower limit L and upper limit U are calculated differently for one-sided and two-sided intervals.

One-Sided Tolerance Intervals

L=-\frac{2n\bar{X}\ln(P)}{\chi_{2n,1-\alpha}^2}
U=-\frac{2n\bar{X}\ln(1-P)}{\chi_{2n,\alpha}^2}

where

n: the number of observations.
\bar{X}: the sample mean.
\chi_{2n,\alpha}^2: the \alpha^{th} percentile of the chi-square distribution with 2n degrees of freedom.
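The one-sided exponential bounds above only need chi-square percentiles with an even number of degrees of freedom (2n), for which the cumulative distribution function has a closed form. The sketch below uses that fact with a simple bisection, so it needs only the standard library; the helper names are illustrative assumptions, and a statistics library quantile function could be used instead.

```python
import math


def chi2_cdf_even(x, n):
    """CDF of the chi-square distribution with 2n degrees of freedom."""
    if x <= 0:
        return 0.0
    half = x / 2.0
    term, total = 1.0, 1.0
    # Closed form: 1 - exp(-x/2) * sum_{j=0}^{n-1} (x/2)^j / j!
    for j in range(1, n):
        term *= half / j
        total += term
    return 1.0 - math.exp(-half) * total


def chi2_ppf_even(q, n):
    """Percentile of the chi-square distribution with 2n d.f., by bisection."""
    lo, hi = 0.0, 1.0
    while chi2_cdf_even(hi, n) < q:  # expand until the root is bracketed
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if chi2_cdf_even(mid, n) < q:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def exponential_one_sided_bounds(data, coverage, confidence):
    """Sketch of L and U from the one-sided exponential formulas."""
    n = len(data)
    mean = sum(data) / n
    alpha = 1.0 - confidence
    lower = -2.0 * n * mean * math.log(coverage) / chi2_ppf_even(1.0 - alpha, n)
    upper = -2.0 * n * mean * math.log(1.0 - coverage) / chi2_ppf_even(alpha, n)
    return lower, upper


print(exponential_one_sided_bounds([0.5, 1.7, 0.9, 2.4, 1.1], 0.90, 0.95))
```

Each bound is a one-sided statement on its own; the pair returned here is not a two-sided interval.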

Two-Sided Tolerance Intervals

L=k_1\bar{X}
U=k_2\bar{X}

where

k_2=-\ln(1-\exp(-k_1))
n: the number of observations.
\bar{X}: the sample mean.

k_1 is the solution to the following system of equations:

F_{2n}(2ny_2)-F_{2n}(2ny_1)=1-\alpha
\exp(-k_1y_1)-(1-\exp(-k_1))^{y_1}=P, y_1>0
\exp(-k_1y_2)-(1-\exp(-k_1))^{y_2}=P, y_2>0

where

F_{2n}(\cdot): the cumulative distribution function of the chi-square distribution with 2n degrees of freedom.

Smallest Extreme Value

For a confidence level of (1-\alpha) and a minimum percentage of the population (P) in the interval (P is also called the coverage of the tolerance interval), the lower limit L and upper limit U are calculated as follows.

One-Sided Tolerance Intervals

L=\hat{\mu}-k_1\hat{\sigma}
U=\hat{\mu}+k_2\hat{\sigma}

where

\hat{\mu}: the estimated location parameter of the smallest extreme value distribution.
\hat{\sigma}: the estimated scale parameter of the smallest extreme value distribution.
k_1=-x: the lower tolerance factor, where x is the unique root of the equation
G(x;z) = 1-\alpha, with
G(x;z)=C_z\int_0^\infty\frac{t^{n-2}\exp\left((t-1)\sum_{i=1}^nz_i\right)IG_n\left(\exp(\lambda_P-xt)\sum_{i=1}^n\exp(z_it)\right)}{\left(\frac{1}{n}\sum_{i=1}^n\exp(z_it)\right)^n}dt
k_2: the upper tolerance factor. It is computed by replacing \alpha with 1-\alpha and P with 1-P in the equations used to calculate k_1.

where

\lambda_P=\ln(-\ln(P))
C_z: a normalizing constant, and C_z^{-1}=\int_0^\infty\frac{t^{n-2}\exp\left((t-1)\sum_{i=1}^nz_i\right)}{\left(\frac{1}{n}\sum_{i=1}^n\exp(z_it)\right)^n}dt
IG_n(x)=\frac{1}{\Gamma(n)}\int_0^xt^{n-1}e^{-t}dt: the incomplete gamma function.
z_i=\frac{x_i-\hat{\mu}}{\hat{\sigma}}: the standardized observations based on the estimated location and scale parameters of the smallest extreme value distribution.
n: the number of observations.

Two-Sided Tolerance Intervals

Replace \alpha by \alpha/2 and P by (P+1)/2 in the formulas for calculating the one-sided tolerance intervals above to get the two-sided smallest extreme value tolerance intervals.

Weibull

The tolerance interval for the Weibull distribution is calculated following the process below:

  1. Take the natural logarithm of the original data.
  2. Compute the tolerance intervals for the transformed data by the same procedure for the Smallest Extreme Value distribution.
  3. Exponentiate the limits of the tolerance intervals obtained in the previous step to obtain the tolerance intervals of the original data.

Largest Extreme Value

The tolerance interval for the largest extreme value distribution is calculated following the process below:

  1. Multiply the original data by -1.
  2. Compute the tolerance intervals for the transformed data by the same procedure for the Smallest Extreme Value distribution.
  3. Multiply the limits of the tolerance intervals obtained in the previous step by -1 again to obtain the tolerance intervals of the original data.
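The steps above can be sketched as a thin wrapper around a smallest extreme value routine. In this sketch, sev_interval is a hypothetical callable supplied by the caller that returns (lower, upper) limits for its input data. Because multiplying by -1 reverses the ordering, the negated upper limit of the transformed data becomes the lower limit on the original scale, and vice versa.

```python
def largest_extreme_value_interval(data, sev_interval):
    """Sketch: negate the data, apply an SEV procedure, negate the limits back.

    sev_interval is an assumed, caller-supplied function that takes a list of
    observations and returns a (lower, upper) pair.
    """
    # Step 1: multiply the original data by -1.
    negated = [-x for x in data]
    # Step 2: tolerance limits for the transformed data.
    sev_lower, sev_upper = sev_interval(negated)
    # Step 3: multiply by -1 again; the roles of the two limits swap.
    return -sev_upper, -sev_lower


# Stand-in procedure used only to show the data flow, not a real SEV method.
demo = largest_extreme_value_interval([1, 2, 5], lambda d: (min(d), max(d)))
print(demo)
```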

Logistic

For a confidence level of (1-\alpha) and a minimum percentage of the population (P) in the interval (P is also called the coverage of the tolerance interval), the lower limit L and upper limit U are calculated as follows.

One-Sided Tolerance Intervals

L=\hat{\mu}-K_L(\alpha, P)\hat{\sigma}
U=\hat{\mu}+K_U(\alpha, P)\hat{\sigma}
K_L(\alpha, P)=z_\alpha\sqrt{C_{11}+(q_P)^2C_{22}-2q_PC_{12}}+q_P
K_U(\alpha, P)=z_\alpha\sqrt{C_{11}+(q_P)^2C_{22}+2q_PC_{12}}+q_P

where

K_L(\alpha, P): the lower tolerance factor.
K_U(\alpha, P): the upper tolerance factor.
\hat{\mu}: the estimated location parameter of the logistic distribution.
\hat{\sigma}: the estimated scale parameter of the logistic distribution.
z_{\alpha}: the upper \alpha percentile of the standard normal distribution.
q_P=\log(P)-\log(1-P): the 100P^{th} percentile of the standard logistic distribution.
C_{11} = \frac{Var(\hat{\mu})}{\hat{\sigma}^2},C_{22}=\frac{Var(\hat{\sigma})}{\hat{\sigma}^2},C_{12}=\frac{Cov(\hat{\mu},\hat{\sigma})}{\hat{\sigma}^2}
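The tolerance factors K_L and K_U defined above are simple to evaluate once the scaled variance terms are known. The sketch below takes C_{11}, C_{22} and C_{12} as inputs, since estimating them depends on the fitting method; the function name and the example values are illustrative assumptions.

```python
import math
from statistics import NormalDist


def logistic_tolerance_factors(alpha, coverage, c11, c22, c12):
    """Sketch of K_L(alpha, P) and K_U(alpha, P) for the logistic distribution."""
    # z_alpha: upper alpha percentile of the standard normal distribution.
    z_alpha = NormalDist().inv_cdf(1.0 - alpha)
    # q_P: percentile of the standard logistic distribution.
    q_p = math.log(coverage) - math.log(1.0 - coverage)
    base = c11 + q_p ** 2 * c22
    k_lower = z_alpha * math.sqrt(base - 2.0 * q_p * c12) + q_p
    k_upper = z_alpha * math.sqrt(base + 2.0 * q_p * c12) + q_p
    return k_lower, k_upper


# Hypothetical variance terms, for illustration only.
print(logistic_tolerance_factors(0.05, 0.90, c11=0.03, c22=0.01, c12=0.002))
```

The limits then follow as L = \hat{\mu} - K_L\hat{\sigma} and U = \hat{\mu} + K_U\hat{\sigma}.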

Two-Sided Tolerance Intervals

Replace \alpha by \alpha/2 and P by (P+1)/2 in the formulas for calculating the one-sided tolerance intervals above to get the two-sided logistic tolerance intervals.

Loglogistic

The tolerance interval for the loglogistic distribution is calculated following the process below:

  1. Take the natural logarithm of the original data.
  2. Compute the tolerance intervals for the transformed data by the same procedure for the Logistic distribution.
  3. Exponentiate the limits of the tolerance intervals obtained in the previous step to obtain the tolerance intervals of the original data.

Nonparametric

For a confidence level of (1-\alpha) and a minimum percentage of the population (P) in the interval (P is also called the coverage of the tolerance interval), the nonparametric method computes the lower limit L and upper limit U without assuming a particular parent distribution. It is a distribution-free method, and the limits are calculated as follows.

Let X_1, X_2, ..., X_n be the order statistics of a random sample from some continuously distributed population F(X;\theta). Then

Pr\left(F(X_s;\theta)-F(X_r;\theta)\geq P\right)=1-B_{a,b}(P)

where

B_{a,b}: the cumulative distribution function of the beta distribution with parameters a=s-r and b=n-s+r+1.

The coverage of the interval has a beta distribution with known parameter values, and these values do not depend on the distribution of the parent population, F(X;\theta). Thus (X_r, X_s) is a distribution-free tolerance interval.

One-Sided Intervals

Consider the following:

P(Y\geq k)\geq 1-\alpha

where

k: the largest integer that satisfies the inequality.
Y: a binomial random variable with parameters n and 1-P.
n: the number of observations.

Then the lower tolerance bound (L) and the upper tolerance bound (U) are given by:

L=X_k
U=X_{n-k+1}

The actual (achieved) confidence level is given by P(Y\geq k).
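The search for k in the one-sided case can be sketched with the binomial distribution computed directly from its definition. The function names are illustrative; returning k = 0 to signal that no order statistic satisfies the inequality is an assumption of this sketch, not behavior stated above.

```python
import math


def binom_sf(k, n, p):
    """P(Y >= k) for a binomial random variable Y with parameters n and p."""
    return sum(math.comb(n, j) * p ** j * (1 - p) ** (n - j)
               for j in range(k, n + 1))


def one_sided_order_index(n, coverage, confidence):
    """Largest k with P(Y >= k) >= 1 - alpha, Y ~ Binomial(n, 1 - P); 0 if none."""
    k = 0
    # P(Y >= k) decreases as k grows, so step up until the inequality fails.
    while k < n and binom_sf(k + 1, n, 1.0 - coverage) >= confidence:
        k += 1
    return k


def nonparametric_one_sided_bounds(data, coverage, confidence):
    """Sketch of L = X_k and U = X_{n-k+1}, plus the value of P(Y >= k)."""
    x = sorted(data)
    n = len(x)
    k = one_sided_order_index(n, coverage, confidence)
    if k == 0:
        raise ValueError("sample too small for the requested P and confidence")
    # Order statistics are 1-based in the text, 0-based in Python lists.
    return x[k - 1], x[n - k], binom_sf(k, n, 1.0 - coverage)


print(one_sided_order_index(59, 0.95, 0.95))
```

With P = 0.95 and a 95% confidence level, 59 observations are the fewest for which the sample minimum or maximum qualifies as a bound, so the sketch returns k = 1 for n = 59 and k = 0 for n = 58.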

Two-Sided Intervals

Consider the following:

P(V\leq k-1)\geq 1-\alpha

where

k: the smallest integer that satisfies the inequality.
V: a binomial random variable with parameters n and P.
n: the number of observations.

Thus,

k-1 = F_v^{-1}(1-\alpha)

where

F_v^{-1}(X): the inverse cumulative distribution function of V.

Choose r=(n-k+1)/2 and s=n-r+1; the tolerance interval is then given by:

L=X_r
U=X_s

Note that both r and s are rounded down to the nearest integer.

The actual (achieved) confidence level is given by P(V\leq k-1).
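The two-sided search can be sketched in the same way. This sketch assumes that s is computed from the unrounded r and that both are then rounded down, which is one reading of the rounding note above; the function names are illustrative, and raising an error when r falls below 1 is an assumption.

```python
import math


def binom_cdf(m, n, p):
    """P(V <= m) for a binomial random variable V with parameters n and p."""
    return sum(math.comb(n, j) * p ** j * (1 - p) ** (n - j)
               for j in range(0, m + 1))


def two_sided_order_indices(n, coverage, confidence):
    """Sketch: smallest k with P(V <= k - 1) >= 1 - alpha, then r and s."""
    k = 1
    while binom_cdf(k - 1, n, coverage) < confidence:
        k += 1
    r_raw = (n - k + 1) / 2
    r = math.floor(r_raw)              # r rounded down
    s = math.floor(n - r_raw + 1)      # s = n - r + 1, rounded down
    if r < 1:
        raise ValueError("sample too small for the requested P and confidence")
    return k, r, s, binom_cdf(k - 1, n, coverage)


def nonparametric_two_sided_interval(data, coverage, confidence):
    """Sketch of L = X_r and U = X_s from the sorted sample."""
    x = sorted(data)
    _, r, s, _ = two_sided_order_indices(len(x), coverage, confidence)
    # Order statistics are 1-based in the text, 0-based in Python lists.
    return x[r - 1], x[s - 1]


print(two_sided_order_indices(93, 0.95, 0.95)[:3])
```

For n = 93 with P = 0.95 and a 95% confidence level, the search gives k = 92, so r = 1 and s = 93: the interval runs from the sample minimum to the sample maximum.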

Goodness-of-fit Test

The Anderson-Darling statistic is used to perform the goodness-of-fit test. For each distribution, the modified Anderson-Darling goodness-of-fit test statistic is computed.