17.7.2.3 Algorithms (Partial Least Squares)
Partial Least Squares is used to construct a model when there are a large number of correlated predictor variables, or when the number of predictor variables exceeds the number of observations. In these cases, multiple linear regression techniques often fail to produce a predictive model due to over-fitting. Partial least squares is commonly used for modeling industrial processes, and for tasks such as calibrating and predicting component amounts in spectral analysis.
Partial least squares extracts factors as linear combinations of the predictor variables, and projects both the predictor variables and the response variables onto the extracted factor space.
An observation containing one or more missing values is excluded from the analysis, i.e. excluded listwise.
Let the numbers of observations, predictor variables and response variables be $n$, $m$ and $r$, respectively. Predictor variables are denoted by the matrix $X$ of size $n \times m$, and response variables by the matrix $Y$ of size $n \times r$. Subtract the mean from each column in the matrices $X$ and $Y$, and denote the centered matrices by $X_0$ and $Y_0$.
If variables are scaled, each column in the matrices $X_0$ and $Y_0$ is further divided by its standard deviation.
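For illustration, this preprocessing step might be sketched in NumPy as follows (the function name `center_scale`, the `scale` flag, and the use of the sample standard deviation are our assumptions, not Origin's):

```python
import numpy as np

def center_scale(X, Y, scale=False):
    """Center each column; optionally divide by its standard deviation."""
    X0 = X - X.mean(axis=0)
    Y0 = Y - Y.mean(axis=0)
    if scale:
        # sample standard deviation (ddof=1) is our assumption
        X0 = X0 / X.std(axis=0, ddof=1)
        Y0 = Y0 / Y.std(axis=0, ddof=1)
    return X0, Y0
```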
Partial Least Squares Method
Origin supports two methods to compute extracted factors: Wold's Iterative and Singular Value Decomposition (SVD).
Wold's Iterative
Start with an initial vector $u$. If $r = 1$, initialize $u = Y_0$; otherwise $u$ can be a vector of random values.
- Repeat the following steps until $w$ converges:
- $w = X_0^T u$, and normalize $w$ by $w = w / \sqrt{w^T w}$
- $t = X_0 w$, and normalize $t$ by $t = t / \sqrt{t^T t}$
- $q = Y_0^T t$, and normalize $q$ by $q = q / \sqrt{q^T q}$
- $u = Y_0 q$
After $w$ converges, update
- $t = X_0 w$, and normalize $t$ by $t = t / \sqrt{t^T t}$
- $p = X_0^T t$
- $q = Y_0^T t$
- $u = Y_0 q$
- where $w$, $t$, $u$, $p$ and $q$ are the x weights, x scores, y scores, x loadings and y loadings for the first factor.
- Repeat the above process with the residual matrices $k$ times:
- $X_0 = X_0 - t p^T$
- $Y_0 = Y_0 - t q^T$
and $k$ factors can be constructed. The x weights, x scores, y scores, x loadings and y loadings for the $k$ factors are denoted by the matrices $W$, $T$, $U$, $P$ and $Q$.
Note that in Origin the signs of the x scores, y scores, x loadings and y loadings for each factor are normalized by forcing the sum of the x weights for each factor to be positive.
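As a concrete illustration, here is a minimal NumPy sketch of Wold's iterative method as described above. It is not Origin's implementation; the function name, the convergence tolerance, the iteration cap, and the choice of starting vector are our own:

```python
import numpy as np

def pls_wold(X0, Y0, k, tol=1e-10, max_iter=500):
    """A minimal sketch of Wold's iterative method.

    X0, Y0: centered (and optionally scaled) data matrices.
    Returns W, T, U, P, Q with one column per extracted factor.
    """
    X, Y = X0.copy(), Y0.copy()
    n, m = X.shape
    r = Y.shape[1]
    W, T, U, P, Q = (np.zeros((m, k)), np.zeros((n, k)), np.zeros((n, k)),
                     np.zeros((m, k)), np.zeros((r, k)))
    for j in range(k):
        u = Y[:, 0].copy()           # u = Y0 when r == 1; any start works
        w_old = np.zeros(m)
        for _ in range(max_iter):
            w = X.T @ u
            w = w / np.sqrt(w @ w)   # normalize w
            t = X @ w
            t = t / np.sqrt(t @ t)   # normalize t
            q = Y.T @ t
            q = q / np.sqrt(q @ q)   # normalize q
            u = Y @ q
            if np.linalg.norm(w - w_old) < tol:
                break                # w has converged
            w_old = w
        # after convergence, update scores and loadings
        t = X @ w
        t = t / np.sqrt(t @ t)
        p = X.T @ t
        q = Y.T @ t
        u = Y @ q
        if w.sum() < 0:              # Origin's sign convention: sum of x weights > 0
            w, t, u, p, q = -w, -t, -u, -p, -q
        W[:, j], T[:, j], U[:, j], P[:, j], Q[:, j] = w, t, u, p, q
        X = X - np.outer(t, p)       # deflate: residual matrices
        Y = Y - np.outer(t, q)
    return W, T, U, P, Q
```

Deflating with the residual matrices after each factor is what makes successive x scores orthogonal.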
SVD
- X Weights for the First Factor
$w$ is the normalized first left singular vector of $X_0^T Y_0$, and
- $t = X_0 w$, and normalize $t$ by $t = t / \sqrt{t^T t}$
- $p = X_0^T t$
- $q = Y_0^T t$
- $u = Y_0 q$
- Repeat the above process with the residual matrices $k$ times, and $k$ factors can be extracted.
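The SVD variant admits a similarly brief sketch (illustrative only; the names are ours). Computing the full SVD of $X_0^T Y_0$ is wasteful when only the first singular vector is needed, but it keeps the example short:

```python
import numpy as np

def pls_svd(X0, Y0, k):
    """A minimal sketch of the SVD method for extracting k factors."""
    X, Y = X0.copy(), Y0.copy()
    factors = []
    for _ in range(k):
        Uw, _, _ = np.linalg.svd(X.T @ Y)   # w: first left singular vector
        w = Uw[:, 0]                        # already unit length
        t = X @ w
        t = t / np.sqrt(t @ t)              # normalize t
        p = X.T @ t
        q = Y.T @ t
        u = Y @ q
        factors.append((w, t, u, p, q))
        X = X - np.outer(t, p)              # deflate with residual matrices
        Y = Y - np.outer(t, q)
    return factors
```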
Cross Validation
Origin uses "leave-one-out" to find the optimal number of factors. It leaves out one observation each time and uses other observations to construct the model and predict responses for the observation.
PRESS is the predicted residual sum of squares. It can be calculated by:
- $\mathrm{PRESS} = \sum_{i=1}^{n} \sum_{j=1}^{r} ( y_{ij} - \hat{y}_{ij} )^2$
- where $\hat{y}_{ij}$ is the Y value predicted by leave-one-out.
Note that if variables are scaled, PRESS is calculated on the scaled values.
If the maximum number of factors is $k$, PRESS is calculated for $0, 1, \ldots, k$ factors. For 0 factors,
- $\mathrm{PRESS}_0 = \sum_{i=1}^{n} \sum_{j=1}^{r} ( y_{ij} - \bar{y}_j )^2$
- where $\bar{y}_j$ is the mean value of the $j$th Y variable.
Root Mean PRESS is the root mean of PRESS. It is defined by:
- $\text{Root Mean PRESS} = \sqrt{\mathrm{PRESS} / (n r)}$
Origin uses the minimum Root Mean PRESS to find the optimal number of factors in Cross Validation.
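The procedure can be sketched as follows; `fit_predict` is a hypothetical stand-in for any PLS fit-and-predict routine (for example, one built on `pls_wold` above), and the $\sqrt{\mathrm{PRESS}/(nr)}$ normalization follows the definition given here:

```python
import numpy as np

def loo_press(X, Y, k_max, fit_predict):
    """Leave-one-out PRESS and Root Mean PRESS for 0..k_max factors.

    fit_predict(X_train, Y_train, X_test, k) is a hypothetical stand-in
    for any PLS fit-and-predict routine (e.g. built on pls_wold above).
    """
    n, r = Y.shape
    press = np.zeros(k_max + 1)
    # 0-factor model: predict each response with its column mean
    press[0] = np.sum((Y - Y.mean(axis=0)) ** 2)
    for k in range(1, k_max + 1):
        for i in range(n):
            keep = np.arange(n) != i                 # leave observation i out
            y_hat = fit_predict(X[keep], Y[keep], X[i:i + 1], k)
            press[k] += np.sum((Y[i] - y_hat) ** 2)
    root_mean_press = np.sqrt(press / (n * r))       # root mean of PRESS
    best_k = int(np.argmin(root_mean_press))         # optimal number of factors
    return press, root_mean_press, best_k
```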
Response Prediction
Once the model is constructed, responses can be predicted from the coefficients of the fitted model. The coefficients are calculated from the weights and loadings matrices:
- $B = W (P^T W)^{-1} Q^T$
And the predicted responses are calculated as:
- $\hat{Y} = X_0 B$
Note that here variables are centered. If variables are also scaled, responses should be scaled back.
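In code, the coefficient and prediction formulas might look like this sketch, which assumes only centering was applied (with scaling, the new data must be scaled the same way and the predictions scaled back):

```python
import numpy as np

def pls_coefficients(W, P, Q):
    """Coefficients B = W (P^T W)^{-1} Q^T from weights and loadings."""
    return W @ np.linalg.solve(P.T @ W, Q.T)

def pls_predict(B, X_new, x_mean, y_mean):
    """Center the new data, apply B, and add the Y means back."""
    return (X_new - x_mean) @ B + y_mean
```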
Quantities
- Variance Explained for X Effects
Variance explained for the $l$th X variable by the $j$th factor: $p_{jl}^2 \big/ \sum_{i=1}^{n} x_{il}^2$
Variance explained for all X variables by the $j$th factor: $\sum_{l=1}^{m} p_{jl}^2 \big/ \sum_{l=1}^{m} \sum_{i=1}^{n} x_{il}^2$
where $x_{il}$ are elements of the centered (and scaled) matrix $X_0$, and $p_{jl}$ is the $l$th element of the $j$th column of $P$.
- Variance Explained for Y Responses
Variance explained for the $l$th Y variable by the $j$th factor: $q_{jl}^2 \big/ \sum_{i=1}^{n} y_{il}^2$
Variance explained for all Y variables by the $j$th factor: $\sum_{l=1}^{r} q_{jl}^2 \big/ \sum_{l=1}^{r} \sum_{i=1}^{n} y_{il}^2$
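Both sets of ratios follow directly from the loadings, given the normalized x scores used in the algorithms above. A sketch (names are ours):

```python
import numpy as np

def variance_explained(X0, Y0, P, Q):
    """Variance explained per factor, assuming normalized x scores so that
    factor j contributes p_{jl}^2 (resp. q_{jl}^2) to the sum of squares."""
    ssx = np.sum(X0 ** 2, axis=0)                 # total SS of each X variable
    ssy = np.sum(Y0 ** 2, axis=0)                 # total SS of each Y variable
    vx_per_var = (P ** 2).T / ssx                 # [j, l]: factor j, variable l
    vy_per_var = (Q ** 2).T / ssy
    vx_total = (P ** 2).sum(axis=0) / ssx.sum()   # per factor, all X variables
    vy_total = (Q ** 2).sum(axis=0) / ssy.sum()
    return vx_per_var, vy_per_var, vx_total, vy_total
```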
VIP (variable influence on projection) measures the contribution of each predictor variable to the model, weighting each factor's contribution by the variance it explains in the responses.
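This page does not spell out the VIP formula. A commonly used definition, which we assume here, weights each factor's squared (normalized) x weights by the Y variance that factor explains:

```python
import numpy as np

def vip(W, s):
    """VIP for each predictor.  Assumes normalized x weight columns in
    W (m-by-k); s[j] is the variance in Y explained by the jth factor
    (e.g. vy_total from the sketch above)."""
    m = W.shape[0]
    s = np.asarray(s)
    return np.sqrt(m * (W ** 2) @ s / s.sum())
```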
X Residuals: $E = X_0 - T P^T$
Y Residuals: $F = Y_0 - T Q^T$
When variables are scaled, residuals should be scaled back.
Distance to the X model for the $i$th observation: $d_{x,i} = \sqrt{\sum_{l=1}^{m} e_{il}^2}$
Distance to the Y model for the $i$th observation: $d_{y,i} = \sqrt{\sum_{l=1}^{r} f_{il}^2}$
where $e_{il}$ and $f_{il}$ are elements of the residual matrices $E$ and $F$.
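A sketch of the residuals and distances as reconstructed above (names are ours):

```python
import numpy as np

def residual_distances(X0, Y0, T, P, Q):
    """Residual matrices and per-observation distances to the X and Y models."""
    E = X0 - T @ P.T                        # X residuals
    F = Y0 - T @ Q.T                        # Y residuals
    dx = np.sqrt(np.sum(E ** 2, axis=1))    # distance to X model
    dy = np.sqrt(np.sum(F ** 2, axis=1))    # distance to Y model
    return E, F, dx, dy
```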
T square for the $i$th observation: $T_i^2 = \sum_{j=1}^{k} t_{ij}^2 / s_j^2$
where $s_j^2$ is the variance of the X scores of the $j$th factor.
- Control Limit for T Square
$T_{\text{crit}}^2 = \dfrac{k (n^2 - 1)}{n (n - k)} F_{1-\alpha}(k, n - k)$
where $F_{1-\alpha}(k, n - k)$ is the critical value of the F distribution with $k$ and $n - k$ degrees of freedom at significance level $\alpha$.
- Radius for Confidence Ellipse in Scores Plot
$r_j = \sqrt{ s_j^2 \, \dfrac{2 (n^2 - 1)}{n (n - 2)} F_{1-\alpha}(2, n - 2) }$
where $s_j^2$ is the variance of the X scores or Y scores of the $j$th factor.
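These diagnostics might be computed as in the sketch below. The control-limit and ellipse expressions are the standard Hotelling forms reconstructed above; treating them as what Origin uses is our assumption:

```python
import numpy as np
from scipy.stats import f

def t_square(T):
    """Hotelling T^2 for each observation from the X scores T (n-by-k)."""
    s2 = T.var(axis=0, ddof=1)          # variance of each factor's scores
    return np.sum(T ** 2 / s2, axis=1)

def t_square_limit(n, k, alpha=0.05):
    """Control limit for T^2 (standard Hotelling form, assumed here)."""
    return k * (n ** 2 - 1) / (n * (n - k)) * f.ppf(1 - alpha, k, n - k)

def ellipse_radii(scores_2d, alpha=0.05):
    """Semi-axes of the confidence ellipse for a two-factor scores plot."""
    n = scores_2d.shape[0]
    s2 = scores_2d.var(axis=0, ddof=1)
    c = 2 * (n ** 2 - 1) / (n * (n - 2)) * f.ppf(1 - alpha, 2, n - 2)
    return np.sqrt(s2 * c)
```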