17.4.1.6 Algorithms (Three-Way ANOVA)
Theory of Three-Way ANOVA
Suppose N observations are associated with three factors, say, factor A with I levels, factor B with J levels and factor C with K levels.
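Let $y_{ijkh}$ denote the $h$th observation at level $i$ of factor A, level $j$ of factor B and level $k$ of factor C. The three-way ANOVA model can then be written as
$$y_{ijkh} = \mu + \alpha_i + \beta_j + \gamma_k + (\alpha\beta)_{ij} + (\alpha\gamma)_{ik} + (\beta\gamma)_{jk} + (\alpha\beta\gamma)_{ijk} + \epsilon_{ijkh}$$
where $\mu$ is the overall mean of the response data; $\alpha_i$ is the deviation at level $i$ of factor A; $\beta_j$ is the deviation at level $j$ of factor B; $\gamma_k$ is the deviation at level $k$ of factor C; $(\alpha\beta)_{ij}$ is the interaction term between factors A and B; $(\alpha\gamma)_{ik}$ is the interaction term between factors A and C; $(\beta\gamma)_{jk}$ is the interaction term between factors B and C; $(\alpha\beta\gamma)_{ijk}$ is the interaction term among factors A, B and C; and $\epsilon_{ijkh}$ is the error term.
In three-way ANOVA, users can specify their own model. For example, they can exclude the term $(\alpha\beta)_{ij}$ (in which case the term $(\alpha\beta\gamma)_{ijk}$ is automatically excluded as well), and the model then becomes
$$y_{ijkh} = \mu + \alpha_i + \beta_j + \gamma_k + (\alpha\gamma)_{ik} + (\beta\gamma)_{jk} + \epsilon_{ijkh}$$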
The sample variation of a specified model can be obtained through the so-called "design matrix" method. Taking the full model as an example, the procedure is briefly as follows:
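The degrees of freedom (DF) for the whole model is $IJK - 1$. The whole design matrix is $X = [X_\mu, X_A, X_B, X_C, X_{AB}, X_{AC}, X_{BC}, X_{ABC}]$, where $X_\mu$ is the sub-design-matrix for $\mu$, which is usually a column of ones, and each of the other sub-design-matrices corresponds to the term named by its subscript. Let $X_{-\theta}$ denote $X$ with the sub-design-matrix of term $\theta$ replaced by zeros, for instance,
$$X_{-A} = [X_\mu, 0, X_B, X_C, X_{AB}, X_{AC}, X_{BC}, X_{ABC}]$$
Define, for a design matrix $Z$ and the response vector $Y$, the fitted sum of squares $R(Z) = Y^T Z (Z^T Z)^{-} Z^T Y$, where $(\cdot)^{-}$ denotes a generalized inverse. The sum of squares for term $\theta$ is
$$SS_\theta = R(X) - R(X_{-\theta})$$
Then the sum of squares error would be
$$SSE = Y^T Y - R(X)$$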
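The design-matrix procedure can be sketched in plain Python. This is a minimal illustration, not Origin's implementation: it assumes sum-to-zero (effect) coding for the sub-design-matrices, which the text above does not specify, and it drops the columns of a term rather than zeroing them (both give the same fitted sum of squares). All function names and the data are made up for the example.

```python
from itertools import product

def effect_code(level, n_levels):
    """Sum-to-zero coding: n_levels - 1 columns, last level coded as -1."""
    if level == n_levels - 1:
        return [-1.0] * (n_levels - 1)
    return [1.0 if c == level else 0.0 for c in range(n_levels - 1)]

def cross(u, v):
    """Interaction columns: all pairwise products of two coded rows."""
    return [a * b for a in u for b in v]

def coded_row(i, j, k, shape):
    """One row of the design matrix, split into sub-design-matrices."""
    a, b, c = (effect_code(lv, n) for lv, n in zip((i, j, k), shape))
    return {"mu": [1.0], "A": a, "B": b, "C": c,
            "A*B": cross(a, b), "A*C": cross(a, c), "B*C": cross(b, c),
            "A*B*C": cross(cross(a, b), c)}

def solve(m, rhs):
    """Solve m x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [r] for row, r in zip(m, rhs)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x

def fitted_ss(rows, y, terms):
    """Y'X (X'X)^-1 X'Y for the design matrix built from the listed terms."""
    X = [[v for t in terms for v in row[t]] for row in rows]
    p = len(X[0])
    xtx = [[sum(r[a] * r[b] for r in X) for b in range(p)] for a in range(p)]
    xty = [sum(r[a] * yi for r, yi in zip(X, y)) for a in range(p)]
    beta = solve(xtx, xty)
    return sum(b * v for b, v in zip(beta, xty))

TERMS = ["mu", "A", "B", "C", "A*B", "A*C", "B*C", "A*B*C"]

def sums_of_squares(obs, y, shape):
    """SS of each term (full fit minus fit without the term) and SSE."""
    rows = [coded_row(i, j, k, shape) for i, j, k in obs]
    full = fitted_ss(rows, y, TERMS)
    ss = {t: full - fitted_ss(rows, y, [u for u in TERMS if u != t])
          for t in TERMS[1:]}
    sse = sum(v * v for v in y) - full
    return ss, sse

# Made-up balanced 2 x 2 x 2 data with two replicates per cell.
shape = (2, 2, 2)
obs = [cell for cell in product(range(2), repeat=3) for _ in range(2)]
y = [4, 6, 5, 7, 6, 8, 9, 9, 7, 9, 8, 8, 10, 12, 13, 15]
ss, sse = sums_of_squares(obs, y, shape)
```

The sketch assumes every cell contains data, so that the normal equations are non-singular; a design with empty cells would need a generalized inverse instead of the plain solver used here.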
For the full model, the ANOVA table is summarized below:
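| Source of Variation | Degrees of Freedom (DF) | Sum of Squares (SS) | Mean Square (MS) | F Value | Prob > F |
|---|---|---|---|---|---|
| Factor A | $I - 1$ | $SS_A$ | $MS_A = SS_A / (I - 1)$ | $F_A = MS_A / MSE$ | $P(F \ge F_A)$ |
| Factor B | $J - 1$ | $SS_B$ | $MS_B = SS_B / (J - 1)$ | $F_B = MS_B / MSE$ | $P(F \ge F_B)$ |
| Factor C | $K - 1$ | $SS_C$ | $MS_C = SS_C / (K - 1)$ | $F_C = MS_C / MSE$ | $P(F \ge F_C)$ |
| A*B | $(I - 1)(J - 1)$ | $SS_{AB}$ | $MS_{AB} = SS_{AB} / [(I - 1)(J - 1)]$ | $F_{AB} = MS_{AB} / MSE$ | $P(F \ge F_{AB})$ |
| A*C | $(I - 1)(K - 1)$ | $SS_{AC}$ | $MS_{AC} = SS_{AC} / [(I - 1)(K - 1)]$ | $F_{AC} = MS_{AC} / MSE$ | $P(F \ge F_{AC})$ |
| B*C | $(J - 1)(K - 1)$ | $SS_{BC}$ | $MS_{BC} = SS_{BC} / [(J - 1)(K - 1)]$ | $F_{BC} = MS_{BC} / MSE$ | $P(F \ge F_{BC})$ |
| A*B*C | $(I - 1)(J - 1)(K - 1)$ | $SS_{ABC}$ | $MS_{ABC} = SS_{ABC} / [(I - 1)(J - 1)(K - 1)]$ | $F_{ABC} = MS_{ABC} / MSE$ | $P(F \ge F_{ABC})$ |
| Error | $N - IJK$ | $SSE$ | $MSE = SSE / (N - IJK)$ | | |
| Total | $N - 1$ | $SST$ | | | |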
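The bookkeeping behind the table can be illustrated with a short Python sketch. The function names and the sums of squares are hypothetical, chosen only to show how the DF, MS and F columns relate. The Prob > F column is omitted because it needs an F-distribution routine that the Python standard library does not provide.

```python
def anova_df(I, J, K, N):
    """Degrees of freedom for every row of the full-model ANOVA table."""
    a, b, c = I - 1, J - 1, K - 1
    return {"A": a, "B": b, "C": c,
            "A*B": a * b, "A*C": a * c, "B*C": b * c,
            "A*B*C": a * b * c,
            "Error": N - I * J * K, "Total": N - 1}

def mean_squares_and_f(ss, df):
    """MS = SS / DF for each source; F = MS / MSE for each model term."""
    ms = {src: ss[src] / df[src] for src in ss}
    f = {src: ms[src] / ms["Error"] for src in ss if src != "Error"}
    return ms, f

df = anova_df(I=2, J=3, K=4, N=48)

# Hypothetical sums of squares, made up purely for illustration.
ss = {"A": 12.0, "B": 30.0, "C": 9.0, "A*B": 4.0, "A*C": 6.0,
      "B*C": 18.0, "A*B*C": 12.0, "Error": 48.0}
ms, f = mean_squares_and_f(ss, df)
```

For the full model, the DF of all model terms plus the error DF add up to the total DF, $N - 1$, which is a quick sanity check on any table.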
Multiple Means Comparisons
There are various methods for multiple means comparison in Origin, and we use the ocstat_dlsm_mean_comparison() function to perform means comparisons.
Two types of multiple means comparison methods are available:
Single-step methods. These create simultaneous confidence intervals to show how the means differ, and include the Tukey-Kramer, Bonferroni, Dunn-Sidak, Fisher’s LSD and Scheffé methods.
Stepwise methods. These perform the hypothesis tests sequentially, and include the Holm-Bonferroni and Holm-Sidak tests.
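The difference between the two families can be seen on p-value adjustments. The following is a minimal sketch and not the ocstat_dlsm_mean_comparison() implementation: it takes a list of raw pairwise p-values (made up here) and applies the single-step Bonferroni and Dunn-Sidak corrections and the stepwise Holm-Bonferroni and Holm-Sidak corrections. The function names are hypothetical.

```python
def bonferroni(pvals):
    """Single-step: multiply every p-value by the number of comparisons."""
    m = len(pvals)
    return [min(1.0, m * p) for p in pvals]

def sidak(pvals):
    """Single-step Dunn-Sidak adjustment."""
    m = len(pvals)
    return [1.0 - (1.0 - p) ** m for p in pvals]

def holm(pvals, sidak_step=False):
    """Stepwise Holm adjustment (Holm-Sidak when sidak_step is True).

    P-values are visited from smallest to largest; the multiplier shrinks
    with each step, and a running maximum keeps the result monotone.
    """
    m = len(pvals)
    order = sorted(range(m), key=lambda idx: pvals[idx])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        remaining = m - rank  # hypotheses not yet tested
        p = pvals[idx]
        if sidak_step:
            adj = 1.0 - (1.0 - p) ** remaining
        else:
            adj = min(1.0, remaining * p)
        running_max = max(running_max, adj)
        adjusted[idx] = running_max
    return adjusted

raw = [0.01, 0.04, 0.03]  # made-up raw p-values for three comparisons
single_step = bonferroni(raw)
stepwise = holm(raw)
```

In this example the stepwise adjustment is never larger than the single-step one, which is why stepwise procedures are generally more powerful at the same familywise error rate.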
Power Analysis
The power analysis procedure calculates the actual power for the sample data, as well as the hypothetical power if additional sample sizes are specified.
The power of a three-way analysis of variance is a measure of its sensitivity. Power is the probability that the ANOVA will detect differences in the population means when real differences exist. In terms of the null and alternative hypotheses, power is the probability that the test statistic F will be extreme enough to reject the null hypothesis when it should in fact be rejected (i.e. given that the null hypothesis is not true).
The Origin Three-Way ANOVA dialog can compute power for the Factor A, B and C sources. If interaction terms are included in the model, Origin can also compute power for them.
Power is defined by the equation:
$$\mathrm{power} = 1 - \mathrm{probf}(f, df, dfe, nc)$$
where f is the deviate from the non-central F-distribution with df and dfe degrees of freedom and nc = SS/MSE. SS is the sum of squares of the source A, B, C, A*B, A*C, B*C, or A*B*C; MSE is the mean square of the errors; df is the numerator degrees of freedom; and dfe is the error degrees of freedom. All values (SS, MSE, df, and dfe) are obtained from the ANOVA table. The value of probf( ) is obtained using the NAG function nag_prob_non_central_f_dist (g01gdc). See the NAG documentation for more detailed information.
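The power formula can be sketched in pure Python as follows. This is an illustrative stand-in for the NAG routine, not a reimplementation of it. It assumes that f is the upper-alpha critical value of the central F-distribution with df and dfe degrees of freedom, which the text does not state explicitly. The helper names (`betai`, `ncf_cdf`, `f_critical`, `anova_power`) are hypothetical, and the series used is only suitable for moderate values of nc.

```python
import math

def _betacf(a, b, x, max_iter=500, eps=1e-14):
    """Continued fraction for the incomplete beta function (modified Lentz)."""
    tiny = 1e-300
    c, d = 1.0, 1.0 - (a + b) * x / (a + 1.0)
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        even = m * (b - m) * x / ((a + 2 * m - 1) * (a + 2 * m))
        odd = -(a + m) * (a + b + m) * x / ((a + 2 * m) * (a + 2 * m + 1))
        for num in (even, odd):
            d = 1.0 + num * d
            d = 1.0 / (d if abs(d) > tiny else tiny)
            c = 1.0 + num / c
            c = c if abs(c) > tiny else tiny
            h *= d * c
        if abs(d * c - 1.0) < eps:
            break
    return h

def betai(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    front = math.exp(math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                     + a * math.log(x) + b * math.log(1.0 - x))
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _betacf(a, b, x) / a
    return 1.0 - front * _betacf(b, a, 1.0 - x) / b

def ncf_cdf(x, df1, df2, nc, max_terms=1000):
    """Non-central F CDF as a Poisson(nc / 2) mixture of incomplete betas."""
    if x <= 0.0:
        return 0.0
    z = df1 * x / (df1 * x + df2)
    half = nc / 2.0
    total = 0.0
    for j in range(max_terms):
        if j > 0 and half == 0.0:
            break  # central case: only the j = 0 term contributes
        log_w = -half - math.lgamma(j + 1.0) + (j * math.log(half) if j else 0.0)
        w = math.exp(log_w)
        total += w * betai(df1 / 2.0 + j, df2 / 2.0, z)
        if j > half and w < 1e-16:
            break
    return total

def f_critical(alpha, df1, df2):
    """Upper-alpha critical value of the central F distribution (bisection)."""
    lo, hi = 0.0, 1.0
    while ncf_cdf(hi, df1, df2, 0.0) < 1.0 - alpha:
        hi *= 2.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if ncf_cdf(mid, df1, df2, 0.0) < 1.0 - alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def anova_power(ss, mse, df, dfe, alpha=0.05):
    """power = 1 - probf(f, df, dfe, nc) with nc = SS / MSE."""
    nc = ss / mse
    f = f_critical(alpha, df, dfe)
    return 1.0 - ncf_cdf(f, df, dfe, nc)
```

With nc = 0 the non-central distribution reduces to the central one, so the computed power equals alpha. Power then grows as SS/MSE increases.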
The above is a brief outline of the algorithm for three-way analysis of variance. For the detailed mathematical derivation, please refer to the corresponding part of the user's manual.
Levene test for Homogeneity of Variances
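We use the following statistic to perform the Levene test:
$$W = \frac{(N - G)\sum_{g=1}^{G} n_g (\bar{Z}_{g\cdot} - \bar{Z}_{\cdot\cdot})^2}{(G - 1)\sum_{g=1}^{G}\sum_{h=1}^{n_g}(Z_{gh} - \bar{Z}_{g\cdot})^2}$$
where $Z_{gh} = |y_{gh} - \bar{y}_{g\cdot}|$ is the absolute deviation of the $h$th observation in subgroup $g$ from its subgroup mean, $\bar{Z}_{g\cdot}$ is the mean of the $Z_{gh}$ in subgroup $g$, $\bar{Z}_{\cdot\cdot}$ is the mean of all $Z_{gh}$, and $n_g$ is the number of observations in subgroup $g$.
$N$ is the number of observations, and $G$ is the number of subgroups that contain observations.
Then we can get the p-value, which is $P(F(G - 1, N - G) \ge W)$.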
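A Levene-type statistic can be sketched as below. This assumes the common form of the test based on absolute deviations from subgroup means, where each subgroup is one factor-level combination that contains observations. Whether Origin uses exactly this variant is an assumption, and the function name and data are made up.

```python
def levene_w(groups):
    """Levene statistic from absolute deviations about each subgroup mean."""
    groups = [g for g in groups if g]  # keep only subgroups with observations
    n_total = sum(len(g) for g in groups)
    n_groups = len(groups)

    # Z values: absolute deviation of each observation from its subgroup mean.
    z = []
    for g in groups:
        mean = sum(g) / len(g)
        z.append([abs(v - mean) for v in g])

    z_means = [sum(zg) / len(zg) for zg in z]
    z_grand = sum(sum(zg) for zg in z) / n_total

    between = sum(len(zg) * (zm - z_grand) ** 2 for zg, zm in zip(z, z_means))
    within = sum((v - zm) ** 2 for zg, zm in zip(z, z_means) for v in zg)
    return (n_total - n_groups) / (n_groups - 1) * between / within

# Three made-up subgroups with increasing spread.
w = levene_w([[1, 2, 3], [2, 4, 6], [1, 5, 9]])
```

The statistic is then compared against an F distribution with (number of subgroups - 1) and (N - number of subgroups) degrees of freedom to obtain the p-value.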