
© 2026 LIBREUNI PROJECT

Analysis of Variance (ANOVA)

Analysis of Variance (ANOVA) is a collection of statistical models and their associated estimation procedures used to analyze the differences among group means in a sample. ANOVA was developed by statistician and evolutionary biologist Ronald Fisher. In its simplest form, ANOVA provides a statistical test of whether two or more population means are equal, and therefore generalizes the $t$-test beyond two means.

While the $t$-test is limited to comparing two groups, applying multiple $t$-tests across several groups inflates the family-wise Type I error rate (the chance of at least one false positive), which grows rapidly with the number of comparisons. ANOVA controls this error rate by evaluating the entire set of groups simultaneously, partitioning the observed variance in a particular variable into components attributable to different sources of variation.

The Logic of Variance Partitioning

The fundamental mechanism of ANOVA is the partitioning of total variance into two primary components:

  1. Between-Group Variance: The variance of the group means around the grand mean. This reflects the effect of the independent variable(s) plus error.
  2. Within-Group Variance: The variance of individual scores around their respective group means. This reflects pure error (unexplained variance).

If the between-group variance is significantly larger than the within-group variance, it indicates that the independent variable has a significant effect on the dependent variable.

Assumptions of ANOVA

The validity of ANOVA relies on three core assumptions:

  1. Independence of Observations: The residuals must be mutually independent. This is fundamentally a design issue handled through random sampling and random assignment.
  2. Normality: The residuals of the model are normally distributed. While ANOVA is robust to moderate violations of normality (especially with large, equal sample sizes due to the Central Limit Theorem), severe skewness or outliers can compromise the $F$-test.
  3. Homogeneity of Variances (Homoscedasticity): The variances of the populations from which the samples are drawn are equal. This is tested using Levene’s Test or Bartlett’s Test. Welch’s ANOVA can be used if this assumption is heavily violated.

One-Way ANOVA

A One-Way ANOVA involves a single independent variable (factor) with two or more categorical levels (in practice usually three or more, since the two-level case reduces to a $t$-test). The model for an observation $y_{ij}$ (the $i$-th observation in the $j$-th group) is given by:

$$y_{ij} = \mu + \tau_j + \varepsilon_{ij}$$

Where:

  • $\mu$ is the grand mean.
  • $\tau_j$ is the treatment effect for the $j$-th group (where $\sum_j \tau_j = 0$).
  • $\varepsilon_{ij}$ is the random error associated with the $i$-th observation in the $j$-th group, assumed to be $\mathcal{N}(0, \sigma^2)$.

Hypotheses

The null hypothesis ($H_0$) states that all group population means are equal (or equivalently, all treatment effects are zero):

$$H_0: \mu_1 = \mu_2 = \dots = \mu_k \quad \text{or} \quad \tau_1 = \tau_2 = \dots = \tau_k = 0$$

The alternative hypothesis ($H_a$) states that at least one population mean is different:

$$H_a: \exists \ i, j \text{ such that } \mu_i \neq \mu_j$$

Sums of Squares

The Total Sum of Squares ($SST$) is partitioned into the Sum of Squares Between ($SSB$) and the Sum of Squares Within ($SSW$, also known as Error Sum of Squares, $SSE$).

$$SST = SSB + SSW$$

Total Sum of Squares (SST) measures the total variation in the data:

$$SST = \sum_{j=1}^{k} \sum_{i=1}^{n_j} (y_{ij} - \bar{y}_{..})^2$$

where $\bar{y}_{..}$ is the grand mean.

Sum of Squares Between (SSB) measures the variation of group means around the grand mean:

$$SSB = \sum_{j=1}^{k} n_j (\bar{y}_{.j} - \bar{y}_{..})^2$$

where $\bar{y}_{.j}$ is the mean of the $j$-th group and $n_j$ is the number of observations in the $j$-th group.

Sum of Squares Within (SSW) measures the variation of individual observations around their respective group means:

$$SSW = \sum_{j=1}^{k} \sum_{i=1}^{n_j} (y_{ij} - \bar{y}_{.j})^2$$
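The three definitions above can be checked numerically. The following is a minimal sketch in plain Python; the `groups` data and the helper names `mean` and `sums_of_squares` are illustrative assumptions, not part of the original material.

```python
# Hypothetical scores for three groups (illustrative data only).
groups = {
    "A": [4.0, 5.0, 6.0],
    "B": [6.0, 7.0, 8.0],
    "C": [8.0, 9.0, 10.0],
}

def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)

def sums_of_squares(groups):
    """Return (SST, SSB, SSW) for a dict mapping group name -> list of scores."""
    all_scores = [y for scores in groups.values() for y in scores]
    grand_mean = mean(all_scores)
    # Total variation of every score around the grand mean.
    sst = sum((y - grand_mean) ** 2 for y in all_scores)
    # Variation of group means around the grand mean, weighted by group size.
    ssb = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in groups.values())
    # Variation of scores around their own group mean.
    ssw = sum((y - mean(s)) ** 2 for s in groups.values() for y in s)
    return sst, ssb, ssw

sst, ssb, ssw = sums_of_squares(groups)
print(sst, ssb, ssw)  # 30.0 24.0 6.0
```

For this toy data the between-group and within-group components add up to the total, as the partition requires.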

Degrees of Freedom and Mean Squares

Degrees of freedom ($df$) are required to convert sums of squares into variances (mean squares). Let $N$ be the total sample size and $k$ be the number of groups.

  • $df_{Total} = N - 1$
  • $df_{Between} = k - 1$
  • $df_{Within} = N - k$

The Mean Squares ($MS$) are calculated by dividing the Sum of Squares by their respective degrees of freedom:

$$MSB = \frac{SSB}{k - 1}, \qquad MSW = \frac{SSW}{N - k}$$

The F-Statistic

The test statistic for ANOVA is the ratio of the Mean Square Between to the Mean Square Within. Under the null hypothesis, both $MSB$ and $MSW$ are independent estimates of the population variance $\sigma^2$, so their ratio follows an $F$-distribution with $k-1$ and $N-k$ degrees of freedom.

$$F = \frac{MSB}{MSW}$$

If the $F$-statistic is significantly larger than 1 (specifically, greater than the critical value from the $F$-distribution for a given alpha level), the null hypothesis is rejected.

In a One-Way ANOVA with 4 groups and 40 total participants, what are the degrees of freedom for the F-statistic (numerator and denominator)?
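As a worked check using the degrees-of-freedom formulas from the previous section, with $k = 4$ groups and $N = 40$ participants:

```latex
df_{Between} = k - 1 = 4 - 1 = 3, \qquad df_{Within} = N - k = 40 - 4 = 36
```

The statistic is therefore compared against an $F$-distribution with 3 (numerator) and 36 (denominator) degrees of freedom.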

Evaluating Teaching Methods

A university aims to determine if three different teaching methods (Standard Lecture, Flipped Classroom, Problem-Based Learning) result in different final exam scores. 90 students are randomly assigned to the three methods (30 per method). The resulting Sum of Squares Between (SSB) is calculated as 450, and the Sum of Squares Within (SSW) is 2610.

Calculate the Mean Square Between (MSB) and Mean Square Within (MSW).
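The calculation can be sketched in a few lines of Python. The function name `one_way_anova_table` is a hypothetical helper introduced for illustration; the inputs are the values stated in the scenario above.

```python
def one_way_anova_table(ssb, ssw, k, n_total):
    """Build the one-way ANOVA summary from sums of squares.

    ssb, ssw : between- and within-group sums of squares
    k        : number of groups
    n_total  : total number of observations
    """
    df_between = k - 1
    df_within = n_total - k
    msb = ssb / df_between
    msw = ssw / df_within
    return {
        "df_between": df_between,
        "df_within": df_within,
        "MSB": msb,
        "MSW": msw,
        "F": msb / msw,
    }

# Values from the teaching-methods scenario: 3 methods, 90 students.
table = one_way_anova_table(ssb=450.0, ssw=2610.0, k=3, n_total=90)
print(table["MSB"], table["MSW"], table["F"])  # 225.0 30.0 7.5
```

Here MSB = 450 / 2 = 225 and MSW = 2610 / 87 = 30, giving F = 7.5 on 2 and 87 degrees of freedom. Converting this to a p-value requires the F-distribution's survival function, which is not included in this sketch.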

Two-Way ANOVA

A Two-Way ANOVA analyzes the effect of two independent categorical variables (factors) on a continuous dependent variable. It fundamentally differs from running two independent One-Way ANOVAs because it evaluates the interaction effect between the two variables.

The statistical model for a Two-Way ANOVA with factors $A$ and $B$, fixed effects, and with replication ($n$ observations per cell) is:

$$y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk}$$

Where:

  • $y_{ijk}$ is the $k$-th observation in the $i$-th level of factor $A$ and $j$-th level of factor $B$.
  • $\mu$ is the overall population grand mean.
  • $\alpha_i$ is the main effect of factor A at level $i$.
  • $\beta_j$ is the main effect of factor B at level $j$.
  • $(\alpha\beta)_{ij}$ is the interaction effect between level $i$ of A and level $j$ of B.
  • $\varepsilon_{ijk}$ is the random error term, $\sim \mathcal{N}(0, \sigma^2)$.

Interaction Effects

An interaction effect occurs when the effect of one independent variable on the dependent variable changes depending on the level of the other independent variable. Graphically, this is observed when the lines representing the means across levels of factors are not parallel (they may cross or diverge).
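A small numeric sketch may make the "non-parallel lines" idea concrete. The cell means below are hypothetical values for a 2×2 design, and `simple_effect_of_b` is an illustrative helper name, not something defined in the original material.

```python
# Hypothetical cell means for a 2x2 design (illustrative values only).
# Keys are (level of factor A, level of factor B).
cell_means = {
    ("a1", "b1"): 10.0,
    ("a1", "b2"): 12.0,
    ("a2", "b1"): 11.0,
    ("a2", "b2"): 18.0,
}

def simple_effect_of_b(cell_means, a_level):
    """Change in the mean when moving from b1 to b2 at a fixed level of A."""
    return cell_means[(a_level, "b2")] - cell_means[(a_level, "b1")]

effect_at_a1 = simple_effect_of_b(cell_means, "a1")
effect_at_a2 = simple_effect_of_b(cell_means, "a2")

# If the lines were parallel, the two simple effects would be equal
# and this difference of differences would be zero.
interaction_contrast = effect_at_a2 - effect_at_a1
print(effect_at_a1, effect_at_a2, interaction_contrast)  # 2.0 7.0 5.0
```

In this made-up example the effect of B is 2 points at level a1 but 7 points at level a2, so the lines diverge. Whether such a difference is statistically significant is what the interaction test evaluates.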

If the interaction effect is significant, interpreting the main effects (the individual effects of factor $A$ and factor $B$) becomes highly nuanced, as the main effects no longer fully describe the relationship.

Sums of Squares for Two-Way ANOVA

In a balanced design (equal sample sizes in all cells), the total variance is partitioned into four orthogonal components:

$$SST = SSA + SSB + SSAB + SSE$$

Where:

  • SSA: Sum of Squares for Factor A
  • SSB: Sum of Squares for Factor B
  • SSAB: Sum of Squares for the Interaction
  • SSE: Sum of Squares for Error (Within)

Degrees of freedom are similarly partitioned. Let $a$ be the number of levels of Factor A, $b$ be the number of levels of Factor B, and $n$ the number of replicates per cell. Total observations: $N = a \times b \times n$.

  • $df_A = a - 1$
  • $df_B = b - 1$
  • $df_{AB} = (a - 1)(b - 1)$
  • $df_E = ab(n - 1)$

Three distinct $F$-tests are performed by dividing the corresponding Mean Square ($MSA, MSB, MSAB$) by the Mean Square Error ($MSE$):

$$F_A = \frac{MSA}{MSE}, \quad F_B = \frac{MSB}{MSE}, \quad F_{AB} = \frac{MSAB}{MSE}$$

In a Two-Way ANOVA, you are studying the effects of Diet (3 levels) and Exercise (2 levels) on weight loss. You have 10 participants per cell (6 cells total). What are the degrees of freedom for the interaction effect (Diet × Exercise)?
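The degrees-of-freedom formulas for a balanced two-way design can be applied mechanically. This is a minimal sketch; `two_way_df` is a hypothetical helper name, and the inputs are the design sizes from the question above.

```python
def two_way_df(a, b, n):
    """Degrees of freedom for a balanced two-way ANOVA with replication.

    a, b : number of levels of factors A and B
    n    : number of observations per cell
    """
    return {
        "A": a - 1,
        "B": b - 1,
        "AB": (a - 1) * (b - 1),
        "Error": a * b * (n - 1),
        "Total": a * b * n - 1,
    }

# Diet (3 levels) x Exercise (2 levels), 10 participants per cell.
df = two_way_df(a=3, b=2, n=10)
print(df)  # {'A': 2, 'B': 1, 'AB': 2, 'Error': 54, 'Total': 59}
```

The interaction has (3 - 1)(2 - 1) = 2 degrees of freedom, and the four components sum to the total of 59.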

Post-Hoc Tests

A significant ANOVA only tells you that at least two means differ, not which means differ. To identify specific pairwise differences, post-hoc tests are required. Conducting multiple standard $t$-tests inflates the family-wise error rate $\alpha_{FW}$ (the probability of making at least one Type I error across all tests).

$$\alpha_{FW} = 1 - (1 - \alpha)^c$$

where $c$ is the number of comparisons. For 5 groups, there are $c = \frac{5(4)}{2} = 10$ comparisons. If $\alpha = 0.05$ per test, the family-wise error rate jumps to $1 - (0.95)^{10} \approx 0.40$ (assuming independence, which is an oversimplification but illustrates the inflation).
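The inflation can be tabulated for different numbers of groups. This sketch uses the same independence simplification as the formula above; `familywise_error_rate` is an illustrative helper name.

```python
from math import comb

def familywise_error_rate(alpha, n_groups):
    """Return (number of pairwise comparisons, family-wise error rate).

    Assumes the comparisons are independent, which is a simplification.
    """
    c = comb(n_groups, 2)  # number of distinct pairs of groups
    return c, 1 - (1 - alpha) ** c

for k in (3, 5, 8):
    c, fwer = familywise_error_rate(0.05, k)
    print(f"{k} groups: {c} comparisons, family-wise rate ~ {fwer:.3f}")

c5, fwer5 = familywise_error_rate(0.05, 5)
```

For 5 groups this reproduces the 10 comparisons and the roughly 0.40 family-wise rate from the text.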

Common Post-Hoc Adjustments

  1. Tukey’s Honestly Significant Difference (HSD): Compares all possible pairs of means. It is based on the studentized range distribution ($q$) and provides tight control over the family-wise error rate when sample sizes are equal.
  2. Bonferroni Correction: A simple, conservative method. It divides the desired family-wise alpha level by the number of comparisons: $\alpha_{corrected} = \frac{\alpha}{c}$. This guarantees that the family-wise error rate does not exceed $\alpha$, but it reduces statistical power (increasing Type II errors), especially when $c$ is large.
  3. Scheffé’s Method: Used for all possible linear contrasts, not just pairwise comparisons. It is highly flexible, but it is typically the most conservative of the three when only pairwise comparisons are performed.
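As a sketch of how the Bonferroni adjustment behaves in practice, the snippet below applies it to ten comparisons. The p-values in `raw_p` are hypothetical and the helper name `bonferroni_alpha` is an assumption made for illustration.

```python
def bonferroni_alpha(alpha_family, n_comparisons):
    """Per-comparison alpha that keeps the family-wise rate at or below alpha_family."""
    return alpha_family / n_comparisons

# Hypothetical p-values from 10 pairwise comparisons (illustrative only).
raw_p = [0.001, 0.004, 0.006, 0.02, 0.03, 0.04, 0.2, 0.35, 0.6, 0.9]

alpha_per_test = bonferroni_alpha(0.05, len(raw_p))

# Family-wise rate after correction, under the independence simplification.
fwer_after = 1 - (1 - alpha_per_test) ** len(raw_p)

n_sig_unadjusted = sum(p <= 0.05 for p in raw_p)
n_sig_bonferroni = sum(p <= alpha_per_test for p in raw_p)

print(f"per-test alpha = {alpha_per_test:.4f}, family-wise rate ~ {fwer_after:.4f}")
print(n_sig_unadjusted, "significant unadjusted;", n_sig_bonferroni, "after Bonferroni")
```

With these made-up values, six comparisons pass the unadjusted 0.05 threshold but only two pass the corrected threshold of 0.005, which illustrates the loss of power that accompanies the tighter error control.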

Which of the correction methods above is considered the most conservative and provides the lowest statistical power for detecting genuine differences?

Effect Size

The $p$-value from an $F$-test indicates statistical significance but not practical significance. Effect size metrics quantify the magnitude of the differences between groups.

Eta-Squared ($\eta^2$)

Eta-squared represents the proportion of total variance in the dependent variable that is associated with membership in the different groups defined by the independent variable.

$$\eta^2 = \frac{SSB}{SST}$$

While intuitive, $\eta^2$ is an upwardly biased estimator of the population effect size (it tends to overestimate).

Partial Eta-Squared ($\eta_p^2$)

In multi-factor designs (like Two-Way ANOVA), $\eta^2$ can be misleading because the effects of one factor reduce the variance available to be explained by another. Partial eta-squared isolates the variance explained by a specific factor relative to the unexplained variance (error) and the variance of that specific factor.

$$\eta_p^2 = \frac{SS_{effect}}{SS_{effect} + SSE}$$

Omega-Squared ($\omega^2$)

Omega-squared is a more complex but less biased estimator of the population variance explained. It corrects for most of the bias present in $\eta^2$ by incorporating degrees of freedom and Mean Square terms.

$$\omega^2 = \frac{SSB - df_{Between} \times MSW}{SST + MSW}$$
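To see how much the correction matters, the sketch below compares eta-squared and omega-squared using the sums of squares from the teaching-methods scenario earlier in this module (SSB = 450, SSW = 2610, 3 groups, 90 students). The function names are illustrative assumptions.

```python
def eta_squared(ssb, sst):
    """Proportion of total variation associated with group membership."""
    return ssb / sst

def omega_squared(ssb, ssw, k, n_total):
    """Omega-squared for a one-way design, following the formula above."""
    sst = ssb + ssw
    df_between = k - 1
    msw = ssw / (n_total - k)
    return (ssb - df_between * msw) / (sst + msw)

eta2 = eta_squared(450.0, 450.0 + 2610.0)
omega2 = omega_squared(450.0, 2610.0, k=3, n_total=90)
print(f"eta^2 = {eta2:.3f}, omega^2 = {omega2:.3f}")
```

For these values eta-squared is about 0.147 and omega-squared about 0.126, so the corrected estimate is noticeably smaller, consistent with eta-squared overestimating the population effect.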

Interpreting Effect Sizes in a Multi-Factor Design

A researcher conducts a Two-Way ANOVA assessing the impact of Drug Dosage (A) and Therapy (B) on symptom reduction. The output yields the following sums of squares: SSA = 400, SSB = 100, SSAB = 50, SSE = 450. Total SST = 1000.

Calculate the eta-squared (η²) for Drug Dosage (A).
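The requested value, along with partial eta-squared for comparison, can be computed directly from the sums of squares given in the scenario. The dictionary layout and variable names below are illustrative choices.

```python
# Sums of squares from the scenario above.
ss_effects = {"A": 400.0, "B": 100.0, "AB": 50.0}
sse = 450.0
sst = sum(ss_effects.values()) + sse  # 1000.0, matching the stated total

# Eta-squared: each effect relative to the total variation.
eta_sq = {name: ss / sst for name, ss in ss_effects.items()}

# Partial eta-squared: each effect relative to itself plus error only.
partial_eta_sq = {name: ss / (ss + sse) for name, ss in ss_effects.items()}

for name in ss_effects:
    print(f"{name}: eta^2 = {eta_sq[name]:.3f}, partial eta^2 = {partial_eta_sq[name]:.3f}")
```

For Drug Dosage (A), eta-squared is 400 / 1000 = 0.40, while partial eta-squared is 400 / 850, roughly 0.47. The partial version is larger because variation attributed to the other effects is excluded from the denominator.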

Repeated Measures ANOVA

Repeated Measures ANOVA is the equivalent of the one-way ANOVA for related rather than independent groups. It is the extension of the dependent (paired) $t$-test. Examples include measuring the same participants across multiple time points (e.g., blood pressure at baseline, week 1, and week 2) or exposing the same participants to all conditions in an experiment.

The key advantage of Repeated Measures ANOVA is that it removes variance attributable to individual differences from the Error Sum of Squares. This typically makes the analysis much more powerful (higher probability of detecting a true effect) than a standard independent-samples ANOVA.

$$SST = SS_{Between Subjects} + SS_{Within Subjects}$$

$$SS_{Within Subjects} = SS_{Treatment} + SS_{Error}$$
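The two-step partition can be illustrated on a small data set. In this sketch each row of `scores` is one hypothetical participant and each column one condition; the data and the function name `repeated_measures_ss` are assumptions for illustration only.

```python
# Hypothetical data: 4 participants (rows) measured under 3 conditions (columns).
scores = [
    [5.0, 6.0, 8.0],
    [4.0, 6.0, 7.0],
    [6.0, 8.0, 10.0],
    [5.0, 7.0, 8.0],
]

def repeated_measures_ss(scores):
    """Partition total variation for a one-factor repeated measures design."""
    n, k = len(scores), len(scores[0])
    grand = sum(sum(row) for row in scores) / (n * k)
    subject_means = [sum(row) / k for row in scores]
    condition_means = [sum(row[j] for row in scores) / n for j in range(k)]

    sst = sum((y - grand) ** 2 for row in scores for y in row)
    # Stable individual differences: removed from the error term.
    ss_subjects = k * sum((m - grand) ** 2 for m in subject_means)
    ss_treatment = n * sum((m - grand) ** 2 for m in condition_means)
    # Residual left after removing both subject and condition effects.
    ss_error = sum(
        (scores[i][j] - subject_means[i] - condition_means[j] + grand) ** 2
        for i in range(n)
        for j in range(k)
    )
    return {
        "SST": sst,
        "SS_subjects": ss_subjects,
        "SS_treatment": ss_treatment,
        "SS_error": ss_error,
    }

ss = repeated_measures_ss(scores)
ss_within_subjects = ss["SS_treatment"] + ss["SS_error"]
print({name: round(value, 3) for name, value in ss.items()})
```

Because the between-subjects component is separated out, only the residual term serves as the error for testing the treatment effect, which is the source of the design's extra power.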

The Assumption of Sphericity

Repeated measures designs require the assumption of sphericity: the variances of the differences between all pairs of related groups (conditions) must be equal. It is evaluated using Mauchly’s Test of Sphericity.

If the assumption of sphericity is violated (Mauchly’s Test $p < 0.05$), the Type I error rate inflates. To correct this, the degrees of freedom are adjusted downwards by multiplying them by an estimate of $\epsilon$. Common corrections include:

  1. Greenhouse-Geisser Correction: The more conservative of the two corrections. Commonly recommended when sphericity is severely violated ($\epsilon < 0.75$).
  2. Huynh-Feldt Correction: Less conservative, commonly recommended when the sphericity violation is mild ($\epsilon > 0.75$).

When $\epsilon = 1$, the sphericity assumption holds exactly and no adjustment is made; the further $\epsilon$ falls below 1, the more severe the violation. The corrections effectively increase the critical $F$-value required for significance by reducing the degrees of freedom.

A researcher conducts a repeated measures ANOVA and finds that Mauchly's Test is highly significant (p < .001), yielding a Greenhouse-Geisser epsilon (ε) of 0.52. What action should be taken?
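With an epsilon of 0.52, well below 0.75, the guideline above points to the Greenhouse-Geisser correction. The sketch below shows how the degrees of freedom would be scaled; the epsilon comes from the question, but the number of conditions (`k = 4`) and participants (`n = 20`) are hypothetical because the question does not state them, and `greenhouse_geisser_df` is an illustrative helper name.

```python
def greenhouse_geisser_df(epsilon, k, n):
    """Scale repeated measures degrees of freedom by epsilon.

    epsilon : estimated sphericity correction factor (at most 1)
    k       : number of repeated conditions
    n       : number of participants
    """
    df_treatment = k - 1
    df_error = (k - 1) * (n - 1)
    return epsilon * df_treatment, epsilon * df_error

# epsilon from the question; k and n are assumed values for illustration.
df1, df2 = greenhouse_geisser_df(0.52, k=4, n=20)
print(f"uncorrected df: 3 and 57; corrected df: {df1:.2f} and {df2:.2f}")
```

The F-statistic itself is unchanged. Only the reference distribution shifts to the smaller, fractional degrees of freedom, which raises the critical value.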
