H0: mu1 = mu2 = ... = mug
Ha: not all of the mui are equal
by calculating an F test statistic to compare to an F distribution. The F test statistic is a ratio of a measure of variability among sample means to a measure of variability within samples. F statistics much larger than one are evidence in favor of different population means.
g = number of groups
ni = sample size of ith group
N = sum of the g sample sizes (total number of observations)
xbari = mean of ith sample
xbar = grand mean (mean of all observations)
si = standard deviation of ith sample
SSamong = sum of squares among sample means
SSwithin = sum of squares within samples
dfamong = degrees of freedom among sample means
dfwithin = degrees of freedom within samples
MSamong = mean square among sample means
MSwithin = mean square within samples
The sums of squares are the most tedious objects to calculate from the raw data.
SSamong = sum ( ni (xbari - xbar)^2 ) is a weighted measure of the squared deviations of the sample means from the grand mean with weights equal to the sample sizes.
SSwithin = sum ( (ni-1) si^2 ) is a weighted measure of the squared deviations of individual observations from their sample means.
SStotal = SSamong + SSwithin is the sum of squared deviations of the individual observations from the grand mean.
dfamong = g-1 is the degrees of freedom among the g sample means.
dfwithin = N-g = sum ( ni - 1 ) is the sum of the degrees of freedom within each sample.
Each mean square is the sum of squares divided by the corresponding degrees of freedom: MSamong = SSamong / dfamong and MSwithin = SSwithin / dfwithin.
The F statistic is
F = MSamong / MSwithin
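As a minimal sketch of the calculation described above, the following Python code computes the sums of squares, degrees of freedom, mean squares, and F statistic from raw data. The function name one_way_anova and the three small groups are made up for illustration; they are not the cuckoo egg data.

```python
from statistics import mean, variance

def one_way_anova(groups):
    """One-way ANOVA quantities for a list of samples (hypothetical helper).

    Each sample needs at least two observations so that its
    standard deviation is defined.
    """
    g = len(groups)                           # number of groups
    n = [len(x) for x in groups]              # sample sizes ni
    N = sum(n)                                # total number of observations
    xbar_i = [mean(x) for x in groups]        # sample means xbari
    var_i = [variance(x) for x in groups]     # sample variances si^2
    xbar = sum(sum(x) for x in groups) / N    # grand mean

    # Sums of squares among sample means and within samples.
    ss_among = sum(ni * (m - xbar) ** 2 for ni, m in zip(n, xbar_i))
    ss_within = sum((ni - 1) * v for ni, v in zip(n, var_i))

    # Total sum of squares computed directly, to check the decomposition.
    ss_total = sum((obs - xbar) ** 2 for x in groups for obs in x)

    df_among, df_within = g - 1, N - g
    ms_among = ss_among / df_among
    ms_within = ss_within / df_within

    return {
        "ss_among": ss_among, "ss_within": ss_within, "ss_total": ss_total,
        "df_among": df_among, "df_within": df_within,
        "ms_among": ms_among, "ms_within": ms_within,
        "F": ms_among / ms_within,
    }

# Hypothetical data: three groups of three observations each.
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
result = one_way_anova(groups)
for name, value in result.items():
    print(name, value)
```

For these made-up groups the sample means are 2, 3, and 6, so SSamong = 26, SSwithin = 6, SStotal = 32, and F = 13 / 1 = 13, up to floating-point rounding.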
The p-value is the area to the right of the F statistic under an F distribution with g-1 and N-g degrees of freedom.
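The right-tail area can be sketched in Python using only the standard library. This is one possible approach, not the method any particular statistics package uses: it relies on the identity that the area to the right of F under an F distribution with d1 and d2 degrees of freedom equals the regularized incomplete beta function I_x(d2/2, d1/2) at x = d2 / (d2 + d1 F), and evaluates that function with a standard continued-fraction expansion. The function names below are assumptions chosen for this example.

```python
from math import exp, lgamma, log

def _beta_cf(a, b, x, max_iter=300, eps=1e-14):
    """Continued fraction for the incomplete beta function (Lentz's method)."""
    tiny = 1e-300
    c = 1.0
    d = 1.0 - (a + b) * x / (a + 1.0)
    if abs(d) < tiny:
        d = tiny
    d = 1.0 / d
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        # Even step of the recurrence.
        aa = m * (b - m) * x / ((a - 1.0 + m2) * (a + m2))
        d = 1.0 + aa * d
        if abs(d) < tiny:
            d = tiny
        c = 1.0 + aa / c
        if abs(c) < tiny:
            c = tiny
        d = 1.0 / d
        h *= d * c
        # Odd step of the recurrence.
        aa = -(a + m) * (a + b + m) * x / ((a + m2) * (a + 1.0 + m2))
        d = 1.0 + aa * d
        if abs(d) < tiny:
            d = tiny
        c = 1.0 + aa / c
        if abs(c) < tiny:
            c = tiny
        d = 1.0 / d
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h

def reg_inc_beta(a, b, x):
    """Regularized incomplete beta function I_x(a, b) for 0 <= x <= 1."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    front = exp(lgamma(a + b) - lgamma(a) - lgamma(b)
                + a * log(x) + b * log(1.0 - x))
    # Use the symmetry relation where the continued fraction converges faster.
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _beta_cf(a, b, x) / a
    return 1.0 - front * _beta_cf(b, a, 1.0 - x) / b

def f_right_tail(f_stat, d1, d2):
    """Area to the right of f_stat under an F(d1, d2) distribution."""
    if f_stat <= 0.0:
        return 1.0
    x = d2 / (d2 + d1 * f_stat)
    return reg_inc_beta(d2 / 2.0, d1 / 2.0, x)

# Example call with g - 1 = 5 and N - g = 114 degrees of freedom.
p_value = f_right_tail(10.3877, 5, 114)
print(p_value)
```

When d1 = 2 the tail area has the closed form (1 + 2F/d2)^(-d2/2), which gives a convenient check on the routine.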
             Df Sum of Sq Mean Sq F Value        Pr(F)
birdSpecies   5  42.93965 8.58793 10.3877 3.152104e-08
  Residuals 114  94.24835 0.82674
In the style of the textbook, the same table would look like this.
               SS   df       MS        F       p-value
------------------------------------------------------
among    42.93965    5  8.58793  10.3877  3.152104e-08
within   94.24835  114  0.82674
------------------------------------------------------
total   137.188    119
We can estimate the size of a typical deviation of an observation from its sample mean.
sqrt ( MSwithin ) = sqrt( 0.82674 ) = 0.91. Notice that this is in the range of the six sample standard deviations.
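The arithmetic in the table can be checked with a few lines of Python. This is only a sketch that starts from the sums of squares and degrees of freedom reported above; the variable names are chosen for this example.

```python
from math import sqrt

# Sums of squares and degrees of freedom taken from the ANOVA table.
ss_among, df_among = 42.93965, 5
ss_within, df_within = 94.24835, 114

# Each mean square is a sum of squares divided by its degrees of freedom.
ms_among = ss_among / df_among
ms_within = ss_within / df_within

# F statistic, totals, and the typical deviation within a sample.
f_stat = ms_among / ms_within
ss_total = ss_among + ss_within
df_total = df_among + df_within
typical_deviation = sqrt(ms_within)

print("MSamong  =", round(ms_among, 5))
print("MSwithin =", round(ms_within, 5))
print("F        =", round(f_stat, 4))
print("SStotal  =", round(ss_total, 3), "on", df_total, "df")
print("sqrt(MSwithin) =", round(typical_deviation, 2))
```

The results agree with the table entries to the number of digits shown there.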
The small p-value is very strong evidence against the null hypothesis. There is substantial evidence that the mean cuckoo egg sizes are not all the same across the different subpopulations.
Bret Larget, larget@mathcs.duq.edu