A bucket-of-colored-balls model. We can imagine two buckets representing the populations of urban students and rural students. Individuals who have had a flu shot are represented by red balls and those who have not by white balls. The proportions of red balls in the urban and rural buckets are p1 and p2, respectively. We can take a random sample of balls from each bucket, count the number of red balls in each sample, and use these counts to make statistical inferences about p1 and p2.
A confidence interval has the general form
(estimate) ± (margin of error)
or
(estimate) ± (multiplier)(standard error)
The standard error is the standard deviation of the sampling distribution of the estimate.
For the difference between two population proportions, a confidence interval for p1 - p2 is
p-hat1 - p-hat2 ± z* sqrt( p-hat1(1-p-hat1)/n1 + p-hat2(1-p-hat2)/n2 ).
Notice that the standard error uses the sample proportions in place of the true population proportions, because the population proportions are unknown. The value of z* is chosen so that the area under the standard normal curve between -z* and z* is the desired confidence level. Some common choices are:
Confidence Level | z* |
---|---|
90% | 1.645 |
95% | 1.960 |
99% | 2.576 |
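The multipliers in the table can be recovered from the standard normal distribution. The following is a minimal Python sketch, assuming the standard library's `statistics.NormalDist`; the helper name `z_star` is hypothetical and chosen only for illustration.

```python
from statistics import NormalDist

def z_star(confidence_level):
    """Return z* so that the area between -z* and z* equals confidence_level."""
    # The two tails share the leftover area equally, so z* is the
    # (1 + C)/2 quantile of the standard normal distribution.
    return NormalDist().inv_cdf((1 + confidence_level) / 2)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}  z* = {z_star(level):.3f}")
```

Other confidence levels, such as 98%, can be handled the same way by passing 0.98.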
Example: Suppose that in a sample of 65 urban students, 52 have had a flu shot and in a sample of 65 rural students, 30 have had a flu shot. The two sample proportions are 0.800 and 0.462. (In general, use at least three significant digits for proportions and round off the final answers.)
Notice that in the urban group, 13 students have not had a flu shot and in the rural group, 35 have not. The numbers 52, 13, 30, and 35 are all greater than 5, so confidence intervals based on normal sampling distributions are valid.
A 95% confidence interval for the difference in population proportions is
(0.800 - 0.462) ± 1.96 sqrt( (0.800)(0.200)/65 + (0.462)(0.538)/65 )
or
0.34 ± 0.16
(It is generally good to round a margin of error up to two significant digits and then round the estimate to the same accuracy.) We can be 95% confident that the proportion of urban students with flu shots is between 18 and 50 percentage points higher than the proportion of rural students with flu shots.
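The arithmetic in this example can be checked with a short Python sketch. It assumes the unpooled standard error from the confidence interval formula above; the function name `two_proportion_ci` is hypothetical.

```python
from math import sqrt

def two_proportion_ci(x1, n1, x2, n2, z_star=1.96):
    """Return (estimate, margin of error) for p1 - p2."""
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    # Unpooled standard error: each sample contributes its own variance term.
    se = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    return p1_hat - p2_hat, z_star * se

# Flu shot example: 52 of 65 urban students, 30 of 65 rural students.
estimate, margin = two_proportion_ci(52, 65, 30, 65)
print(f"estimate = {estimate:.3f}, margin of error = {margin:.3f}")
print(f"interval: {estimate - margin:.3f} to {estimate + margin:.3f}")
```

The printed values are unrounded; the rounding convention described above is applied by hand afterward.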
In a study, 870 HIV patients were randomly assigned to two treatment regimens, with 435 patients in each group. The first group received AZT while the second received a placebo. After a period of years, 17 individuals in the AZT group had developed AIDS, compared to 38 individuals in the placebo group.
We see an observed difference of 21 individuals. The basic question is, if the two population proportions were equal, would a difference this large likely occur by chance alone?
We can measure the difference between what actually occurs and what we expect to occur by calculating the probability of seeing an outcome at least as extreme as what actually occurs if we were to do the entire experiment again, assuming our original hypothesis is correct. This original hypothesis, here that the two population proportions are equal, is called the null hypothesis, and the probability is called a p-value. The smaller the p-value, the more evidence there is that the null hypothesis is incorrect.
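The idea of repeating the entire experiment under the original hypothesis can be sketched as a simulation. This is an illustration under stated assumptions, not the method used in these notes: it assumes both groups share the combined observed rate (17 + 38)/870, it counts only differences in the direction actually observed (fewer cases in the first group), and the function name, the number of repetitions, and the seed are arbitrary choices.

```python
import random

def simulate_p_value(x1, x2, n, num_sims=2000, seed=1):
    """Estimate how often chance alone gives a difference as extreme as observed."""
    rng = random.Random(seed)
    common_p = (x1 + x2) / (2 * n)  # assumed shared rate if the groups do not differ
    observed_diff = x1 - x2         # 17 - 38 = -21 in the AZT study
    hits = 0
    for _ in range(num_sims):
        # Redo the experiment: each of the n patients per group develops
        # AIDS independently with probability common_p.
        sim1 = sum(rng.random() < common_p for _ in range(n))
        sim2 = sum(rng.random() < common_p for _ in range(n))
        if sim1 - sim2 <= observed_diff:
            hits += 1
    return hits / num_sims

p_value = simulate_p_value(17, 38, 435)
print(f"simulated one-sided p-value: {p_value:.4f}")
```

Because the estimate comes from a finite number of simulated experiments, it varies with the seed and is only a rough approximation.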
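A hypothesis test then consists of these parts: stating the null and alternative hypotheses, calculating a test statistic, finding the p-value, and interpreting the result in the context of the problem.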
We can now apply these ideas to the example problem.
H0: p1 = p2
Ha: p1 < p2
Under our null hypothesis, the two population proportions are equal. If this is true, our best estimate of the common proportion is p-bar = (17 + 38) / (435 + 435), or p-bar = 0.0632. If the null hypothesis is true, the test statistic
z = (p-hat1 - p-hat2) / sqrt( p-bar(1-p-bar)/n1 + p-bar(1-p-bar)/n2 )
has approximately a standard normal distribution. Plugging in the sample values, we find z = (0.0391 - 0.0874)/sqrt((0.0632)(0.9368)/435 + (0.0632)(0.9368)/435), or z = -2.93.
The alternative hypothesis is p1 < p2. This is a one-sided test. The smaller the difference, the more evidence there is against the null hypothesis. If the null hypothesis is true, the probability of observing a difference as small as we actually observed is the area to the left of -2.93 under a standard normal curve, or 0.0017.
If AZT were no better than a placebo, we would see AZT do this well compared to a placebo in fewer than 2 out of 1000 experiments. This is strong evidence that AZT reduces the chance of developing AIDS.
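The test statistic and p-value for the AZT example can be reproduced with a short Python sketch. It assumes the pooled standard error from the formula above and a left-tail p-value for Ha: p1 < p2; the function name `pooled_two_proportion_z` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def pooled_two_proportion_z(x1, n1, x2, n2):
    """Return (z, left-tail p-value) for H0: p1 = p2 against Ha: p1 < p2."""
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    # Under H0 both groups share one proportion, estimated by pooling the samples.
    p_bar = (x1 + x2) / (n1 + n2)
    se = sqrt(p_bar * (1 - p_bar) / n1 + p_bar * (1 - p_bar) / n2)
    z = (p1_hat - p2_hat) / se
    # Area to the left of z under the standard normal curve.
    p_value = NormalDist().cdf(z)
    return z, p_value

# AZT study: 17 of 435 in the AZT group, 38 of 435 in the placebo group.
z, p_value = pooled_two_proportion_z(17, 435, 38, 435)
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```

The result should agree with the hand calculation above up to rounding. For a two-sided alternative, one would instead double the smaller tail area.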
Bret Larget, larget@mathcs.duq.edu