Math 225 Course Notes
Chapter 7
Hypothesis testing is a formal way of using data and statistical reasoning
to answer whether or not statements about populations are plausible.
The basic procedure is to:
- Assume something specific about a population.
As an example, assume that the mean effectiveness of a new drug
is the same as the mean effectiveness of an old drug.
- Find a test statistic and the corresponding sampling distribution.
In our example, we would want to compare the effectiveness of the two drugs
on two different samples by comparing sample means.
Under the hypothesis of no difference in effectiveness,
we expect the difference in the sample means to be nearly zero,
although there will almost certainly be some chance variation.
Knowledge of the sampling distribution for the difference in sample means
will allow us to determine whether an observed deviation from zero
is consistent with chance variation or is a truly unusual occurrence.
- Find a p-value.
A p-value is the chance that,
if the assumed hypothesis were true
and we were to repeat the experiment on different randomly chosen samples,
we would get a result at least as extreme as the one actually observed.
A small p-value indicates that something rare happened ...
if we maintain our belief in our assumed hypothesis.
This is evidence against the assumed hypothesis since rare events happen
rarely.
Alternatively, a large p-value indicates that what was observed
could reasonably be explained by chance variation.
This does not confirm that the assumed hypothesis is true,
but merely indicates that it is one of possibly several reasonable explanations
for the observed data.
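The three steps above can be sketched in a few lines of Python.
This is only an illustration under stated assumptions, not the method used in the course:
the effectiveness scores are made up,
and the sampling distribution is found by an exact permutation test
(every way of splitting the pooled data into two groups)
rather than by a formula.
The function name permutation_p_value is hypothetical.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(sample_a, sample_b):
    """Exact two-sided permutation p-value for a difference in sample means.

    Under the hypothesis of no difference, the group labels are arbitrary,
    so every split of the pooled data into groups of the same sizes is
    equally likely. The p-value is the fraction of splits whose difference
    in means is at least as extreme as the one actually observed.
    """
    pooled = list(sample_a) + list(sample_b)
    n_a = len(sample_a)
    n_b = len(pooled) - n_a
    total = sum(pooled)

    # Test statistic: absolute difference in sample means.
    observed = abs(mean(sample_a) - mean(sample_b))

    as_extreme = 0
    n_splits = 0
    for indices in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in indices)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        n_splits += 1
        # Small tolerance so ties with the observed value are counted.
        if diff >= observed - 1e-12:
            as_extreme += 1
    return as_extreme / n_splits


# Hypothetical effectiveness scores, invented for illustration only.
new_drug = [12, 15, 14, 16, 13]
old_drug = [10, 11, 9, 12, 10]

p_value = permutation_p_value(new_drug, old_drug)
print(f"observed difference in means: {mean(new_drug) - mean(old_drug):.2f}")
print(f"p-value: {p_value:.4f}")
```

Enumerating every split is practical only for small samples;
for larger ones, a random subset of splits or a formula for the sampling distribution
would normally be used instead.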
Deciding whether a p-value is "large" or "small" is certainly subjective,
and should be determined by the context of the problem
and the consequences of any decisions based on this determination.
It is common (but questionable) practice in the health sciences
to compare p-values to arbitrary fixed significance levels
such as 0.05 or 0.01.
When p-values fall below these levels,
results are called
statistically significant or
highly statistically significant, respectively.
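As a small sketch of this convention, the comparison could be written as follows.
The function name describe_significance and the label
"not statistically significant" are assumptions made for illustration;
the default levels are the 0.05 and 0.01 mentioned above,
and, as noted, they are arbitrary.

```python
def describe_significance(p_value, significant=0.05, highly_significant=0.01):
    """Label a p-value using fixed (and arbitrary) significance levels."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("a p-value must lie between 0 and 1")
    if p_value < highly_significant:
        return "highly statistically significant"
    if p_value < significant:
        return "statistically significant"
    # Assumed label for p-values that do not fall below either level.
    return "not statistically significant"


for p in (0.2, 0.03, 0.005):
    print(p, "->", describe_significance(p))
```

Passing different levels as arguments is one way to let the context of the problem,
rather than habit, determine the cutoffs.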
In addition to considering the statistical significance of a result,
one should also look at its practical importance.
A large study might very well produce a low p-value,
indicating that a drug is more effective.
However, the increase in effectiveness, while real, may be of little
practical importance or not worth an increase in cost or side effects.
In contrast,
an observed difference might be of great practical importance,
but result from a small study not powerful enough
to convincingly demonstrate that the result is not due to
chance variation.
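The contrast between statistical significance and practical importance
can be illustrated with a rough calculation.
The sketch below assumes two groups of equal size,
a known common standard deviation,
and a normal sampling distribution for the difference in sample means;
all of the numbers are invented for illustration,
and the function name two_sided_p_value is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_p_value(mean_diff, sd, n_per_group):
    """Two-sided p-value for a difference in means of two equal-sized groups.

    Assumes a known common standard deviation `sd` and a normal sampling
    distribution for the difference in sample means.
    """
    standard_error = sd * sqrt(2.0 / n_per_group)
    z = mean_diff / standard_error
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


# Large study, tiny difference: half a point on a scale whose sd is 10.
large_study_p = two_sided_p_value(mean_diff=0.5, sd=10.0, n_per_group=20000)

# Small study, large difference: five points on the same scale.
small_study_p = two_sided_p_value(mean_diff=5.0, sd=10.0, n_per_group=10)

print(f"large study, small difference: p = {large_study_p:.2g}")
print(f"small study, large difference: p = {small_study_p:.2g}")
```

Under these assumptions the large study gives a very small p-value
for a difference that may not matter in practice,
while the small study gives a p-value well above 0.05
for a difference ten times as large.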
Last modified: Mar 18, 1996
Bret Larget,
larget@mathcs.duq.edu