A descriptive measure computed from the data in a population is called a parameter.
In practice, the values of parameters are usually not known. We will usually calculate statistics from data that we have sampled, and then, on the basis of the data in the samples, make claims about the parameters which describe the population from which we sampled the data.
The remainder of this section gives formulas for calculating statistics and parameters. The notation differs between samples and populations, and the formulas for measures of spread differ slightly as well.
The mean is

    The sum of all the observations
    -------------------------------
      The number of observations

The notation for the sample mean is an x with a bar over it.
The notation for the population mean is the Greek letter mu.
The mean is the "balancing point" of a group of numbers.
Example: The mean of the numbers
    4 6 2 9 2
is 23/5 = 4.6.
It is usually appropriate to round off the value for the mean with one more place of accuracy than the original data.
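The arithmetic in the example above can be checked with a few lines of Python. This is only a sketch: the variable names are illustrative, and rounding to one decimal place is chosen because the data are whole numbers.

```python
# The five observations from the example.
data = [4, 6, 2, 9, 2]

total = sum(data)        # sum of all the observations: 23
n = len(data)            # number of observations: 5
sample_mean = total / n  # 23 / 5

# The data are whole numbers, so report one decimal place.
rounded_mean = round(sample_mean, 1)
print(rounded_mean)  # 4.6
```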
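The median is the middle value of a list after the numbers have been put in order. For a list of n numbers, it is the number in position (n+1)/2 of the ordered list. When n is even, (n+1)/2 falls halfway between two positions, and the median is the average of the two middle numbers.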
Example: The median of the numbers
    4 6 2 9 2
is 4, since 4 is the middle number after they have been ordered (2 2 4 6 9).
Also, (5+1)/2 = 3, and 4 is the third number in the ordered list.
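The position rule can be sketched in Python as follows. The function name median_by_position is made up for this illustration. For an even number of values it averages the two middle values, which is the standard convention.

```python
def median_by_position(values):
    """Return the median by locating position (n+1)/2 in the ordered list."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty list is undefined")
    if n % 2 == 1:
        # Odd n: (n+1)/2 is a whole number. Positions count from 1,
        # Python indexes from 0, so subtract 1.
        position = (n + 1) // 2
        return ordered[position - 1]
    # Even n: (n+1)/2 falls between positions n/2 and n/2 + 1,
    # so average the two middle values.
    lower = ordered[n // 2 - 1]
    upper = ordered[n // 2]
    return (lower + upper) / 2


print(median_by_position([4, 6, 2, 9, 2]))  # 4
```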
Example:
    4 5 6
has a mean of 5, a typical value in this sample, while
    4 5 600
has a mean of 203, which is not very typical of its sample.
The median is robust to outliers, and its value can almost always be thought of as being typical. The median in both examples above is 5.
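The two samples can be compared with Python's standard statistics module. This is a minimal check of the numbers quoted above, and the variable names are illustrative.

```python
import statistics

typical = [4, 5, 6]
with_outlier = [4, 5, 600]

# The single outlier 600 pulls the mean from 5 up to 203.
mean_typical = statistics.mean(typical)
mean_outlier = statistics.mean(with_outlier)

# The median stays at the middle value, 5, in both samples.
median_typical = statistics.median(typical)
median_outlier = statistics.median(with_outlier)

print(mean_typical, mean_outlier)      # 5 203
print(median_typical, median_outlier)  # 5 5
```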
The mean and the median are different measures of the center of a distribution. If the distribution is symmetric, then they will be in the same place. If the distribution is skewed to the right, then the mean will typically be larger than the median. If the distribution is skewed to the left, then the mean will typically be smaller than the median.
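As a small illustration, the three data sets below are made up for this sketch (they do not come from the text) so that one is symmetric, one has a long right tail, and one has a long left tail.

```python
import statistics

# Hypothetical data sets chosen only to illustrate the three shapes.
symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 2, 2, 3, 12]    # long tail toward large values
left_skewed = [1, 10, 11, 11, 12]  # long tail toward small values

for name, data in [("symmetric", symmetric),
                   ("right-skewed", right_skewed),
                   ("left-skewed", left_skewed)]:
    print(name, statistics.mean(data), statistics.median(data))
```

Here the symmetric set has mean and median both equal to 3, the right-skewed set has mean 4 and median 2, and the left-skewed set has mean 9 and median 11.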
An advantage of the median over the mean is that it is less susceptible to the effects of outliers, and is thus more likely to be close to a "typical" value for skewed distributions.
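An advantage of the mean over the median is that it is easier to compute, since it depends only on the sum of the data and the number of observations, not on the entire set of data. With large sets of data, a computer can find the mean in a single pass while keeping only a running total and a count, whereas finding the median requires holding the data and putting them in order.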
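The point that the mean depends only on the sum can be sketched as a one-pass computation. The function name running_mean is hypothetical. It reads values one at a time and never stores the list, which would not be enough information to find a median.

```python
def running_mean(stream):
    """Compute the mean in one pass, keeping only a total and a count."""
    total = 0.0
    count = 0
    for value in stream:
        total += value
        count += 1
    if count == 0:
        raise ValueError("mean of no observations is undefined")
    return total / count


# Works on any iterable, including one that yields values lazily.
print(running_mean(iter([4, 6, 2, 9, 2])))  # 4.6
```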
Also, the mean allows one to find the total.
Example: If the mean of ten numbers is 15.7, then the total of the numbers is 157.
If the median of ten numbers is 15.7, then we cannot specify the total.
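A short sketch makes the contrast concrete. The two lists below are invented for illustration: both have ten numbers and a median of 15.7, yet their totals differ, so the median alone cannot determine the total.

```python
import math
import statistics

# Mean times count recovers the total.
n = 10
mean_value = 15.7
total_from_mean = mean_value * n
print(math.isclose(total_from_mean, 157))  # True

# Two hypothetical lists of ten numbers with the same median, 15.7.
list_a = [10, 11, 12, 13, 15.7, 15.7, 17, 18, 19, 20]
list_b = [0, 0, 0, 0, 15.7, 15.7, 100, 100, 100, 100]

print(statistics.median(list_a), statistics.median(list_b))  # 15.7 15.7
print(sum(list_a) == sum(list_b))  # False
```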
Bret Larget, larget@mathcs.duq.edu