Understanding Confidence Intervals
Prepared by: Edward Waltz, Ph.D.
University at Albany School of Public Health; email@example.com
Graphic 1: At least two measures are needed to describe the distribution of any statistical variable:
(a) A measure of central tendency, i.e., where is the 'center' of the distribution?
(b) A measure of dispersion, i.e., how much variation exists in the distribution?
Statisticians have devised a number of different ways of characterizing the central tendency and
dispersion of a distribution. Some of the more common of these are summarized in Graphic 1.
Most people have a commonsense understanding of the average (or mean); the median and
mode are somewhat less widely known.
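The three measures of central tendency mentioned above can be computed directly with Python's standard `statistics` module. The data values below are an assumed example, not taken from the handout:

```python
# Three measures of central tendency, using Python's standard library.
import statistics

data = [2, 3, 3, 5, 7, 10, 12]  # hypothetical example data

print(statistics.mean(data))    # arithmetic average -> 6.0
print(statistics.median(data))  # middle value when sorted -> 5
print(statistics.mode(data))    # most frequent value -> 3
```

Note that the three measures can differ substantially when a distribution is skewed, which is one reason statisticians report more than one of them.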
Measures of dispersion can be as simple as the range. Most measures of variation that
we use are derived in some way from the statistical variance. The standard deviation, for
example, is simply the square root of the variance. Ultimately, in this training, we are
interested in understanding 95% confidence intervals.
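The relationship between variance and standard deviation stated above can be verified in a few lines. The data values here are an assumed example:

```python
# The standard deviation is the square root of the variance.
import math
import statistics

data = [4, 8, 6, 5, 3, 7]  # hypothetical example data

var = statistics.variance(data)  # sample variance (n - 1 denominator)
sd = statistics.stdev(data)      # sample standard deviation

# stdev is simply the square root of the variance
assert math.isclose(sd, math.sqrt(var))
print(var, sd)  # 3.5 and its square root, about 1.87
```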
Graphic 2: In this graphic we see that we can sample a large population multiple times. Each
of our samples has an average and a variance associated with it. If we aggregate all of the
averages from these samples, we can create a sampling distribution of the mean for the
population. The standard deviation of this distribution is more properly known as the standard
error of the mean (SEM or, simply, the standard error, SE).
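The idea of repeatedly sampling a population and collecting the sample means can be sketched as a small simulation. This is an assumed illustration (the population parameters and sample sizes are invented): the spread of the simulated sample means should come close to the analytic standard error, population SD divided by the square root of n.

```python
# Simulate a sampling distribution of the mean and compare its spread
# (the empirical standard error) with the analytic value sd / sqrt(n).
import math
import random
import statistics

random.seed(1)

# Hypothetical population: 100,000 values, mean 50, SD 10.
population = [random.gauss(50, 10) for _ in range(100_000)]

n = 100            # size of each sample
sample_means = []
for _ in range(2000):
    sample = random.sample(population, n)
    sample_means.append(statistics.mean(sample))

empirical_se = statistics.stdev(sample_means)            # SD of the sample means
analytic_se = statistics.pstdev(population) / math.sqrt(n)  # sd / sqrt(n), about 1.0

print(round(empirical_se, 2), round(analytic_se, 2))  # the two should be close
```

The standard error shrinks as the sample size grows, which is why larger samples yield narrower confidence intervals.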
Graphic 3: When our sample is based on at least 100 observations, the distribution of means
resembles a normal distribution. This allows us to use standard errors to calculate the 95%
confidence intervals. (See the manual for the technique used with smaller samples.) In a
normal distribution, the area under the curve is a function of the number of standard deviations
from the mean. For example, the range from one standard deviation below the mean to one
standard deviation above contains about 68% of the total area. Phrased another way, this
means that about 68% of