Math 225 Course Notes

Chapter 9


The Big Picture

Regression is a general technique for analyzing multivariate data. In this course, we will restrict discussion to the case where we wish to predict a quantitative response variable on the basis of a single quantitative explanatory variable. Furthermore, we will assume that the relationship between the two variables is summarized well by a straight line. This does not mean that all observed data must lie exactly on a line. It does mean that a straight line through the center of a scatterplot of the points describes the trend of the data well, and that drawing a curve through the data instead would not substantially improve the description.

Correlation is a measure of the strength of the linear relationship between two variables, on a scale from -1 to 1. The sign gives the direction of the relationship, and the strength of the relationship increases as the correlation moves away from zero in either direction.
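As a rough illustration of how the correlation coefficient behaves at the ends and middle of its scale, here is a minimal Python sketch. The function name `correlation` and the small data sets are made up for this illustration and are not part of the course materials.

```python
import math

def correlation(xs, ys):
    """Sample correlation coefficient r for paired data (hypothetical helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squares and cross-products of deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up data: points exactly on a rising line, a falling line,
# and a pattern with no linear trend.
r_up = correlation([1, 2, 3], [2, 4, 6])
r_down = correlation([1, 2, 3], [6, 4, 2])
r_none = correlation([1, 2, 3], [1, 2, 1])
```

Under these assumptions, `r_up` comes out at (essentially) 1, `r_down` at -1, and `r_none` at 0, which matches the description of the scale above.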

Correlation and ordinary least squares (OLS) regression are closely related. OLS is one of several methods available for finding the "best" line to describe a set of points. It has theoretical and computational advantages that make it the most widely used method. The result of OLS is a fitted regression line. This line can be found from the mean and standard deviation of each variable together with the correlation coefficient.
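The relationship between the summary statistics and the fitted line can be sketched in Python. This sketch uses the standard result that the OLS slope equals the correlation times the ratio of the standard deviations (response over explanatory), and that the line passes through the point of means. The function name `fitted_line` and the data are hypothetical, chosen only for illustration.

```python
import statistics

def fitted_line(xs, ys):
    """Slope and intercept of the OLS line, built from summary statistics.

    Hypothetical helper: xs is the explanatory variable, ys the response.
    """
    n = len(xs)
    mean_x = statistics.mean(xs)
    mean_y = statistics.mean(ys)
    sd_x = statistics.stdev(xs)  # sample standard deviation (divides by n - 1)
    sd_y = statistics.stdev(ys)
    # Correlation as the average product of standardized values
    r = sum(((x - mean_x) / sd_x) * ((y - mean_y) / sd_y)
            for x, y in zip(xs, ys)) / (n - 1)
    slope = r * sd_y / sd_x
    intercept = mean_y - slope * mean_x  # line passes through (mean_x, mean_y)
    return slope, intercept

# Made-up data lying exactly on the line y = 1 + 2x
slope, intercept = fitted_line([1, 2, 3, 4], [3, 5, 7, 9])
```

With this made-up data, the slope and intercept recover the line the points were generated from (2 and 1, up to rounding error). With real data the points scatter around the line, and a statistics package would normally do this computation.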

Carrying out a regression analysis without a computer is very tedious. We will work through an example to learn how to read the relevant information from the regression output of a statistics software package.



Last modified: April 16, 1996

Bret Larget, larget@mathcs.duq.edu