A least-squares regression line models the linear relationship between two quantitative variables and is used to estimate the value of the response variable for a given value of the explanatory variable. Four requirements must be satisfied before a least-squares regression line can be considered valid:
- The two variables are quantitative.
- A linear association exists between the two variables.
- There are no outliers.
- The residuals have roughly equal spread across the range of the explanatory variable (checked with a residual plot).
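A least-squares regression line is of the form $\hat{y} = a + bx$, with $b = r\,\frac{s_y}{s_x}$ and $a = \bar{y} - b\bar{x}$, where $\bar{x}$ and $\bar{y}$ are the means of $x$ and $y$, respectively, $s_x$ and $s_y$ are the standard deviations of $x$ and $y$, respectively, and $r$ is the correlation coefficient between $x$ and $y$. For more on calculating a least-squares regression line, visit this website.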
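As a minimal sketch of the calculation, the Python snippet below computes the intercept and slope from the means, standard deviations, and correlation coefficient, assuming the standard formulas $b = r\,s_y/s_x$ and $a = \bar{y} - b\bar{x}$. The function name `least_squares_line` and the data values are hypothetical and chosen only for illustration.

```python
from statistics import mean, stdev


def least_squares_line(xs, ys):
    """Return (a, b) for the line y_hat = a + b * x.

    Hypothetical helper for illustration; it assumes xs and ys have the
    same length and that xs is not constant.
    """
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)  # sample standard deviations

    # Sample correlation coefficient r.
    r = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / ((n - 1) * s_x * s_y)

    b = r * s_y / s_x        # slope
    a = y_bar - b * x_bar    # intercept
    return a, b


# Made-up example data (not from any real study).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = least_squares_line(xs, ys)
print(f"y_hat = {a:.2f} + {b:.2f} x")
```

The intercept formula forces the line through the point $(\bar{x}, \bar{y})$, so evaluating `a + b * mean(xs)` should return `mean(ys)` up to rounding.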
It is important to verify that the fourth requirement for a least-squares regression line is satisfied. If it is not satisfied but a regression line is calculated anyway, the fitted line can misrepresent the relationship between the variables and lead to an inaccurate conclusion about their correlation. When a least-squares regression line is not appropriate, there are other options for modeling the data. The article “Misuse of correlation and regression in three medical journals” covers real-world examples of ways least-squares regression lines are misused.
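One way to look at the fourth requirement is to compute the residuals, $y - \hat{y}$, and see whether their spread changes across the range of $x$. The sketch below fits a line to made-up, deliberately fan-shaped data and compares the residual spread in the lower and upper halves of the $x$ values. The split-in-half comparison is only a rough illustration and an assumption of this example, not a formal test, and it does not replace inspecting a residual plot.

```python
from statistics import mean, stdev

# Made-up data whose scatter grows with x (not from any real study).
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [2.1, 3.9, 6.2, 7.8, 9.0, 13.5, 12.0, 18.0]

# Fit the least-squares line y_hat = a + b * x.
x_bar, y_bar = mean(xs), mean(ys)
b = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
    (x - x_bar) ** 2 for x in xs
)
a = y_bar - b * x_bar

# Residual = observed value minus predicted value.
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]

# Rough comparison: residual spread for the smaller vs. larger x values.
# xs is already sorted here, so the list can simply be split in half.
half = len(xs) // 2
spread_low = stdev(residuals[:half])
spread_high = stdev(residuals[half:])

for x, res in zip(xs, residuals):
    print(f"x = {x}: residual = {res:+.3f}")
print(f"spread (low x) = {spread_low:.3f}, spread (high x) = {spread_high:.3f}")
```

Plotting `residuals` against `xs` gives the residual plot itself. A clearly larger spread at one end, as in this made-up data, suggests the equal-spread requirement is not met.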