# Coefficient of Determination

The coefficient of determination, usually written $R^2$, measures how well a regression models a data set. If a regression on your data set has an $R^2$ value of 0.95, then the regression explains 95% of the variation in the data, which indicates a very close fit. If a different data set gives an $R^2$ value of 0.5, then only 50% of the variation is explained by the regression and the other half is left unaccounted for, which is not ideal.

There are three main ways of calculating $R^2$:

1. Calculate the square of the correlation coefficient ($r$). Since $-1 \leq r \leq 1$, it follows that $0 \leq R^2 \leq 1$.
2. Calculate the ratio of the regression sum of squares to the total sum of squares: $R^2=\frac{\sum (\hat{y}-\bar{y})^2}{\sum (y-\bar{y})^2}$.
3. Subtract the unexplained variation from the total variation, which is always 1 (or 100%): $R^2=1-\frac{\sum(\hat{y}-y)^2}{\sum(y-\bar{y})^2}$. Here the ratio is the residual sum of squares divided by the total sum of squares. This is the unexplained fraction of the variation, also known as the coefficient of non-determination.
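The three calculations can be checked against each other with a short sketch. It assumes a simple least-squares line with an intercept, which is the setting in which all three give the same value. The function names (`fit_line`, `r_squared_three_ways`) and the sample data are made up for illustration and do not come from any library.

```python
from math import sqrt


def fit_line(x, y):
    """Least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b


def r_squared_three_ways(x, y):
    """Return R^2 computed by each of the three methods listed above."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n

    a, b = fit_line(x, y)
    y_hat = [a + b * xi for xi in x]  # fitted values

    ss_total = sum((yi - y_bar) ** 2 for yi in y)               # total sum of squares
    ss_reg = sum((yh - y_bar) ** 2 for yh in y_hat)             # regression sum of squares
    ss_res = sum((yh - yi) ** 2 for yh, yi in zip(y_hat, y))    # residual sum of squares

    # Method 1: square of the correlation coefficient r
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    r = sxy / sqrt(sxx * ss_total)
    method_1 = r ** 2

    # Method 2: explained variation over total variation
    method_2 = ss_reg / ss_total

    # Method 3: one minus the unexplained fraction
    method_3 = 1 - ss_res / ss_total

    return method_1, method_2, method_3


# Illustrative data only: roughly y = 2x with a little noise
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

m1, m2, m3 = r_squared_three_ways(x, y)
print(m1, m2, m3)
```

The three printed values should agree up to floating-point rounding. For models fitted some other way (for example, a line forced through the origin), the three formulas are not guaranteed to match.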

For a very basic definition, visit this web site for more information on $R^2$. This web site gives a very good, easy-to-understand definition and then goes into more depth than the first site; it is the best site for those who are new to statistics but want to develop a relatively deep understanding. Of course, the Wikipedia page is usually a first stop when learning a new topic; however, I found the article a bit confusing as a first exposure to this idea, though it is very expansive and covers much of the subject.