The coefficient of determination is often referred to as $R^2$. Simply put, it is a measure of how well a regression models a data set. If a data set has an $R^2$ value of 0.95, the regression explains 95% of the variation in the data, which is excellent. If a different data set gives an $R^2$ value of 0.5, only 50% of the variation is explained by the regression, which is not ideal.
There are three main ways of calculating $R^2$:
- Calculate the square of the correlation coefficient ($r$): $R^2 = r^2$. This shortcut applies to a simple linear regression with one predictor.
- Calculate the ratio of the regression sum of squares to the total sum of squares: $R^2 = \frac{SS_{reg}}{SS_{tot}}$.
- The total variation is always 1 (or 100%), so if you are given the unexplained variation you can calculate $R^2$ from the equation $R^2 = 1 - \frac{SS_{res}}{SS_{tot}}$. Here 1 is the total variation, and the ratio of the residual sum of squares to the total sum of squares is the unexplained variation, also known as the coefficient of non-determination.
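To see that the three approaches above agree, here is a minimal Python sketch for a simple least-squares line fit. The function name `r_squared_three_ways` and the sample data are made up for illustration, and the agreement between the three values relies on the fit being an ordinary least-squares line with an intercept.

```python
def r_squared_three_ways(x, y):
    """Return R^2 computed three ways for a least-squares line fit of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n

    # Sums of squares and cross-products about the means.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

    # Ordinary least-squares line: fitted = intercept + slope * x.
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    fitted = [intercept + slope * xi for xi in x]

    ss_tot = syy                                                 # total variation
    ss_reg = sum((fi - mean_y) ** 2 for fi in fitted)            # explained variation
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))    # unexplained variation

    r = sxy / (sxx * syy) ** 0.5          # correlation coefficient
    method_1 = r ** 2                     # square of r
    method_2 = ss_reg / ss_tot            # regression SS over total SS
    method_3 = 1 - ss_res / ss_tot        # 1 minus the unexplained fraction
    return method_1, method_2, method_3


# Illustrative data only (not taken from any real study).
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
print(r_squared_three_ways(x, y))
```

All three numbers printed should match up to floating-point rounding. For other kinds of models (for example, a line forced through the origin), the three formulas can give different answers, so treat this as a sketch for the simple case only.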
For a very basic definition, visit this web site for more information on $R^2$. This web site gives a very good, easy-to-understand definition and then goes into more depth than the first site; it is the best site for those who are new to statistics but want to develop a relatively deep understanding. Of course, the Wikipedia page is usually a first stop when learning a new topic; however, I found the article a bit confusing for a first exposure to this idea, though it is very expansive and covers much of the subject.