The coefficient of determination, denoted R², measures the strength of a given correlation. For a least-squares linear regression, the value falls between 0 and 1, with a larger number representing a stronger correlation. There are three ways to calculate the coefficient of determination, and in general they are not guaranteed to produce the same value. For linear regressions, however, all three agree.
Given the correlation coefficient r, the coefficient of determination is simply r². Since r falls between -1 and 1, its square R² must fall between 0 and 1.
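As a minimal sketch of this first method, the snippet below computes r by hand and squares it. The data values and the helper name `pearson_r` are made up for illustration; they do not come from any particular library or dataset.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and the two sums of squared deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical sample data, chosen only for illustration.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

r = pearson_r(xs, ys)
r_squared = r ** 2

# Flipping the sign of y flips the sign of r but leaves its square unchanged.
r_flipped = pearson_r(xs, [-y for y in ys])

print(f"r = {r:.4f}, R^2 = {r_squared:.4f}")
print(f"flipped r = {r_flipped:.4f}, R^2 = {r_flipped ** 2:.4f}")
```

The second print line shows why the sign of r is lost: a strong negative correlation and a strong positive one give the same R².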
As an alternative to squaring the correlation coefficient, R² can be calculated as the ratio of the explained variance to the total variance, or as one minus the ratio of the unexplained variance to the total variance. (The explained variance is the sum of the squared differences between the predicted values and the mean of the true values. The unexplained variance is the sum of the squared differences between the true values and the predicted values. The total variance is the sum of the squared differences between the true values and their mean. For a more intuitive definition, Wikipedia’s discussion on the subject features symbolic and graphical representations to aid the reader.)
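In symbols, the verbal definitions above can be written as follows. The notation is an assumption made here for illustration: y_i are the true values, \hat{y}_i the predicted values, \bar{y} the mean of the true values, and the SS names are labels for the total, explained, and unexplained quantities.

```latex
\begin{align*}
SS_{\text{total}}       &= \sum_i (y_i - \bar{y})^2 \\
SS_{\text{explained}}   &= \sum_i (\hat{y}_i - \bar{y})^2 \\
SS_{\text{unexplained}} &= \sum_i (y_i - \hat{y}_i)^2 \\[4pt]
R^2 &= \frac{SS_{\text{explained}}}{SS_{\text{total}}}
\qquad\text{or}\qquad
R^2 = 1 - \frac{SS_{\text{unexplained}}}{SS_{\text{total}}}
\end{align*}
```

The two expressions for R² are equal exactly when SS_explained + SS_unexplained = SS_total, which holds for a least-squares linear regression with an intercept.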
Since the explained and unexplained variances add up to the total variance, R² as defined above is the fraction of the total that is explained, which is why it ranges from 0 to 1. Of the total variance, a certain portion is “explained” by the regression, i.e. the variance of the predicted values about the mean. The remainder is “unexplained” and unwanted, as it is attributed to the inaccuracy of the regression itself, i.e. the deviation of the predicted values from the true values. As the explained variance gets closer to the total variance, R² approaches 1 and the regression becomes a better fit. This discussion offers an illustration of this concept.
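To make the decomposition concrete, here is a small sketch that fits a least-squares line by hand and computes R² all three ways. The data and variable names are invented for illustration, and the fit assumes a simple one-variable regression with an intercept.

```python
import math

# Hypothetical sample data, chosen only for illustration.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.2, 4.1, 5.8, 8.3, 9.6]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Centered sums shared by the fit and the correlation coefficient.
sxx = sum((x - mean_x) ** 2 for x in xs)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

# Least-squares line: y = intercept + slope * x.
slope = sxy / sxx
intercept = mean_y - slope * mean_x
predicted = [intercept + slope * x for x in xs]

# Total, explained, and unexplained sums of squares.
ss_total = sum((y - mean_y) ** 2 for y in ys)
ss_explained = sum((p - mean_y) ** 2 for p in predicted)
ss_unexplained = sum((y - p) ** 2 for y, p in zip(ys, predicted))

# Three routes to the coefficient of determination.
r = sxy / math.sqrt(sxx * ss_total)
r2_from_r = r ** 2
r2_from_ratio = ss_explained / ss_total
r2_from_complement = 1 - ss_unexplained / ss_total

print(f"explained + unexplained = {ss_explained + ss_unexplained:.6f}")
print(f"total                   = {ss_total:.6f}")
print(f"R^2 via r^2             = {r2_from_r:.6f}")
print(f"R^2 via ratio           = {r2_from_ratio:.6f}")
print(f"R^2 via complement      = {r2_from_complement:.6f}")
```

Up to floating-point rounding, the first two printed values match and so do the last three. Replacing the fitted line with an arbitrary one (for example, a hand-picked slope and intercept) generally breaks the decomposition, and the three values of R² then no longer agree.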
For a complete, step-by-step breakdown of all that goes into calculating the coefficient of determination, this YouTube video is a great explanation (albeit very slow-paced).