Author Archives: Sean Raleigh

A Bayesian perspective on interpreting statistical significance

This site illustrates why a P-value cannot be interpreted in a vacuum. Suppose a hypothesis test results in a P-value that is statistically significant at the 5% level. Even if the null were true, we would still see statistically significant results by chance alone about 5% of the time. But this is relatively infrequent. So if we see a small P-value, we hope that the null […]
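
To make the "not in a vacuum" point concrete, here is a minimal sketch of the Bayes' rule calculation behind it. It is not taken from the linked site. The function name `prob_null_given_significant`, the prior probabilities, and the power of 0.8 are all assumptions chosen for illustration; only the 5% significance level comes from the text above.

```python
def prob_null_given_significant(prior_null, alpha, power):
    """Return P(null is true | result is significant) using Bayes' rule.

    prior_null: assumed probability that the null is true before testing
    alpha:      chance of a significant result when the null is true
    power:      assumed chance of a significant result when the null is false
    """
    p_sig_and_null = prior_null * alpha        # false positives
    p_sig_and_alt = (1 - prior_null) * power   # true positives
    return p_sig_and_null / (p_sig_and_null + p_sig_and_alt)


# Same significance level, very different conclusions depending on the prior.
for prior_null in (0.5, 0.9, 0.99):
    p = prob_null_given_significant(prior_null, alpha=0.05, power=0.8)
    print(f"prior P(null) = {prior_null:.2f} -> P(null | significant) = {p:.3f}")
```

Under these assumed inputs, the same significant result leaves the null far more plausible when it was very likely to begin with, which is why the P-value alone does not settle the question.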

Anscombe’s Quartet

Here’s an interesting example of why you should never just look at an r value and a regression model without looking at the scatterplot first. Anscombe’s Quartet consists of four data sets. Each has the same sample size and nearly the same means and standard deviations for the x and y variables, nearly the same r value, and nearly the same regression coefficients. But […]
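
The matching summary statistics can be checked with a short sketch like the one below. The data values are the quartet as it is commonly reproduced (for example in R's built-in `anscombe` data set); they were typed in here rather than taken from this post, so check them against a trusted source before relying on them. The helper name `summarize` is an assumption for illustration.

```python
from math import sqrt

x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
quartet = {
    "I":   (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II":  (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV":  ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
            [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def summarize(x, y):
    """Return means, sample SDs, r, and least-squares slope and intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return {
        "mean_x": mean_x,
        "mean_y": mean_y,
        "sd_x": sqrt(sxx / (n - 1)),
        "sd_y": sqrt(syy / (n - 1)),
        "r": sxy / sqrt(sxx * syy),
        "slope": slope,
        "intercept": mean_y - slope * mean_x,
    }


for name, (x, y) in quartet.items():
    s = summarize(x, y)
    print(name, {key: round(value, 2) for key, value in s.items()})
```

The numbers agree to about two decimal places across all four sets, yet none of them reveals what the scatterplots show, so plotting remains the essential step.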

Understanding assumptions and conditions

David Bock (one of the three coauthors of the textbook Intro Stats that we use in Math 150) has written an article about assumptions and conditions. His audience is AP Stats teachers, but we all can learn from what he has to say.

An interesting side-by-side boxplot

Here’s an interesting example of a side-by-side boxplot I came across today. It shows data on the length of Ph.D. dissertations for a variety of fields of study. It’s a little crowded, so it’s not the most elegant boxplot. Also, the coloration doesn’t really add much since the fields are simply listed in reverse alphabetical […]

Histograms

A histogram is a graphical display for a single quantitative variable. The range of the quantitative variable is divided into intervals called bins, which are plotted on the x-axis. Then either the frequency (number) or relative frequency (percentage) of data values falling within each bin is represented by the height of the bar sitting over […]
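
As a rough sketch of the binning step described above, the following counts how many values fall in each bin and converts the counts to relative frequencies. The data, the bin boundaries, and the function name `histogram_counts` are made up for illustration; bins are treated as including their left endpoint and excluding their right endpoint, which is one common convention but not the only one.

```python
def histogram_counts(data, bin_start, bin_width, num_bins):
    """Count the values falling in each bin [left, left + bin_width)."""
    counts = [0] * num_bins
    for value in data:
        index = int((value - bin_start) // bin_width)
        if 0 <= index < num_bins:  # values outside the binned range are skipped
            counts[index] += 1
    return counts


# Hypothetical exam scores, binned as 50-59, 60-69, ..., 90-99.
scores = [52, 67, 71, 74, 78, 80, 83, 85, 88, 91, 95, 99]
counts = histogram_counts(scores, bin_start=50, bin_width=10, num_bins=5)
relative = [count / len(scores) for count in counts]

# Print a text histogram: one row per bin, bar length equal to the frequency.
for i, (count, share) in enumerate(zip(counts, relative)):
    left = 50 + 10 * i
    print(f"{left}-{left + 9}: {'#' * count} ({count}, {share:.0%})")
```

Whether the bars show frequencies or relative frequencies, the shape of the display is the same; only the scale on the y-axis changes.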