# Scatterplots

A scatter plot is a graph that shows the relationship between two quantitative variables. Each data point is drawn as a dot on a Cartesian plane, positioned by its values for the two variables. For example, if I were to create a scatter plot of height against weight, each of my data points would sit at (weight value, height value) on the plane. The x-axis of a scatter plot generally holds the independent variable, while the y-axis holds the dependent variable.
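To make the (weight value, height value) idea concrete, here is a minimal sketch in Python. The numbers are made up for illustration, the names `weights_kg`, `heights_cm`, and `draw_scatter` are my own, and the plotting step assumes the matplotlib library is installed.

```python
# Made-up sample data: position i in each list describes the same person.
weights_kg = [55, 62, 70, 78, 85]
heights_cm = [160, 165, 172, 180, 183]

# Each data point is an (x, y) pair: (weight value, height value).
points = list(zip(weights_kg, heights_cm))


def draw_scatter(points):
    """Draw one dot per (x, y) pair and return the figure.

    Assumes matplotlib is installed; it is imported here so the rest of
    the example still runs without it.
    """
    import matplotlib.pyplot as plt

    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    fig, ax = plt.subplots()
    ax.scatter(xs, ys)              # one dot per data point
    ax.set_xlabel("Weight (kg)")    # x-axis: treated as independent here
    ax.set_ylabel("Height (cm)")    # y-axis: treated as dependent here
    ax.set_title("Height vs. weight")
    return fig
```

Calling `draw_scatter(points).savefig("scatter.png")` should write the plot to a file; which variable goes on which axis is a choice you make based on what you treat as independent.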

Scatter plots can be used to explore the correlation between two variables. The direction of a correlation can be read from the line of best fit through the data. There are three categories of correlation to watch out for:

• No correlation: the points show no linear trend, so the line of best fit is roughly flat and models the data poorly.
• Positive correlation: the line of best fit has a positive slope.
• Negative correlation: the line of best fit has a negative slope.
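The three categories can be checked numerically. The sketch below computes the slope of the least-squares line of best fit and the Pearson correlation coefficient r using only the standard library. The function names and the `threshold` cutoff of 0.3 are my own assumptions for illustration; there is no single agreed cutoff for "no correlation".

```python
from math import sqrt


def fit_and_correlate(xs, ys):
    """Return (slope of the least-squares line, Pearson r) for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared deviations and of cross-deviations from the means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    r = sxy / sqrt(sxx * syy)
    return slope, r


def describe(r, threshold=0.3):
    """Label r with one of the three categories (threshold is an arbitrary choice)."""
    if abs(r) < threshold:
        return "no correlation"
    return "positive correlation" if r > 0 else "negative correlation"


xs = [1, 2, 3, 4, 5]
slope_up, r_up = fit_and_correlate(xs, [2, 4, 6, 8, 10])
slope_down, r_down = fit_and_correlate(xs, [10, 8, 6, 4, 2])
slope_flat, r_flat = fit_and_correlate(xs, [2, 1, 3, 1, 2])

print(describe(r_up), describe(r_down), describe(r_flat))
```

Note that the slope and r always share the same sign, which is why the sign of the slope is enough to tell positive from negative correlation.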

While scatter plots are useful for spotting correlations between variables, it is imperative that we do not mistake correlation for causation. Even if a causal relationship does exist between the two variables, a correlation alone cannot prove it.
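One way to see why is to simulate two variables that never influence each other but are both driven by a third. The sketch below is entirely hypothetical: the variable names, the coefficients, and the noise levels are invented, and the data is randomly generated rather than measured.

```python
import random
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sqrt(sxx * syy)


rng = random.Random(0)  # fixed seed so the run is repeatable

# Hidden third variable: daily temperature (simulated).
temperature = [rng.uniform(15, 35) for _ in range(200)]

# Both quantities depend on temperature plus independent noise.
# Neither one appears in the formula for the other.
ice_cream_sales = [10 * t + rng.gauss(0, 20) for t in temperature]
sunburn_cases = [2 * t + rng.gauss(0, 4) for t in temperature]

r = pearson_r(ice_cream_sales, sunburn_cases)
print(f"r between ice cream sales and sunburn cases: {r:.2f}")
```

A scatter plot of these two simulated variables would show a clear upward trend, yet by construction neither one causes the other.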

For a simple explanation of scatter plots and a simple example of how they are used, visit this site. Some people may confuse scatter plots with line graphs; this website describes both and the differences between the two (be wary: it gives incorrect information on correlation, though that is not the subject of this post). For even more information about scatter plots, you should definitely visit here.