Displaying quantitative data
Data can be represented in multiple ways, such as tables, bar graphs, histograms, and frequency plots. These methods give different perspectives on the same information, and each helps answer different questions about the data.
Frequency tables and dot plots
Dot plots and frequency tables are two handy tools for representing data. A frequency table shows how often each value appears, while a dot plot shows the same information visually.
The frequency table organizes the data nicely. At a glance, we can tell that the most common number of goals (the mode) is 3.
The dot plot is very similar to the frequency table, but instead of using numbers to show frequency, it uses dots. Each dot represents a data point.
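To make the link between the two concrete, here is a minimal Python sketch that builds a frequency table and a text dot plot from the same data. The `goals` list is made-up data for illustration (chosen so that the mode is 3), and the function names `frequency_table` and `dot_plot` are my own, not from any library.

```python
from collections import Counter

# Hypothetical data: goals scored in each of 10 matches (not from the original source).
goals = [0, 1, 1, 2, 3, 3, 3, 4, 4, 5]


def frequency_table(values):
    """Map each distinct value to how often it appears, ordered by value."""
    counts = Counter(values)
    return {value: counts[value] for value in sorted(counts)}


def dot_plot(values):
    """Text dot plot for integer data: one row per value, one '*' per data point."""
    counts = Counter(values)
    rows = []
    for value in range(min(values), max(values) + 1):
        # Values that never occur still get a row, so gaps stay visible.
        rows.append((f"{value} | " + "*" * counts[value]).rstrip())
    return "\n".join(rows)


table = frequency_table(goals)
mode = max(table, key=table.get)  # the value with the highest frequency

print(table)
print("mode:", mode)
print(dot_plot(goals))
```

The tallest row of the dot plot and the largest count in the table point to the same value, which is why either one can be used to read off the mode.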
Histogram
A histogram displays numerical data by grouping data into “bins” of equal width. Each bin is plotted as a bar whose height corresponds to how many data points are in that bin. Bins are also sometimes called “intervals”, “classes”, or “buckets”.
A histogram shows the shape and spread of continuous sample data, and it is one of the most widely used ways of representing statistical data.
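As a rough sketch of the binning idea, the Python snippet below counts how many values fall into each equal-width bin and prints one text bar per bin. The `heights` data, the bin width of 10, and the helper name `histogram_counts` are all assumptions made for illustration.

```python
def histogram_counts(values, start, width, num_bins):
    """Count the values in each equal-width bin [lo, lo + width), starting at `start`."""
    counts = [0] * num_bins
    for v in values:
        index = int((v - start) // width)
        if 0 <= index < num_bins:  # values outside the bins are ignored
            counts[index] += 1
    return counts


# Hypothetical sample: heights in cm (not from the original source).
heights = [150, 152, 158, 161, 163, 164, 167, 170, 171, 179]

counts = histogram_counts(heights, start=150, width=10, num_bins=3)

for i, count in enumerate(counts):
    lo = 150 + i * 10
    # The bar length plays the role of the bar height in a drawn histogram.
    print(f"[{lo}, {lo + 10}) " + "#" * count)
```

Changing `width` changes how the same data looks, so it is worth trying a few bin widths before reading too much into the shape.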
Stem and Leaf Plot
A stem and leaf plot displays data by breaking each data point into two parts: the “stem” and the “leaf.” Each stem represents a range of values (e.g. if the stem is 2, the corresponding leaves might be data points in the 20s). Each leaf typically represents the last digit of a data point (e.g. if the leaf is 6 next to a stem of 2, it could represent a data point whose value is 26).
For example, take the dataset 14, 18, 20, 22, 27, 31, 50.
Its stem and leaf plot would be:
1 | 4 8
2 | 0 2 7
3 | 1
4 |
5 | 0
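The same plot can be produced with a few lines of Python. This is a minimal sketch that assumes non-negative whole numbers and uses the tens as the stem and the last digit as the leaf. The function name `stem_and_leaf` is my own.

```python
def stem_and_leaf(values):
    """Build stem-and-leaf rows for non-negative integers (stem = tens, leaf = last digit)."""
    leaves = {}
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        leaves.setdefault(stem, []).append(leaf)

    rows = []
    # Walk every stem from smallest to largest so empty stems still appear.
    for stem in range(min(leaves), max(leaves) + 1):
        row = f"{stem} | " + " ".join(str(leaf) for leaf in leaves.get(stem, []))
        rows.append(row.rstrip())
    return rows


data = [14, 18, 20, 22, 27, 31, 50]
print("\n".join(stem_and_leaf(data)))
```

Keeping the empty stem (4) in the output matters: it shows the gap between 31 and 50 instead of hiding it.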
Shapes of distributions
Some distributions are symmetrical, with data evenly distributed about the mean. Other distributions are “skewed,” with data concentrated to the left or right of the mean. Skewed distributions are often described by their tails: a distribution with a long tail on the right is called right-skewed, and one with a long tail on the left is called left-skewed.
Below are some terms that we’ll encounter many times.
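One quick, informal way to guess the direction of skew is to compare the mean with the median, since a long tail tends to pull the mean toward it. The sketch below uses that rule of thumb. It is only a heuristic (it can be wrong for unusual distributions), the sample lists are made up, and `skew_direction` is a name I chose for illustration.

```python
from statistics import mean, median


def skew_direction(values):
    """Rough guess at skew from mean vs. median (a rule of thumb, not a formal test)."""
    m, med = mean(values), median(values)
    if m > med:
        return "right-skewed"   # long tail on the right pulls the mean up
    if m < med:
        return "left-skewed"    # long tail on the left pulls the mean down
    return "roughly symmetric"


# Hypothetical samples (not from the original source).
symmetric = [1, 2, 2, 3, 3, 3, 4, 4, 5]
right_tail = [1, 1, 2, 2, 3, 10]
left_tail = [1, 8, 9, 9, 10, 10]

for name, sample in [("symmetric", symmetric), ("right_tail", right_tail), ("left_tail", left_tail)]:
    print(name, "->", skew_direction(sample))
```

Looking at a histogram or dot plot of the data is still the more reliable check. The comparison above is just a fast first look.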
- Cluster, gap, peak, outlier
- Center, spread/variability
A quick comparison of distributions
Dot plots and box plots are useful for finding the median, while histograms are great for showing the number of values within a specific range. A line graph is a way to visually represent data, especially data that changes over time. In a line graph it is important to keep the scale accurate (for instance, by starting the y-axis at zero).
Disclaimer: Like most of my posts, this content is intended solely for educational purposes and was created primarily for my personal reference. At times, I may rephrase original texts, and in some cases, I include materials such as graphs, equations, and datasets directly from their original sources.
I typically reference a variety of sources and update my posts whenever new or related information becomes available. For this particular post, the primary source was Khan Academy’s Statistics and Probability series.