1. Statistical Questions
We start by looking at what statistical and non-statistical questions are.
- Statistics is about collecting, presenting, and analyzing data.
- Variability, a key concept in statistics, refers to how much data points differ from each other.
- Statistical questions require collecting data with variability to answer.
For example, asking about the average number of cars in a parking lot on Monday mornings is a statistical question.
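As a rough illustration of variability, the sketch below uses made-up car counts (hypothetical numbers, not from any real parking lot) and Python's standard `statistics` module. A single measurement has no spread, while repeated Monday counts do.

```python
import statistics

# Hypothetical data: cars counted on eight Monday mornings (made-up values).
monday_counts = [42, 38, 51, 45, 40, 47, 39, 50]

# A single fixed fact, such as the weight of one grapefruit, is one data point.
grapefruit_weight_g = [310]

def spread(values):
    """Population standard deviation; 0.0 when there is only one value."""
    return statistics.pstdev(values)

average_cars = statistics.mean(monday_counts)

print(f"average cars: {average_cars}")
print(f"spread of Monday counts: {spread(monday_counts):.2f}")
print(f"spread of one grapefruit: {spread(grapefruit_weight_g):.2f}")
```

The nonzero spread of the Monday counts is what makes "the average number of cars" worth collecting data for.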
Are the questions below statistical questions?
1 How much does my grapefruit weigh?
2 What is the average number of cars in a parking lot on Monday mornings?
3 Am I hungry?
4 How many teeth does my mother have?
5 How much time do the members of my family spend eating per year?
6 How many times have I watched Star Wars?
Answers:
1 That would be just one data point. N
2 We could collect multiple data points to answer. Y
3 That’s not data with variability. N
4 That would be just one data point. N
5 We would collect data with variability. Y
6 No variability here. N
Are the questions below statistical questions?
1 How old are you?
2 How old are the people who have watched this video in 2013?
3 Do dogs run faster than cats?
4 Do wolves weigh more than dogs?
5 Does your dog weigh more than that wolf?
6 Does it rain more in Seattle than Singapore?
7 What was the difference in rainfall between Singapore and Seattle in 2013?
8 In general, will I use less gas driving at 55 mph than 70 mph?
9 Do English professors get paid less than math professors?
10 Does the most highly paid English professor at Harvard get paid more than the most highly paid math professor at MIT in 2013?
Answers:
1N 2Y 3Y 4Y 5N 6Y 7N* 8Y 9Y 10N
* We assume we already have both rainfall totals and just subtract one from the other.
2. Types of statistical studies: Experimental vs. Observational
Sample Study
We are trying to estimate the value of a parameter for a population.
We randomly sample from our population and compute, for instance, the average daily time spent on a computer for our sample. That average is our estimate of the population parameter.
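A minimal sketch of this idea, assuming a simulated population: the 10,000 values, the mean of 4 hours, and the sample size of 200 are all invented for illustration. In a real study the population mean is unknown, which is why we estimate it.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical population: daily computer time in hours for 10,000 people.
population = [max(0.0, random.gauss(4.0, 1.5)) for _ in range(10_000)]

# The parameter we normally cannot observe directly.
population_mean = statistics.mean(population)

# Draw a simple random sample and compute the statistic that estimates it.
sample = random.sample(population, 200)
sample_mean = statistics.mean(sample)

print(f"population mean (parameter): {population_mean:.2f}")
print(f"sample mean (estimate):      {sample_mean:.2f}")
```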
Observational Study
We’re not trying to estimate a parameter. We’re trying to understand how two variables in a population might move together or not.
We are curious about how the daily time spent on a computer relates to people’s blood pressure.
We can check for association (correlation), but not causation, because there could be a confounding variable (also called a lurking variable): an underlying variable that drives both of the variables we observe.
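To see how a confounding variable can produce a correlation without causation, here is a small simulation. Everything in it is an assumption made for illustration: the confounder ("inactivity"), the coefficients, and the noise levels are invented. Computer time never enters the blood pressure formula, yet the two still correlate.

```python
import random

random.seed(1)

def pearson(xs, ys):
    """Pearson correlation coefficient computed from its definition."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical confounder: how sedentary each person's lifestyle is (0 to 1).
inactivity = [random.random() for _ in range(2_000)]

# Both observed variables depend on the confounder plus independent noise.
# Computer time never appears in the blood-pressure formula.
computer_hours = [1 + 6 * a + random.gauss(0, 1) for a in inactivity]
blood_pressure = [110 + 30 * a + random.gauss(0, 5) for a in inactivity]

r = pearson(computer_hours, blood_pressure)
print(f"correlation between computer time and blood pressure: {r:.2f}")
```

An observational study would see this strong correlation, but cutting computer time in this simulated world would not change anyone's blood pressure.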
Experiment
It is all about trying to establish causality.
To keep confounding variables from introducing error into our experiment, we take, say, 100 people and randomly assign them to two groups. Of course, we might not know all of the confounding variables out there, but the main idea behind random assignment is that neither group should differ significantly from the other. One group will be the control group and the other the treatment group.
We might ask the control group to spend at most 30 minutes a day in front of a computer, while the treatment group spends 2 hours. Before the experiment, the two groups' blood pressures should be close to each other. After the experiment, we check their blood pressure again.
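Random assignment itself is simple to sketch. The helper name `randomly_assign` and the participant IDs below are hypothetical; the point is only that a shuffle decides who lands in which group, not the researcher.

```python
import random

def randomly_assign(participants, seed=None):
    """Shuffle the participants and split them into two groups of (nearly) equal size."""
    rng = random.Random(seed)
    shuffled = list(participants)  # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Hypothetical participant IDs 1..100.
participants = list(range(1, 101))
control, treatment = randomly_assign(participants, seed=42)

print(len(control), len(treatment))
```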
Let’s wrap up:
We do studies to gather information and draw conclusions. The type of conclusion we draw depends on the study method used:
- In an observational study, we measure or survey members of a sample without trying to affect them.
- In a controlled experiment, we assign people or things to groups and apply some treatment to one of the groups, while the other group does not receive the treatment.
3. Sampling and observational studies
To draw a valid conclusion, we need a representative sample that is not skewed. Random sampling is one way of getting one.
Techniques for random sampling
A couple of common techniques for generating a simple random sample are:
- Random number generator
- Random digit table
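With a random number generator, a simple random sample can be drawn roughly like this. The school size of 800 and the helper name `simple_random_sample` are assumptions for the example; `random.sample` picks without replacement, so no student is chosen twice.

```python
import random

def simple_random_sample(population_size, sample_size, seed=None):
    """Pick sample_size distinct ID numbers from 1..population_size."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(1, population_size + 1), sample_size))

# Hypothetical school with 800 students numbered 1..800.
chosen_ids = simple_random_sample(population_size=800, sample_size=100, seed=7)
print(chosen_ids[:10])
```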
Stratified random sample
However, even when a simple random sample is truly random, there is some probability that it is not indicative of the entire population. To mitigate that, there are other techniques, like a stratified random sample.
For instance, we can stratify a school's population into freshmen, sophomores, juniors, and seniors. Instead of just sampling 100 out of the entire pool, we sample 25 from each of those groups.
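The school example can be sketched as follows. The class sizes and student labels are made up, and `stratified_sample` is a hypothetical helper: it draws a separate simple random sample of 25 from each class year, so every stratum is guaranteed to be represented.

```python
import random

def stratified_sample(strata, per_stratum, seed=None):
    """Draw a simple random sample of per_stratum members from each stratum."""
    rng = random.Random(seed)
    return {name: rng.sample(members, per_stratum)
            for name, members in strata.items()}

# Hypothetical school: made-up student labels, with unequal class sizes.
strata = {
    "freshmen":   [f"fr{i}" for i in range(300)],
    "sophomores": [f"so{i}" for i in range(250)],
    "juniors":    [f"ju{i}" for i in range(150)],
    "seniors":    [f"se{i}" for i in range(100)],
}

sample = stratified_sample(strata, per_stratum=25, seed=3)
print({name: len(chosen) for name, chosen in sample.items()})
```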
Clustered random sample
But still, this might not give a good representation of males and females. So we can use a technique called a clustered random sample. For this, we randomly sample classrooms, each of which has a close or maybe exact balance of males and females, and survey everyone in the selected classrooms. That way we know we're going to get good representation. We are still sampling, but we are sampling whole clusters.
For instance, an airline company wants to survey its customers one day, so they randomly select 5 flights that day and survey every passenger on those flights.
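A sketch of the airline example, assuming an invented schedule of 40 flights with three passengers each (far fewer than a real flight, to keep the output small). The randomness is in which flights are chosen; within a chosen flight, everyone is surveyed.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly pick whole clusters and include every member of each one."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    surveyed = [person for name in chosen for person in clusters[name]]
    return chosen, surveyed

# Hypothetical schedule: 40 flights that day, 3 made-up passengers each.
flights = {f"FL{n:03d}": [f"FL{n:03d}-seat{s}" for s in range(1, 4)]
           for n in range(1, 41)}

chosen_flights, surveyed = cluster_sample(flights, n_clusters=5, seed=11)
print(chosen_flights)
print(len(surveyed))
```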
Systematic random sample
In a systematic random sample, members of the population are put in some order. A starting point is selected at random, and every nth member is selected to be in the sample.
For instance, a principal takes an alphabetized list of student names and picks a random starting point. Every 20th student is selected to take a survey.
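The principal's procedure might look like the sketch below. The roster of 400 names is made up, and choosing the random start within the first 20 names is one common variant that is assumed here.

```python
import random

def systematic_sample(ordered_members, step, seed=None):
    """Pick a random start within the first `step` members, then every step-th."""
    rng = random.Random(seed)
    start = rng.randrange(step)
    return ordered_members[start::step]

# Hypothetical alphabetized roster of 400 students.
roster = [f"student_{i:03d}" for i in range(400)]

surveyed = systematic_sample(roster, step=20, seed=5)
print(len(surveyed), surveyed[:3])
```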
Biased samples
To avoid bias in samples, first we need to get familiar with what kind of approaches could introduce biases. Some of them are listed below.
Voluntary response sample
- Asking students to voluntarily fill out a survey. (This has a good chance of introducing bias. Those who choose to fill out the survey might be skewed one way or the other.)
- A podcaster asking her listeners to visit her website and participate in the poll. (That’s not random.)
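A small simulation can show why voluntary responses skew results. The satisfaction scores and response probabilities below are invented assumptions, not real survey behavior. They encode the idea that people with strong opinions, especially unhappy ones, are more likely to answer.

```python
import random
import statistics

random.seed(2)

# Hypothetical population: satisfaction scores from 1 (unhappy) to 5 (happy).
population = [random.randint(1, 5) for _ in range(20_000)]

# Assumed response behavior: people with extreme opinions (1 or 5) answer
# far more often than people in the middle, and unhappy people most of all.
response_probability = {1: 0.60, 2: 0.10, 3: 0.02, 4: 0.05, 5: 0.20}

volunteers = [score for score in population
              if random.random() < response_probability[score]]
random_sample = random.sample(population, len(volunteers))

print(f"population mean:        {statistics.mean(population):.2f}")
print(f"random sample mean:     {statistics.mean(random_sample):.2f}")
print(f"voluntary sample mean:  {statistics.mean(volunteers):.2f}")
```

Under these assumptions the random sample tracks the population closely, while the voluntary sample lands well below it, even though both samples have the same size.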
Convenience sample
- Sampling the first 100 students who show up at school, because it is convenient. (This approach can also be biased, because maybe they are the most diligent students, or maybe they have an early class.)
- A podcaster decides to poll the next 100 listeners who send her fan emails.
Undercoverage
- Asking 100 people from a phone book about internet privacy. (People who are not listed in a phone book are left out of the sample.)
Nonresponse
- When people refuse to participate.
Response
- When people systematically give inaccurate answers, for example because they do not want to tell the truth.
Biased wording
- Suggesting something in the survey phrasing, like “Do you consider yourself lucky to get a math education that very few other people in the world have access to?”
Causality vs Correlation
We also need to understand the difference between causality and correlation.
- Correlation is when two variables tend to move together, for example when two events tend to occur at the same time.
- Causality is when one event actually causes the other.
4. Introduction to experiment design
Let’s say that we are a drug company and we have come up with a medicine that we think will help folks with diabetes, and in particular will help reduce their hemoglobin A1c levels. Then:
Explanatory variable: whether or not one takes the pill
Response variable: the thing that is affected (here, the A1c level)
An experimental unit: who or what we are assigning to a treatment
The control group gets a placebo and the treatment group gets the medicine. If the participants don’t know which kind of pill they are getting, this is called a Blind Experiment. If we also don’t tell the people administering the experiment which pills they are giving to which group, it is a Double-Blind Experiment. And if the people analyzing the data don’t know which group is the control and which is the treatment group, we call it a Triple-Blind Experiment.
We should divide the groups randomly. But there is still a possibility that, by random chance, one group ends up with a disproportionate share of people with some characteristic. Random chance like this might give us a false negative or a false positive.
That’s why an important idea in experiments (in fact, in science generally) is that we should document the experiment well, so that other people can replicate it and hopefully get consistent results. Replication reinforces the idea that our results are actually true, and not just random or due to some bad administration of the experiment.
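To get a feel for how often chance alone can mislead a single experiment, the sketch below simulates many small experiments in which the pill has no effect at all. The group size, the threshold, and the normal distribution of A1c changes are all assumptions chosen for illustration.

```python
import random
import statistics

def chance_difference_rate(trials=2_000, group_size=10, threshold=0.5, seed=0):
    """Fraction of no-effect experiments whose group means differ by > threshold."""
    rng = random.Random(seed)
    large_gaps = 0
    for _ in range(trials):
        # Hypothetical A1c changes; the pill has NO effect in this simulation.
        people = [rng.gauss(0.0, 1.0) for _ in range(2 * group_size)]
        rng.shuffle(people)  # random assignment
        control, treatment = people[:group_size], people[group_size:]
        gap = abs(statistics.mean(treatment) - statistics.mean(control))
        if gap > threshold:
            large_gaps += 1
    return large_gaps / trials

rate = chance_difference_rate()
print(f"experiments showing a gap by chance alone: {rate:.0%}")
```

A noticeable share of these no-effect experiments still shows a gap between the groups, which is why one result is weak evidence and consistent replications are much stronger.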
Matched pairs experiment design
Due to the potential for imbalance in experimental groups, we might need to consider some other approaches too. A matched pairs design can help address this issue.
The process is like this: we randomly assign people to the control and treatment groups and measure the results; then we swap the groups and measure again. This way every person is measured both with and without the treatment.
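Because each person ends up measured under both conditions, the analysis can compare every person with themselves. The A1c readings below are invented, and `mean_paired_difference` is a hypothetical helper; this is only a sketch of the bookkeeping, not a full statistical test.

```python
import statistics

def mean_paired_difference(on_placebo, on_medicine):
    """Average of each person's (placebo - medicine) difference."""
    differences = [p - m for p, m in zip(on_placebo, on_medicine)]
    return statistics.mean(differences)

# Hypothetical A1c readings for six people, one value per person per period.
on_placebo  = [8.1, 7.6, 9.0, 8.4, 7.9, 8.8]
on_medicine = [7.5, 7.4, 8.1, 8.0, 7.7, 8.0]

reduction = mean_paired_difference(on_placebo, on_medicine)
print(f"average within-person reduction: {reduction:.2f}")
```

Comparing within each person removes differences between people (age, diet, and so on) from the comparison, which is the imbalance this design is meant to address.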
These are just different ways to approach it. As we construct experiments, we should consider what types of things are practical to do and have the best chance at giving us real, unbiased information.
5. Scope of inference: Random sampling vs. Random assignment
Hilary wants to determine if any relationship exists between Vitamin D and blood pressure.
Scenario 1
Hilary obtains a random sample of residents from her town. She surveys those residents on whether or not they consume Vitamin D and how much Vitamin D they get. She also measures their blood pressures.
Suppose Hilary finds that among the people sampled, those who consumed higher amounts of Vitamin D had significantly lower blood pressure than those who did not.
Conclusions
- Based on this study, we can safely say this result probably holds true for all residents in Hilary’s town.
Since Hilary took a random sample of residents from her town, it should be representative of the town as a whole. Her sample may not represent a larger population though.
- We can’t conclude that the difference is caused by the Vitamin D because there wasn’t a random assignment.
Scenario 2
Hilary recruits residents from her town who have physical exams scheduled in the next month with the local doctor’s office. She randomly assigns the volunteers to either a Vitamin D supplement pill or a placebo pill. Participants do not know which pill they are taking. They have their blood pressures measured before the study begins and at the end of the study.
Suppose Hilary finds that the group who took the Vitamin D supplements had a significant decrease in blood pressure, while the placebo group showed no significant change in blood pressure.
Conclusions
- We can only safely say this result holds true for the residents in Hilary’s study.
Since Hilary took volunteers who all have something in common (upcoming physical exams) and not a random sample of residents from her town, it may not be representative of the town as a whole: adults who don’t get physical exams aren’t represented in the sample.
- We can conclude that the difference is caused by the Vitamin D because of the randomized experiment design.
Note: In the real world, we can’t ethically take a random sample of people and make them participate in a study involving drugs. However, there are more advanced methods for controlling for this type of selection bias. When we rely on volunteers for testing new drugs and we see significant results, we need to be willing to assume that the volunteers are representative of the larger population. We can also repeat the study on a different group of volunteers to see if we get the same results.
Key idea: If a sample isn’t randomly selected, it may not be representative of the larger population.
Summary
| | Random sampling | Not random sampling |
|---|---|---|
| Random assignment | Can determine causal relationship in population. This design is relatively rare in the real world. | Can determine causal relationship in that sample only. This design is where most experiments would fit. |
| No random assignment | Can detect relationships in population, but cannot determine causality. This design is where many surveys and observational studies would fit. | Can detect relationships in that sample only, but cannot determine causality. This design is where many unscientific surveys and polls would fit. |
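The summary table can also be read as a tiny decision rule. The function below is a hypothetical restatement of the table, not a statistical procedure: random sampling decides who the result generalizes to, and random assignment decides whether a causal claim is justified.

```python
def scope_of_inference(random_sampling, random_assignment):
    """Return (who the result generalizes to, whether causation can be claimed)."""
    generalizes_to = "population" if random_sampling else "sample only"
    can_claim_causation = random_assignment
    return generalizes_to, can_claim_causation

# Scenario 1: random sample of residents, no random assignment.
print(scope_of_inference(random_sampling=True, random_assignment=False))

# Scenario 2: volunteers (not a random sample), randomly assigned to pills.
print(scope_of_inference(random_sampling=False, random_assignment=True))
```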
Disclaimer: Like most of my posts, this content is intended solely for educational purposes and was created primarily for my personal reference. At times, I may rephrase original texts, and in some cases, I include materials such as graphs, equations, and datasets directly from their original sources.
I typically reference a variety of sources and update my posts whenever new or related information becomes available. For this particular post, the primary source was Khan Academy’s Statistics and Probability series.