P-Value
I. What is a Hypothesis?
A statistical hypothesis is a statement, assumption, or claim about a population that we test through statistical analysis.
Examples:
- “The average height of students is 66 inches.”
- “This medicine reduces fever.”
- “This coin is fair.”
In statistics, we test claims using data.
II. The Null Hypothesis (H₀)
In any experiment, we start with the default assumption that nothing interesting is happening. This is the Null Hypothesis (H₀).
- The Claim: A new fertilizer makes plants grow taller.
- The Null Hypothesis (H₀): The fertilizer has no effect. Any difference in height is just due to random chance.
The Null Hypothesis (H₀) is the default assumption.
It represents:
- No effect
- No difference
- No change
- Status quo
It is what we assume to be true unless strong evidence suggests otherwise.
Example: The average weight loss after taking the supplement is ≤ 5 pounds per month (i.e., the supplement has no significant effect).
III. Alternative Hypothesis (Hₐ)
We also have the Alternative Hypothesis (Hₐ):
- It is the opposite claim, suggesting that the Null Hypothesis (H₀) is false.
- It is what you’re actually hoping to show.
Example: The fertilizer does make plants grow taller.
The Alternative Hypothesis is what we are trying to find evidence for.
It represents:
- Effect
- Difference
- Relationship
Example: The average weight loss after taking the supplement is > 5 pounds per month (i.e., the supplement is effective). This claim can be based on earlier tests or research, and we collect evidence to evaluate it.
There are three types (μ is the population parameter and μ₀ is the value stated in H₀):
- Two-tailed: Hₐ: μ ≠ μ₀
- Right-tailed: Hₐ: μ > μ₀
- Left-tailed: Hₐ: μ < μ₀
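The three forms differ only in which tail(s) of the null distribution count as "extreme". Here is a minimal sketch, assuming the test statistic is a Z-score that follows a standard normal distribution under H₀. The function name `p_value_from_z`, the tail labels, and the value 1.96 are illustrative and not part of the original notes.

```python
from statistics import NormalDist

def p_value_from_z(z, tail):
    """Return the p-value for a Z statistic under a standard normal null.

    tail: "two" (Ha: mu != mu0), "right" (Ha: mu > mu0), "left" (Ha: mu < mu0).
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "left":
        return std_normal.cdf(z)
    if tail == "right":
        return 1 - std_normal.cdf(z)
    if tail == "two":
        return 2 * (1 - std_normal.cdf(abs(z)))
    raise ValueError("tail must be 'two', 'right', or 'left'")

z = 1.96  # hypothetical test statistic
for tail in ("two", "right", "left"):
    print(tail, round(p_value_from_z(z, tail), 3))
```

The same Z-score gives very different p-values depending on the alternative, which is why the form of Hₐ should be fixed before looking at the data.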
IV. How do we perform Hypothesis Testing?
Let's understand the following three connected concepts.
★ Level of Confidence
The confidence level describes how reliable our estimation procedure is despite natural sample-to-sample variation.
- Example: If we use a 95% confidence level, it means that if we repeated the study many times and built a confidence interval for the average weight loss each time, about 95 out of every 100 intervals would contain the true average.
- A higher confidence level (e.g., 99%) means we are more confident, but it also makes it harder to reject the null hypothesis.
Confidence Level helps answer: How confident are we in our claim?
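One way to see what a 95% confidence level means is to simulate it. The sketch below assumes a normal population with a known standard deviation and hypothetical parameters (true mean 5, σ = 2, n = 40). It builds a 95% interval from each simulated sample and counts how often the interval contains the true mean.

```python
import random
from statistics import NormalDist, fmean

def coverage_rate(true_mean, sigma, n, confidence, trials, seed=0):
    """Fraction of simulated confidence intervals that contain true_mean."""
    rng = random.Random(seed)
    # Critical value, roughly 1.96 for 95% confidence
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z_crit * sigma / n ** 0.5
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        sample_mean = fmean(sample)
        if sample_mean - half_width <= true_mean <= sample_mean + half_width:
            hits += 1
    return hits / trials

rate = coverage_rate(true_mean=5.0, sigma=2.0, n=40, confidence=0.95, trials=4000)
print(f"Intervals containing the true mean: {rate:.1%}")  # should land near 95%
```

The 95% describes the long-run behaviour of the interval-building procedure, not the chance that any single sample mean clears a threshold.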
★ The "Alpha" (α) Threshold ➛ "Significance Level"
In statistics, we usually set a "line in the sand" called the significance level (α).
The significance level (α) is the probability of rejecting H₀ when it is actually true that we are willing to accept.
We compare the p-value to the significance level (α):
- If p-value ≤ α, we say the result is statistically significant, and we reject H₀.
- If p-value > α, we fail to reject H₀, meaning we don’t have enough evidence to support Hₐ.
Example (testing whether a coin is fair, with p-value ≤ α):
Conclusion: There is strong evidence the coin is biased.
Formula: Significance Level (α) = 1 − Confidence Level
Example: If the Confidence Level = 95%, then α = 0.05 (5%).
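The significance level can also be read as the false-alarm rate we are willing to accept. When H₀ is really true, a test at α = 0.05 should still reject it in about 5% of experiments. Here is a small simulation sketch under assumed conditions: normal data, known σ = 1, and a two-tailed Z-test. All parameter values are hypothetical.

```python
import random
from statistics import NormalDist, fmean

def false_rejection_rate(alpha, n, trials, seed=1):
    """Simulate experiments where H0 (mean = 0) is true and count rejections."""
    rng = random.Random(seed)
    std_normal = NormalDist()
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]  # H0 holds: true mean is 0
        z = fmean(sample) / (1.0 / n ** 0.5)              # Z-score with known sigma = 1
        p_value = 2 * (1 - std_normal.cdf(abs(z)))        # two-tailed p-value
        if p_value <= alpha:
            rejections += 1
    return rejections / trials

confidence_level = 0.95
alpha = 1 - confidence_level
rate = false_rejection_rate(alpha, n=30, trials=4000)
print(f"alpha = {alpha:.2f}, observed false rejection rate = {rate:.3f}")
```

Lowering α (for example to 0.01) reduces these false alarms, at the cost of making real effects harder to detect.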
★ What Is the P-Value?
The P-value is the probability of getting results at least as extreme as the ones we observed, assuming that the Null Hypothesis is actually true.
Important: It is a probability calculated under the assumption that H₀ is true.
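To make the definition concrete, the sketch below computes an exact p-value for the "this coin is fair" claim. The data (60 heads in 100 flips) are hypothetical. The two-sided rule used here counts every outcome at least as far from the expected 50 heads as the observed one. This is one common convention for a symmetric null, not the only one.

```python
from math import comb

def fair_coin_p_value(heads, flips):
    """Two-sided exact p-value for H0: P(heads) = 0.5."""
    expected = flips / 2
    observed_distance = abs(heads - expected)
    # Count the equally likely sequences whose head count is at least as
    # extreme as the observed one, then divide by the total number of sequences.
    extreme_outcomes = sum(
        comb(flips, k)
        for k in range(flips + 1)
        if abs(k - expected) >= observed_distance
    )
    return extreme_outcomes / 2 ** flips

p = fair_coin_p_value(heads=60, flips=100)
print(f"p-value = {p:.4f}")
```

Every probability in the sum is computed as if the coin were fair, which is exactly what "assuming H₀ is true" means.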
★ Decision Rule / Interpreting the Number
The P-value is a number between 0 and 1. It tells you how "weird" your data is under the assumption that nothing is happening.
- A low P-value (≤ α) suggests that the observed data is unlikely under H₀, so we reject H₀.
  - (e.g., 0.01): "If H₀ is true, there’s only a 1% chance I’d see results this extreme." Conclusion: This is too weird to be a coincidence. We reject the Null.
- A high P-value (> α) suggests that the observed data is consistent with H₀, so we fail to reject H₀.
  - (e.g., 0.50): "If H₀ is true, there’s a 50% chance I’d see results at least this extreme anyway." Conclusion: This isn't weird. We fail to reject the Null.
Example: If we obtain a P-value of 0.03 and our significance level α = 0.05, then 0.03 < 0.05, so we reject the null hypothesis.
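The decision rule is simple enough to write as a tiny helper. This is a sketch only; the function name `decide` and its default α = 0.05 are illustrative choices.

```python
def decide(p_value, alpha=0.05):
    """Apply the decision rule: reject H0 when p-value <= alpha."""
    if not 0 <= p_value <= 1:
        raise ValueError("a p-value must lie between 0 and 1")
    return "reject H0" if p_value <= alpha else "fail to reject H0"

print(decide(0.03))              # reject H0
print(decide(0.50))              # fail to reject H0
print(decide(0.03, alpha=0.01))  # fail to reject H0 under a stricter threshold
```

Note that the same p-value can lead to different decisions under different α, so α should be chosen before the data are analyzed.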
V. Diagrammatic representation
<img src="Learning/Stats/Pictures/hypothesis_1.png" height="400", width="700">
VI. What the P-Value DOES NOT Mean?
The p-value "is NOT":
- ❌ The probability that the Null Hypothesis is true.
- It’s a conditional probability.
- It assumes the Null is true and then looks at the data.
- It doesn't tell you the "truth" of the world; it just tells you how much the data disagrees with the "nothing is happening" assumption.
- ❌ The probability we are wrong.
- ❌ The probability the alternative is true.
It "is":
- ✅ The probability of the observed data (or more extreme), assuming H₀ is true.
The Bigger Picture
- Hypothesis testing is about: separating signal from noise.
- The null hypothesis represents: pure noise.
- The p-value measures: how incompatible the data is with pure noise.
- Small p-value → data unlikely under noise → evidence of signal.
VII. Hypothesis Testing Example
Problem Statement:
A university claims that the average starting salary of its graduates is at least $70,000 per year. A job market analyst believes the true average starting salary is less than $70,000. To test this claim, the analyst collects a random sample of 200 graduates and records their starting salaries.
After analyzing the data, the results are:
- Sample Mean (x̄) = $68,500
- Population Standard Deviation (σ) = $8,000
- Sample Size (n) = 200
- Significance Level (α) = 0.05
Step 1: Define Hypothesis
- Null Hypothesis (H₀): The average starting salary is at least $70,000.
- Alternate Hypothesis (Hₐ): The average starting salary is less than $70,000.
Step 2: Calculate Z-Score (Test Statistic)
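Since the sample size is large (n = 200) and the population standard deviation is known, we use a Z-test.
The formula for the Z-score is:
$$Z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$$
Substituting the given values:
$$Z = \frac{68{,}500 - 70{,}000}{8{,}000 / \sqrt{200}} = \frac{-1{,}500}{565.69} \approx -2.65$$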
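The substitution can be checked with a few lines of Python. The variable names are illustrative; the numbers are the ones given in the problem statement.

```python
from math import sqrt

sample_mean = 68_500   # sample mean (x-bar)
claimed_mean = 70_000  # mu_0, the value in the university's claim
sigma = 8_000          # population standard deviation
n = 200                # sample size

standard_error = sigma / sqrt(n)
z = (sample_mean - claimed_mean) / standard_error
print(f"standard error = {standard_error:.2f}, Z = {z:.2f}")
# standard error = 565.69, Z = -2.65
```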
Step 3: Find P-Value
Using a Z-table or calculator, the p-value for Z = -2.65 is:
$$P(Z \le -2.65) \approx 0.004$$
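As an alternative to a printed Z-table, the left-tail probability can be looked up with Python's standard library. This is a sketch for this left-tailed test only; a right-tailed or two-tailed test would use a different tail area.

```python
from statistics import NormalDist

z = -2.65  # test statistic from Step 2
p_value = NormalDist().cdf(z)  # left tail: P(Z <= -2.65)
print(f"p-value = {p_value:.4f}")
```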
Step 4: Compare P-Value with Alpha (α)
- Given α = 0.05 and p-value ≈ 0.004
Since p-value < α, we reject the null hypothesis.
Step 5: Conclusion
There is strong statistical evidence that the university’s graduates earn less than $70,000 on average. The analyst rejects the Null Hypothesis (H₀).
Final Summary
- Hypothesis: Testing if the average salary is less than $70,000.
- Null Hypothesis (H₀): The average starting salary of its graduates is at least $70,000
- Alternate Hypothesis (Hₐ): The average starting salary of its graduates is less than $70,000
- Z-score: −2.65
- P-value: 0.004
- Comparison: Since 0.004 < 0.05, we reject H₀.
- Conclusion: The data provide strong evidence against the university’s claim; the true average salary appears to be lower than $70,000.
R-Square and the P-Value
R-square tells you how much of the variation in the data is explained by your model, so an R-square of 0.1 means that the model explains 10% of the variation. A higher R-square means a closer fit to the data, though not necessarily a better model. The p-value, in contrast, comes from the F-test of the null hypothesis that your model fits no better than the intercept-only model. If the p-value is less than the significance level (usually 0.05), your model fits the data significantly better than the intercept-only model.
There are 4 scenarios:
1. Low R-square and Low p-value (p-value <= 0.05)
- It means that your model doesn’t explain much of variation of the data but it is significant (better than not having a model)
2. Low R-square and High p-value (p-value > 0.05)
- It means that your model doesn’t explain much of variation of the data and it is not significant (worst scenario)
3. High R-square and Low p-value
- It means your model explains a lot of variation within the data and is significant (best scenario)
4. High R-square and High p-value
- It means that your model explains a lot of variation within the data but is not significant (model is worthless)
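R-square and the p-value can disagree because the F statistic depends on the sample size as well as on R-square. For a linear regression with an intercept, k predictors, and n observations, F = (R² / k) / ((1 − R²) / (n − k − 1)). The sketch below evaluates this for two hypothetical cases; the numbers are illustrative and do not come from any dataset.

```python
def f_statistic(r_squared, n, k=1):
    """Overall F statistic for a linear model with k predictors and an intercept."""
    if not 0 <= r_squared < 1:
        raise ValueError("r_squared must be in [0, 1)")
    if n <= k + 1:
        raise ValueError("need more observations than fitted parameters")
    return (r_squared / k) / ((1 - r_squared) / (n - k - 1))

# Scenario 1: low R-square but many observations gives a large F (small p-value)
f_low_r2 = f_statistic(0.10, n=1000)
# Scenario 4: high R-square but very few observations gives a small F (larger p-value)
f_high_r2 = f_statistic(0.80, n=4)
print(round(f_low_r2, 1), round(f_high_r2, 1))
```

A larger F corresponds to a smaller p-value for the same degrees of freedom, so a weak but real relationship in a large sample can be significant while a strong-looking fit on a handful of points is not.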