P-Value

I. What is a Hypothesis?

A statistical hypothesis is a statement, assumption, or claim about a population that we test through statistical analysis.
Examples:

“The average height of students is 66 inches.”
“This medicine reduces fever.”
“This coin is fair.”
In statistics, we test claims using data.

II. The Null Hypothesis ( $H_{0}$ )

In any experiment, we start with the default assumption that nothing interesting is happening. This is the Null Hypothesis ( $H_{0}$ ).

The Claim: A new fertilizer makes plants grow taller.
The Null Hypothesis ( $H_{0}$ ): The fertilizer has no effect. Any difference in height is just due to random chance.

The Null Hypothesis (H₀) is the default assumption.
It represents:

No effect
No difference
No change
Status quo
It is what we assume to be true unless strong evidence suggests otherwise.
Example: The average weight loss after taking the supplement is ≤ 5 pounds per month (i.e., the supplement has no significant effect).

III. Alternative Hypothesis ( $H_{a}$ )

The Alternative Hypothesis ( $H_{a}$ ) is the opposite claim: it says that the Null Hypothesis ( $H_{0}$ ) is incorrect. It is usually what you are actually hoping to show.
e.g. The fertilizer does make plants grow taller.

The Alternative Hypothesis is what we are trying to find evidence for.
It represents:

Effect
Difference
Relationship
Example: The average weight loss after taking the supplement is > 5 pounds per month (i.e., the supplement is effective). Evidence for this claim comes from tests or research.

There are three types:

Two-tailed: $H_{a} : μ \neq μ_{0}$
Right-tailed: $H_{a} : μ > μ_{0}$
Left-tailed: $H_{a} : μ < μ_{0}$
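
As a rough sketch of how the three forms change the calculation, the snippet below converts a z statistic into a p-value for each tail. The helper name `p_value_from_z` and the value `z = 1.8` are illustrative assumptions, and the sketch assumes the test statistic follows a standard normal distribution under $H_{0}$.

```python
from statistics import NormalDist

def p_value_from_z(z, tail):
    """Return the p-value for a z statistic (hypothetical helper).

    tail: "two" for H_a: mu != mu_0, "right" for mu > mu_0, "left" for mu < mu_0.
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "left":
        return std_normal.cdf(z)
    if tail == "right":
        return 1 - std_normal.cdf(z)
    if tail == "two":
        return 2 * (1 - std_normal.cdf(abs(z)))
    raise ValueError("tail must be 'two', 'right', or 'left'")

z = 1.8  # illustrative value only
for tail in ("two", "right", "left"):
    print(tail, round(p_value_from_z(z, tail), 4))
```

The same statistic can be significant under one alternative and not under another, which is why the tail should be chosen before looking at the data.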

IV. How do we perform Hypothesis Testing?

Let's understand the three connected concepts below.

★ Level of Confidence

The confidence level describes how reliable our estimation procedure is across repeated samples, given natural sampling variation.

Example: If we use a 95% confidence level and repeat the study many times, building a confidence interval for the average weight loss each time, about 95 out of every 100 of those intervals would contain the true average weight loss.
A higher confidence level (e.g., 99%) means we are more confident, but it also makes it harder to reject the null hypothesis.

Confidence Level helps answer: How confident are we in our claim?
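
One way to see what a confidence level means in practice is to simulate many repeated studies and count how often the interval captures the truth. The sketch below is only an illustration: the true mean of 5.0, the standard deviation of 2.0, the sample size of 40, and the function name `coverage_rate` are all made-up assumptions, and the population standard deviation is treated as known.

```python
import random
from statistics import NormalDist, mean

def coverage_rate(true_mean, sigma, n, confidence, trials, seed=0):
    """Fraction of simulated confidence intervals that contain true_mean."""
    rng = random.Random(seed)
    # Critical value, e.g. about 1.96 for 95% confidence
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * sigma / n ** 0.5
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        x_bar = mean(sample)
        if x_bar - margin <= true_mean <= x_bar + margin:
            hits += 1
    return hits / trials

rate = coverage_rate(true_mean=5.0, sigma=2.0, n=40, confidence=0.95, trials=2000)
print(f"Intervals containing the true mean: {rate:.1%}")
```

The printed share should land close to 95%, though it varies a little with the random seed.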

★ The "Alpha" ( $α$ ) Threshold ➛ "Significance Level"

In Advanced Stats, we usually set a "line in the sand" called the significance level ( $α$ ), typically 0.05.
The significance level ( $α$ ) is the probability of rejecting the Null Hypothesis when it is actually true. It sets the threshold for statistical significance.
We compare the p-value to the significance level ( $α$ ):

If $p\text{-value} \leq α$ , the result is statistically significant and we reject $H_{0}$ .
If $p\text{-value} > α$ , we fail to reject $H_{0}$ , meaning we don’t have enough evidence to support $H_{a}$ .

Example: suppose a test of whether a coin is fair gives a p-value of 0.0059. Since $0.0059 < 0.05$ , we reject H₀.
Conclusion: There is strong evidence the coin is biased.

Formula: Significance Level ( $α$ ) = 1 − Confidence Level
Example: If the Confidence Level = 95%, then α = 0.05 (5%).
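
The formula can be sketched in a few lines, along with the cutoff ("critical value") it implies on the standard normal scale. The function names `alpha_from_confidence` and `critical_z` are hypothetical, and the sketch assumes a z-based test.

```python
from statistics import NormalDist

def alpha_from_confidence(confidence_level):
    """Significance level implied by a confidence level (alpha = 1 - confidence)."""
    return 1 - confidence_level

def critical_z(alpha, two_tailed=True):
    """Cutoff on the standard normal scale beyond which we reject H0."""
    tail_area = alpha / 2 if two_tailed else alpha
    return NormalDist().inv_cdf(1 - tail_area)

for confidence in (0.90, 0.95, 0.99):
    alpha = alpha_from_confidence(confidence)
    print(f"confidence={confidence:.2f}  alpha={alpha:.2f}  "
          f"two-tailed z*={critical_z(alpha):.3f}  "
          f"one-tailed z*={critical_z(alpha, two_tailed=False):.3f}")
```

The cutoff grows as the confidence level rises, which is the sense in which a higher confidence level makes the null hypothesis harder to reject.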

★ What Is the P-Value?

The P-value is the probability of getting results at least as extreme as the ones we observed, assuming that the Null Hypothesis is actually true.

$$p\text{-value} = P(\text{data at least as extreme as observed} \mid H_{0} \text{ true})$$

Important: It is a probability calculated under the assumption that H₀ is true.
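
To make the definition concrete, here is a minimal sketch that computes an exact p-value for the "this coin is fair" hypothesis by adding up binomial probabilities. The data (60 heads in 100 flips) and the function name `binomial_two_sided_p` are hypothetical, and "at least as extreme" is taken to mean "at least as far from the expected number of heads".

```python
from math import comb

def binomial_two_sided_p(heads, flips):
    """Exact two-sided p-value for H0: the coin is fair (P(heads) = 0.5).

    Sums the probability of every outcome at least as far from the
    expected count (flips / 2) as the observed number of heads.
    """
    expected = flips / 2
    observed_distance = abs(heads - expected)
    total = 0.0
    for k in range(flips + 1):
        if abs(k - expected) >= observed_distance:
            total += comb(flips, k) * 0.5 ** flips
    return total

p = binomial_two_sided_p(heads=60, flips=100)  # hypothetical data
print(f"p-value = {p:.4f}")
```

Every probability in the sum is computed as if the coin were fair, which is exactly the "assuming $H_{0}$ is true" condition in the definition.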

★ Decision Rule / Interpreting the Number

The P-value is a number between 0 and 1. It tells you how "weird" your data is under the assumption that nothing is happening.

A low P-value (≤ $α$ ) suggests that the observed data is unlikely under $H_{0}$ , so we reject $H_{0}$ .
- (e.g., 0.01): "If $H_{0}$ is true, there’s only a 1% chance I’d see results this extreme." Conclusion: This is too weird to be a coincidence. We reject the Null.
A high P-value (> $α$ ) suggests that the observed data is consistent with $H_{0}$ , so we fail to reject $H_{0}$ .
- (e.g., 0.50): "If $H_{0}$ is true, there’s a 50% chance I’d see results at least this extreme anyway." Conclusion: This isn't weird. We fail to reject the Null.

Example: If we obtain a P-value of 0.03 and our significance level α = 0.05, then 0.03 < 0.05, so we reject the null hypothesis.
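
The decision rule is simple enough to write as a small function. This is only a sketch; the name `decide` and the default `alpha=0.05` are assumptions chosen for illustration.

```python
def decide(p_value, alpha=0.05):
    """Apply the decision rule: reject H0 when p_value <= alpha."""
    if not 0 <= p_value <= 1:
        raise ValueError("a p-value must lie between 0 and 1")
    return "reject H0" if p_value <= alpha else "fail to reject H0"

for p in (0.01, 0.03, 0.50):
    print(p, "->", decide(p))
```

Note that the function never returns "accept H0": a high p-value only means the evidence against the null is insufficient.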

V. Diagrammatic representation

VI. What the P-Value Does NOT Mean

The p-value "is NOT":

❌ The probability that the Null Hypothesis is true.
- It’s a conditional probability.
- It assumes the Null is true and then looks at the data.
- It doesn't tell you the "truth" of the world; it just tells you how much the data disagrees with the "nothing is happening" assumption.
❌ The probability we are wrong.
❌ The probability the alternative is true.
It "is":
✅ The probability of the observed data (or more extreme), assuming $H_{0}$ is true.

The Bigger Picture

Hypothesis testing is about: signal vs. noise
The null hypothesis represents noise.
The p-value measures:

How incompatible the data is with pure noise.

Small p-value → data unlikely under noise → evidence of signal.
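
A simulation can show what "pure noise" looks like. In the sketch below the null hypothesis is true by construction, so every rejection is a false alarm. The sample size, number of trials, seed, and the function name `false_positive_rate` are arbitrary assumptions, and a two-tailed z-test with known standard deviation is used for simplicity.

```python
import random
from statistics import NormalDist, mean

def false_positive_rate(n, trials, alpha=0.05, seed=1):
    """Share of two-tailed z-tests that reject H0 when H0 is actually true."""
    rng = random.Random(seed)
    std_normal = NormalDist()
    rejections = 0
    for _ in range(trials):
        # Pure noise: the true mean really is 0 and sigma really is 1.
        sample = [rng.gauss(0, 1) for _ in range(n)]
        z = mean(sample) / (1 / n ** 0.5)
        p_value = 2 * (1 - std_normal.cdf(abs(z)))
        if p_value <= alpha:
            rejections += 1
    return rejections / trials

rate = false_positive_rate(n=30, trials=2000)
print(f"Rejected H0 in {rate:.1%} of pure-noise experiments")
```

The rejection share should come out near $α$ (about 5%), which illustrates why $α$ is described as the probability of rejecting the Null Hypothesis when it is actually true.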

VII. Hypothesis Testing Example

Problem Statement:

A university claims that the average starting salary of its graduates is at least $70,000 per year. A job market analyst believes the true average starting salary is less than $70,000. To test this claim, the analyst collects a random sample of 200 graduates and records their starting salaries.

After analyzing the data, the results are:

Sample Mean ( $\bar{x}$ ) = $68,500
Population Standard Deviation ( $σ$ ) = $8,000
Sample Size (n) = 200
Significance Level ( $α$ ) = 0.05

Step 1: Define Hypothesis

Null Hypothesis ( $H_{0}$ ): The average starting salary is at least $70,000.

$$H_{0} : μ \geq 70{,}000$$

Alternate Hypothesis ( $H_{a}$ ): The average starting salary is less than $70,000.

$$H_{a} : μ < 70{,}000$$

Step 2: Calculate Z-Score (Test Statistic)

Since the population standard deviation is known and the sample size is large ( $n \geq 30$ ), we use a Z-test.

The formula for the Z-score is:

$$Z = \frac{\bar{x} - μ}{\frac{σ}{\sqrt{n}}}$$

Substituting the given values:

$$\begin{aligned} Z & = \frac{68500 - 70000}{\frac{8000}{\sqrt{200}}} \\ & \approx \frac{-1500}{565.69} \\ & \approx -2.65 \end{aligned}$$
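
The arithmetic can be checked with a short script. The function name `z_score` and its parameter names are illustrative; the numbers are the ones given in the problem statement.

```python
from math import sqrt

def z_score(sample_mean, hypothesized_mean, sigma, n):
    """One-sample z statistic: (x_bar - mu) / (sigma / sqrt(n))."""
    standard_error = sigma / sqrt(n)
    return (sample_mean - hypothesized_mean) / standard_error

z = z_score(sample_mean=68_500, hypothesized_mean=70_000, sigma=8_000, n=200)
print(round(z, 2))  # -2.65
```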

Step 3: Find P-Value

Using a Z-table or calculator, the p-value for Z = -2.65 is:

$$P(Z < -2.65) \approx 0.004$$
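
Instead of a Z-table, the left-tail probability can be looked up with Python's standard library. This is a minimal sketch that takes the rounded test statistic from Step 2 as given.

```python
from statistics import NormalDist

z = -2.65  # test statistic from Step 2
p_value = NormalDist().cdf(z)  # left-tailed: P(Z < z)
print(round(p_value, 3))  # 0.004
```

The left tail is used because the alternative hypothesis is $μ < 70{,}000$; a two-tailed test would double this value.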

Step 4: Compare P-Value with Alpha (α)

Given $α = 0.05$
$P = 0.004$ is less than $α = 0.05$
Since p-value < $α$ , we reject the null hypothesis.

Step 5: Conclusion

There is strong statistical evidence that the university’s graduates earn less than $70,000 on average. The analyst rejects the Null Hypothesis $(H_{0})$ and concludes that the true average salary is likely lower than $70,000.

Final Summary

Hypothesis: Testing if the average salary is less than $70,000.
Null Hypothesis (H₀): The average starting salary of its graduates is at least $70,000
Alternate Hypothesis (Hₐ): The average starting salary of its graduates is less than $70,000
Z-score: −2.65
P-value: 0.004
Comparison: Since 0.004 < 0.05, we reject H₀.
Conclusion: The university’s claim is likely false, and the true average salary is lower than $70,000.

Relation between $R^{2}$ and p-value

The R-square value tells you how much of the variation in the data is explained by your model. An R-square of 0.1 means that your model explains 10% of the variation within the data. All else equal, a greater R-square indicates a closer fit. The p-value, by contrast, comes from the F-statistic hypothesis test whose null hypothesis is that “the fit of the intercept-only model and your model are equal”. If the p-value is less than the significance level (usually 0.05), your model fits the data significantly better than the intercept-only model; this does not by itself mean the model fits the data well.

There are 4 scenarios

1. Low $R^{2}$ and Low p-value (p-value <= 0.05)

It means that your model doesn’t explain much of the variation in the data, but it is significant (better than not having a model).

2. Low R-square and High p-value (p-value > 0.05)

It means that your model doesn’t explain much of the variation in the data and it is not significant (worst scenario).

3. High R-square and Low p-value

It means your model explains a lot of variation within the data and is significant (best scenario)

4. High R-square and High p-value

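It means that your model explains a lot of variation within the data but is not significant (the apparent fit could be due to chance, which typically happens with very few observations, so the model should not be trusted)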
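
The four scenarios are possible because the F statistic depends on the sample size as well as on R-square. For an ordinary least squares model with an intercept, $F = \frac{R^{2} / k}{(1 - R^{2}) / (n - k - 1)}$, where $n$ is the number of observations and $k$ the number of predictors. The sketch below evaluates this for two made-up cases; the function name `f_statistic` and the input values are assumptions for illustration.

```python
def f_statistic(r_squared, n, k):
    """Overall F statistic for an OLS model with an intercept.

    r_squared: coefficient of determination
    n: number of observations
    k: number of predictors (excluding the intercept)
    """
    if not 0 <= r_squared < 1:
        raise ValueError("r_squared must be in [0, 1)")
    if n - k - 1 <= 0:
        raise ValueError("need more observations than parameters")
    return (r_squared / k) / ((1 - r_squared) / (n - k - 1))

# Hypothetical cases: low R^2 with many observations, high R^2 with few.
print(f_statistic(r_squared=0.10, n=1000, k=1))
print(f_statistic(r_squared=0.80, n=4, k=1))
```

The low R-square model with 1,000 observations produces a far larger F statistic than the high R-square model with 4 observations. Turning F into a p-value requires the F distribution with $(k, n - k - 1)$ degrees of freedom, for example via `scipy.stats.f.sf`, which is not used here so that the sketch runs on the standard library alone.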
