Shapiro-Wilk Test

Among the most powerful tests for normality, especially for small to moderate sample sizes (n < 5000)
Tests the null hypothesis that the data was drawn from a normal distribution
W statistic ranges from 0 to 1 (closer to 1 = more normal)
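
The sample-size caveat above can be handled in code. SciPy's documentation notes that for more than 5000 points the W statistic stays accurate but the p-value may not. The sketch below assumes one possible workaround: testing a random subsample when the data is larger than that. The helper name `shapiro_capped` and its `max_n` and `seed` parameters are hypothetical, and subsampling is only one option, not a standard procedure.

```python
import numpy as np
from scipy.stats import shapiro


def shapiro_capped(data, max_n=5000, seed=0):
    """Run Shapiro-Wilk on at most max_n points (hypothetical helper).

    If the sample is larger than max_n, a random subsample drawn
    without replacement is tested instead of the full data.
    """
    data = np.asarray(data, dtype=float).ravel()
    if data.size > max_n:
        rng = np.random.default_rng(seed)
        data = rng.choice(data, size=max_n, replace=False)
    stat, p_value = shapiro(data)
    return stat, p_value, data.size


# Illustrative data: 20,000 draws from a clearly non-normal distribution
rng = np.random.default_rng(42)
big_sample = rng.exponential(scale=2, size=20_000)

stat, p_value, n_used = shapiro_capped(big_sample)
print(f"n used={n_used}, W={stat:.4f}, p-value={p_value:.3g}")
```

With very large samples, even tiny departures from normality tend to produce small p-values, so a visual check such as a histogram remains useful alongside the test.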

Interpretation

p-value > 0.05: Fail to reject null hypothesis → No evidence against normality; data is consistent with a normal distribution ✓
p-value ≤ 0.05: Reject null hypothesis → Evidence that the data is NOT normally distributed ✗
W close to 1: Data resembles normal distribution
W far from 1: Data deviates from normal distribution
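
The decision rule above can be wrapped in a small helper. This is a minimal sketch, assuming a significance level of 0.05; the function name `interpret_shapiro` and the returned dictionary keys are hypothetical, chosen only for illustration. The inputs are theoretical quantiles rather than random draws so that the result is the same on every run.

```python
import numpy as np
from scipy.stats import expon, norm, shapiro


def interpret_shapiro(data, alpha=0.05):
    """Apply the p-value rule to a sample (hypothetical helper)."""
    stat, p_value = shapiro(data)
    reject = p_value <= alpha
    if reject:
        verdict = "evidence against normality (reject H0)"
    else:
        verdict = "no evidence against normality (fail to reject H0)"
    return {
        "W": float(stat),
        "p_value": float(p_value),
        "reject_h0": bool(reject),
        "verdict": verdict,
    }


# Deterministic inputs: evenly spaced quantiles of each distribution
probs = (np.arange(1, 201) - 0.5) / 200
bell_shaped = norm.ppf(probs)   # follows the normal shape closely
skewed = expon.ppf(probs)       # strongly right-skewed

normal_result = interpret_shapiro(bell_shaped)
skewed_result = interpret_shapiro(skewed)

for label, result in [("Bell-shaped", normal_result), ("Skewed", skewed_result)]:
    print(f"{label}: W={result['W']:.4f}, p={result['p_value']:.3g}, {result['verdict']}")
```

Note that a large p-value does not prove normality; it only means the test found no evidence against it at the chosen significance level.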

Python Example

# Example: Test for Normality using Shapiro-Wilk Test (scipy.stats.shapiro)
from scipy.stats import shapiro
import numpy as np
import matplotlib.pyplot as plt

# Generate sample data: normal and non-normal
# (no random seed is set, so exact values differ from run to run)
normal_data = np.random.normal(loc=0, scale=1, size=1000)
non_normal_data = np.random.exponential(scale=2, size=1000)

# Plot histograms for visual inspection
fig, axes = plt.subplots(1, 2, figsize=(12, 4))
axes[0].hist(normal_data, bins=30, color='skyblue', edgecolor='black')
axes[0].set_title('Normal Data Histogram')
axes[1].hist(non_normal_data, bins=30, color='salmon', edgecolor='black')
axes[1].set_title('Non-Normal Data Histogram')
plt.tight_layout()
plt.show()

# Shapiro-Wilk test for normality
stat_norm, p_norm = shapiro(normal_data)
stat_non_norm, p_non_norm = shapiro(non_normal_data)

print(f"Normal Data: Statistic={stat_norm:.4f}, p-value={p_norm:.4f}")
print(f"Non-Normal Data: Statistic={stat_non_norm:.4f}, p-value={p_non_norm:.4f}")

# Apply the same decision rule to both samples
alpha = 0.05  # significance level
for label, p_value in [("Normal Data", p_norm), ("Non-Normal Data", p_non_norm)]:
    if p_value > alpha:
        print(f"{label}: Likely Gaussian (fail to reject H0)")
    else:
        print(f"{label}: Not Gaussian (reject H0)")

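Output

Values vary from run to run because the data is randomly generated. A p-value printed as 0.0000 is a rounded value, not exactly zero. A representative run: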

Normal Data: Statistic=0.9986, p-value=0.6078
Non-Normal Data: Statistic=0.7950, p-value=0.0000
Normal Data: Likely Gaussian (fail to reject H0)
Non-Normal Data: Not Gaussian (reject H0)