Anderson-Darling Test

Interpretation
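
scipy.stats.anderson returns a test statistic together with critical values at fixed significance levels (15%, 10%, 5%, 2.5% and 1% for dist='norm'). The null hypothesis is that the sample comes from the named distribution. If the statistic exceeds the critical value at a given level, the null hypothesis can be rejected at that level. A statistic below the critical value means only that the test found no evidence against normality, not that normality is proven. The sketch below wraps that rule in a small helper; the name rejected_levels is hypothetical and chosen only for illustration.

```python
import numpy as np
from scipy.stats import anderson


def rejected_levels(sample, dist="norm"):
    """Return the significance levels (in %) at which the null hypothesis is rejected.

    Hypothetical helper: it only applies the "statistic > critical value" rule
    to the result returned by scipy.stats.anderson.
    """
    result = anderson(sample, dist=dist)
    return [
        float(sl)
        for sl, cv in zip(result.significance_level, result.critical_values)
        if result.statistic > cv
    ]


rng = np.random.default_rng(0)  # fixed seed so the sketch is repeatable
skewed = rng.exponential(scale=2, size=1000)

# A strongly skewed sample of this size is rejected at every tabulated level.
print(rejected_levels(skewed))  # [15.0, 10.0, 5.0, 2.5, 1.0]
```

An empty list means normality was not rejected at any of the tabulated levels.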

Python Example

# Example: Test for Normality using Anderson-Darling Test (scipy.stats.anderson)
from scipy.stats import anderson
import numpy as np
import matplotlib.pyplot as plt

# Generate sample data: normal and non-normal
data_normal = np.random.normal(loc=0, scale=1, size=1000)
data_non_normal = np.random.exponential(scale=2, size=1000)

# Plot histograms for visual inspection
fig, axes = plt.subplots(1, 2, figsize=(12, 4))
axes[0].hist(data_normal, bins=30, color='skyblue', edgecolor='black')
axes[0].set_title('Normal Data Histogram')
axes[1].hist(data_non_normal, bins=30, color='salmon', edgecolor='black')
axes[1].set_title('Non-Normal Data Histogram')
plt.tight_layout()
plt.show()

# Anderson-Darling test for normality
result_norm = anderson(data_normal, dist='norm')
result_non_norm = anderson(data_non_normal, dist='norm')

print('Normal Data:')
print(f' Anderson-Darling Statistic: {result_norm.statistic:.4f}')

for sl, cv in zip(result_norm.significance_level, result_norm.critical_values):
	if result_norm.statistic < cv:
		print(f' {sl}%: {cv:.3f} - Data looks normal at {sl}% significance level ✓')
	else:
		print(f' {sl}%: {cv:.3f} - Data does NOT look normal at {sl}% significance level ✗')

print('\nNon-Normal Data:')
print(f' Anderson-Darling Statistic: {result_non_norm.statistic:.4f}')

for sl, cv in zip(result_non_norm.significance_level, result_non_norm.critical_values):
	if result_non_norm.statistic < cv:
		print(f' {sl}%: {cv:.3f} - Data looks normal at {sl}% significance level ✓')
	else:
		print(f' {sl}%: {cv:.3f} - Data does NOT look normal at {sl}% significance level ✗')

Output
[Figure: side-by-side histograms, "Normal Data Histogram" and "Non-Normal Data Histogram" (ML_AI/_feature_engineering/images/anderson-1.png)]
Normal Data:
Anderson-Darling Statistic: 0.3669
15.0%: 0.574 - Data looks normal at 15.0% significance level ✓
10.0%: 0.653 - Data looks normal at 10.0% significance level ✓
5.0%: 0.784 - Data looks normal at 5.0% significance level ✓
2.5%: 0.914 - Data looks normal at 2.5% significance level ✓
1.0%: 1.088 - Data looks normal at 1.0% significance level ✓

Non-Normal Data:
Anderson-Darling Statistic: 45.7276
15.0%: 0.574 - Data does NOT look normal at 15.0% significance level ✗
10.0%: 0.653 - Data does NOT look normal at 10.0% significance level ✗
5.0%: 0.784 - Data does NOT look normal at 5.0% significance level ✗
2.5%: 0.914 - Data does NOT look normal at 2.5% significance level ✗
1.0%: 1.088 - Data does NOT look normal at 1.0% significance level ✗
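
The gap between 0.3669 and 45.7276 is easier to read once the statistic itself is written out. As a rough sketch, the Anderson-Darling statistic for a sample of size n is A^2 = -n - (1/n) * sum over i of (2i - 1) * [ln F(z_i) + ln(1 - F(z_{n+1-i}))], where z_i are the sorted, standardised observations and F is the standard normal CDF. The weights emphasise the tails, which is where an exponential sample departs most from a normal one. The function name anderson_darling_normal is hypothetical, and standardising with the sample mean and the ddof=1 standard deviation is an assumption about how the parameters are estimated; the comparison against scipy at the end is there to check it.

```python
import numpy as np
from scipy.stats import anderson, norm


def anderson_darling_normal(sample):
    """Compute A^2 against a normal distribution fitted by sample mean and std.

    Hypothetical helper for illustration; assumes the standard deviation is
    estimated with ddof=1.
    """
    x = np.sort(np.asarray(sample, dtype=float))
    n = x.size
    z = (x - x.mean()) / x.std(ddof=1)  # standardise with estimated parameters
    i = np.arange(1, n + 1)
    # ln F(z_i) pairs with ln(1 - F(z_{n+1-i})), hence the reversed logsf array.
    return -n - np.sum((2 * i - 1) / n * (norm.logcdf(z) + norm.logsf(z)[::-1]))


rng = np.random.default_rng(42)  # fixed seed so the sketch is repeatable
sample = rng.exponential(scale=2, size=1000)

manual = anderson_darling_normal(sample)
library = anderson(sample, dist="norm").statistic
print(f"manual={manual:.4f}  scipy={library:.4f}")
```

If the two printed values agree, the assumption about the parameter estimates holds for the installed SciPy version.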