Mean Bias Error (MBE)

Definition

MBE averages the signed errors (not their absolute values), revealing whether the model has a systematic tendency to overpredict or underpredict. Unlike most error metrics, it does not measure accuracy; it measures bias.

Individual Loss (Bias):

L(y, \hat{y}) = y - \hat{y}

Mean Bias Error:

\mathrm{MBE} = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)
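The formula is a one-liner in NumPy. A minimal sketch with made-up values:

```python
import numpy as np

# Hypothetical actuals and predictions
y_true = np.array([3.0, 5.0, 2.5, 7.0])
y_pred = np.array([2.8, 5.4, 2.9, 6.6])

# MBE = mean of the signed errors (actual - predicted)
mbe = np.mean(y_true - y_pred)
print(f"MBE: {mbe:.2f}")  # → MBE: -0.05
```

The negative sign indicates that, on average, this model predicts slightly above the actual values.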

Advantages

1. Detects Systematic Bias: reveals whether the model consistently overpredicts or underpredicts.
2. Sign Indicates Direction: with errors defined as actual − predicted, a positive MBE means underprediction and a negative MBE means overprediction.
3. Simple to Calculate: it is just the mean of the residuals.
4. Useful for Calibration: a stable, nonzero MBE can be corrected by adding a constant offset to predictions.

Disadvantages

1. Errors Can Cancel Out: large positive and negative errors can sum to near zero, masking poor accuracy.
2. Not a Measure of Accuracy: an MBE of zero does not mean predictions are close to the actual values.
3. Unreliable on Its Own: it must be reported alongside MAE or RMSE.
4. Can Be Misleading: a model with huge but symmetric errors appears unbiased.
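A quick sketch (with fabricated numbers) of how opposite-signed errors cancel: a model that is off by ±10 everywhere reports the same zero bias as a nearly perfect model.

```python
import numpy as np

y_true = np.array([10.0, 10.0, 10.0, 10.0])

# A wildly inaccurate model whose errors are symmetric
y_bad = np.array([0.0, 20.0, 0.0, 20.0])
# A nearly perfect model
y_good = np.array([9.9, 10.1, 10.0, 10.0])

for name, y_pred in [("bad", y_bad), ("good", y_good)]:
    mbe = np.mean(y_true - y_pred)          # signed errors cancel
    mae = np.mean(np.abs(y_true - y_pred))  # magnitudes do not
    print(f"{name}: MBE={mbe:+.2f}, MAE={mae:.2f}")
# bad: MBE=+0.00, MAE=10.00  -> zero bias, terrible accuracy
```

This is why MBE must always be read next to MAE or RMSE.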

When to Use MBE

  • To diagnose whether a model systematically over- or underpredicts.
  • When calibrating predictions (e.g., correcting a forecast with a constant offset).
  • As a complement to MAE or RMSE, never as a replacement.

When to Avoid MBE

  • As a standalone accuracy metric: opposite-signed errors cancel, so an MBE near zero can hide large errors.
  • When error magnitude matters more than direction; use MAE or RMSE instead.

Scaling and Practical Considerations

1. Does MBE Need Scaled Data?

The short answer: Technically, No. MBE is a diagnostic metric that works on any scale.
The real answer: Practically, Yes for features (to train better models), but be careful with target scaling as it changes interpretation.

2. Key Insight: MBE Measures Bias in Original Units

Because MBE is simply the average of the residuals, it carries the same units as the target; scaling the target rescales the MBE by exactly the same factor.

3. When Does Scaling Help?

Feature scaling applies to all model types that benefit from it (e.g., regularized linear models, SVMs, k-NN, neural networks). It can improve the fitted model, but it does not change what MBE measures, since MBE is computed on the target.

★ Target Scaling (Changes Interpretation)

Be cautious: this changes what MBE means

Without target scaling:

# MBE = -$2.50
# → Model overpredicts by $2.50 on average
# Clear, interpretable in original units ✅

With a standardized target (original std = $10):

# MBE = -0.25
# → Model overpredicts by 0.25 standard deviations
# Still meaningful, but requires understanding of scale ⚠️

With MinMax scaling (0-1):

# If original range is $0-$100
# MBE = -0.025
# → Model overpredicts by 0.025 in normalized units
# → That's $2.50 in original units
# Harder to interpret without reverse calculation ❌

4. Effect of Scaling on MBE

| Scenario       | MBE Value | Interpretation                                    |
|----------------|-----------|---------------------------------------------------|
| Original scale | -2.50     | Overpredicts by 2.50 units ✅ Clear               |
| Standardized   | -0.25     | Overpredicts by 0.25 std devs ✅ Statistical meaning |
| MinMax (0-1)   | -0.025    | Overpredicts by 2.5% of range ⚠️ Less intuitive   |
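The conversions in the table can be checked directly. A sketch assuming the example numbers above (MBE = −$2.50, target std = $10, target range $0–$100):

```python
# Assumed original-scale values from the examples above
mbe_original = -2.50    # dollars
target_std = 10.0       # std of the original target
target_range = 100.0    # max - min of the original target

# Scaling the target divides MBE by the same factor
mbe_standardized = mbe_original / target_std    # → -0.25
mbe_minmax = mbe_original / target_range        # → -0.025

# Multiplying back recovers the original-unit bias
print(mbe_standardized * target_std)   # → -2.5
print(mbe_minmax * target_range)       # → -2.5
```

Nothing is lost by scaling, but every interpretation requires carrying the scale factor around.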

5. Why Scaling Matters for MBE

Feature scaling can change the model you fit, and therefore its bias; target scaling changes the units in which MBE is reported. Keep these two effects separate when comparing experiments.

6. Best Practice for MBE

Scale features as your model requires, but compute and report MBE on the original target scale (inverse-transform predictions first), and always pair it with MAE or RMSE.
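One way to keep MBE in original units even when the target is scaled for training, sketched with scikit-learn's StandardScaler on synthetic data: inverse-transform the predictions before computing the metric.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import StandardScaler

# Synthetic data (hypothetical, for illustration only)
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = 50 + 10 * X[:, 0] + rng.normal(scale=5, size=200)

# Scale the target for training
scaler = StandardScaler()
y_scaled = scaler.fit_transform(y.reshape(-1, 1)).ravel()

model = LinearRegression().fit(X, y_scaled)
pred_scaled = model.predict(X)

# Inverse-transform predictions back BEFORE computing MBE,
# so the bias is reported in original target units
pred_original = scaler.inverse_transform(pred_scaled.reshape(-1, 1)).ravel()
mbe = np.mean(y - pred_original)
print(f"MBE in original units: {mbe:.4f}")
```

Here the MBE comes out near zero on the training data, as expected: ordinary least squares with an intercept forces the residuals to sum to zero.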

Statistical Nuance: "Mean vs. Median"

  • MBE targets the Mean: driving MBE to zero forces the model's average prediction to match the average target, but it says nothing about the magnitude of individual errors.
  • MAE targets the Median: minimizing MAE pushes the model toward predicting the median value.

Why this matters: MBE is about systematic error (bias), not accuracy. MAE is about error size, not direction.
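A small sketch with a skewed toy target makes the distinction concrete: a constant prediction equal to the mean has zero bias but a larger MAE than predicting the median, which minimizes MAE at the cost of nonzero bias.

```python
import numpy as np

# Skewed hypothetical target
y = np.array([1.0, 1.0, 2.0, 2.0, 14.0])
mean_pred = np.full_like(y, y.mean())        # constant prediction = 4.0
median_pred = np.full_like(y, np.median(y))  # constant prediction = 2.0

def mbe(y, p):  # signed bias
    return np.mean(y - p)

def mae(y, p):  # error magnitude
    return np.mean(np.abs(y - p))

print(mbe(y, mean_pred), mae(y, mean_pred))      # → 0.0 4.0
print(mbe(y, median_pred), mae(y, median_pred))  # → 2.0 2.8
```

Predicting the mean is unbiased (MBE = 0) but less accurate (MAE = 4.0); predicting the median is more accurate (MAE = 2.8) but biased (MBE = +2.0).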

Python Code Example

import numpy as np
import seaborn as sns
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error
from sklearn.model_selection import train_test_split
import matplotlib.pyplot as plt

# Load the mpg dataset
mpg = sns.load_dataset('mpg')
mpg = mpg.dropna()

# Prepare data
X = mpg[['horsepower', 'weight']].values
y = mpg['mpg'].values

# Split data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train model
model = LinearRegression()
model.fit(X_train, y_train)

# Predictions
y_pred = model.predict(X_test)

# Calculate MBE manually (sklearn doesn't have MBE)
mbe = np.mean(y_test - y_pred)
print(f"Mean Bias Error (MBE): {mbe:.4f}")

# Interpret the result
if mbe > 0:
    print(f"→ Model tends to UNDERPREDICT by an average of {abs(mbe):.2f} mpg")
elif mbe < 0:
    print(f"→ Model tends to OVERPREDICT by an average of {abs(mbe):.2f} mpg")
else:
    print("→ Model has no systematic bias")

# Compare with other metrics
mae = mean_absolute_error(y_test, y_pred)
rmse = np.sqrt(mean_squared_error(y_test, y_pred))

print(f"\nComparison with other metrics:")
print(f"MBE:  {mbe:.4f} (shows bias direction)")
print(f"MAE:  {mae:.4f} (shows average error magnitude)")
print(f"RMSE: {rmse:.4f} (penalizes large errors)")

# Demonstrate how errors cancel out
errors = y_test - y_pred
print(f"\nError statistics:")
print(f"Positive errors (underpredictions): {np.sum(errors > 0)}")
print(f"Negative errors (overpredictions): {np.sum(errors < 0)}")
print(f"MBE (net): {mbe:.4f}")

# Visualize bias
fig, axes = plt.subplots(1, 2, figsize=(15, 6))

# Plot 1: Residual plot
axes[0].scatter(y_pred, errors, alpha=0.6)
axes[0].axhline(y=0, color='r', linestyle='--', linewidth=2)
axes[0].axhline(y=mbe, color='g', linestyle='--', linewidth=2, label=f'MBE = {mbe:.2f}')
axes[0].set_xlabel('Predicted MPG')
axes[0].set_ylabel('Residual (Actual - Predicted)')
axes[0].set_title('Residual Plot - Visualizing Bias')
axes[0].legend()
axes[0].grid(True, alpha=0.3)

# Plot 2: Error distribution
axes[1].hist(errors, bins=30, alpha=0.7, edgecolor='black')
axes[1].axvline(0, color='r', linestyle='--', linewidth=2, label='Zero bias')
axes[1].axvline(mbe, color='g', linestyle='--', linewidth=2, label=f'MBE = {mbe:.2f}')
axes[1].set_xlabel('Error (Actual - Predicted)')
axes[1].set_ylabel('Frequency')
axes[1].set_title('Distribution of Errors')
axes[1].legend()

plt.tight_layout()
plt.show()

Output

Mean Bias Error (MBE): -1.0425
→ Model tends to OVERPREDICT by an average of 1.04 mpg

Comparison with other metrics:
MBE:  -1.0425 (shows bias direction)
MAE:  3.5057 (shows average error magnitude)
RMSE: 4.2180 (penalizes large errors)

Error statistics:
Positive errors (underpredictions): 26
Negative errors (overpredictions): 53
MBE (net): -1.0425

Figure: ML_AI/images/mb-1.png — residual plot (left) and error distribution (right) produced by the code above.