Root Mean Squared Error (RMSE)

Definition

RMSE is simply the square root of Mean Squared Error (MSE). It brings the error back to the original units of the target variable, making it much easier to interpret.

Formula:

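$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}$$

where n is the number of samples, y_i is the actual value, and \hat{y}_i is the predicted value for sample i.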
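
To make the formula concrete, here is a minimal worked sketch that follows it one step at a time: square each residual, average the squares, then take the square root. The four sample values are made up for illustration and are not taken from the mpg dataset used later.

```python
import math

# Hypothetical actual and predicted values (illustrative only)
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.0, 5.0, 9.0, 9.0]

# Step 1: squared residuals (y_i - y_hat_i)^2
squared_errors = [(y - y_hat) ** 2 for y, y_hat in zip(y_true, y_pred)]

# Step 2: the mean of the squared residuals is the MSE
mse = sum(squared_errors) / len(squared_errors)

# Step 3: the square root brings the value back to the target's units
rmse = math.sqrt(mse)

print(f"Squared errors: {squared_errors}")  # [1.0, 0.0, 4.0, 0.0]
print(f"MSE:  {mse:.4f}")                   # 1.2500
print(f"RMSE: {rmse:.4f}")                  # 1.1180
```

If the targets were measured in mpg, the MSE here would be 1.25 squared mpg, while the RMSE would be about 1.12 mpg.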

Advantages

RMSE keeps all the advantages of MSE and also fixes one of its disadvantages: the error is reported in the units of the target variable rather than in squared units.

1. Same units as the target
2. Smooth and differentiable
3. Penalizes large errors heavily
4. Efficient convergence
5. Mathematical convenience
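
As a rough illustration of the "penalizes large errors heavily" point, the sketch below compares two made-up sets of residuals that have the same mean absolute error (MAE). The helper names `rmse_from_errors` and `mae_from_errors` are hypothetical and defined here only for the example.

```python
import numpy as np

# Two hypothetical sets of residuals with the same mean absolute error
errors_even = np.array([2.0, 2.0, 2.0, 2.0])   # error spread evenly
errors_spiky = np.array([0.0, 0.0, 0.0, 8.0])  # one large miss


def rmse_from_errors(errors):
    """RMSE computed directly from an array of residuals."""
    return float(np.sqrt(np.mean(errors ** 2)))


def mae_from_errors(errors):
    """MAE computed directly from an array of residuals."""
    return float(np.mean(np.abs(errors)))


for name, errors in [("even", errors_even), ("spiky", errors_spiky)]:
    print(f"{name:>5}: MAE = {mae_from_errors(errors):.2f}, "
          f"RMSE = {rmse_from_errors(errors):.2f}")
# even:  MAE = 2.00, RMSE = 2.00
# spiky: MAE = 2.00, RMSE = 4.00
```

Both sets are off by 2 on average, but RMSE doubles when the same total error is concentrated in a single prediction.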

Disadvantages

Because RMSE (Root Mean Squared Error) is derived directly from MSE, it inherits the same mathematical disadvantages.

1. High Sensitivity to Outliers
2. The Scale & Comparison Problem
3. The Normal Distribution Assumption
4. Non-Linearity of Error
5. Gradient Issues Near Zero
6. Sum of Errors vs. Square Root
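
The first two items in the list above can be sketched with a few made-up numbers. This is only an illustration under assumed data: the arrays, the single corrupted value, and the helper name `rmse` are all hypothetical.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length arrays."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


# Hypothetical data: every prediction is off by exactly 1 unit
y_true = np.array([10.0, 12.0, 14.0, 16.0, 18.0])
y_pred = np.array([11.0, 11.0, 15.0, 15.0, 19.0])
print(f"Baseline RMSE:        {rmse(y_true, y_pred):.3f}")          # 1.000

# Outlier sensitivity: a single extreme target value dominates the score
y_true_outlier = y_true.copy()
y_true_outlier[-1] = 38.0  # one residual becomes 19 instead of 1
print(f"RMSE with 1 outlier:  {rmse(y_true_outlier, y_pred):.3f}")  # 8.544

# Scale dependence: the same relative fit in different units gives a
# different RMSE, so scores are not comparable across targets or datasets
scale = 1000.0  # e.g. switching from kilograms to grams
print(f"RMSE after rescaling: {rmse(y_true * scale, y_pred * scale):.3f}")  # 1000.000
```

One corrupted value out of five raises the RMSE from 1.0 to about 8.5, and changing the units alone changes it by the same factor as the units.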

When to Use RMSE

When to Avoid RMSE

Scaling and Practical Considerations

Python Code Example

import numpy as np
import seaborn as sns
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Load the mpg dataset
mpg = sns.load_dataset('mpg')
mpg = mpg.dropna()  # Remove missing values
print("Dataset shape:", mpg.shape)

# Predict mpg based on horsepower and weight
X = mpg[['horsepower', 'weight']].values
y = mpg['mpg'].values

# Split data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train model
model = LinearRegression()
model.fit(X_train, y_train)

# Predictions
y_pred = model.predict(X_test)

# Calculate RMSE
rmse = np.sqrt(mean_squared_error(y_test, y_pred))
print(f"\nRoot Mean Squared Error (RMSE): {rmse:.4f} mpg")
print(f"This means, on average, our predictions are off by about {rmse:.2f} miles per gallon")

# Compare with MSE
mse = mean_squared_error(y_test, y_pred)
print("\nFor comparison:")
print(f"MSE: {mse:.4f} (squared mpg - hard to interpret)")
print(f"RMSE: {rmse:.4f} (mpg - easy to interpret)")

Output

Dataset shape: (392, 9)

Root Mean Squared Error (RMSE): 4.2180 mpg
This means, on average, our predictions are off by about 4.22 miles per gallon

For comparison:
MSE: 17.7918 (squared mpg - hard to interpret)
RMSE: 4.2180 (mpg - easy to interpret)