Mean normalization (mean-centering)

I. Features:
$$X_{\text{norm}} = \frac{X - \text{mean}(X)}{X_{\max} - X_{\min}}$$

where $\text{mean}(X)$, $X_{\max}$ and $X_{\min}$ are calculated separately for each feature.
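
As a quick sanity check of the formula, here is a minimal sketch applied to a single made-up feature (the values in `x` are hypothetical and chosen only for illustration). It checks two consequences of the definition: the normalized feature has a mean of zero, and the gap between its largest and smallest value is 1.

```python
import numpy as np

# Hypothetical single feature (values chosen only for illustration)
x = np.array([2.0, 4.0, 6.0, 12.0])

# Apply X_norm = (X - mean(X)) / (X_max - X_min)
x_norm = (x - x.mean()) / (x.max() - x.min())

print(x_norm)                       # roughly -0.4, -0.2, 0.0, 0.6
print(x_norm.mean())                # zero, up to floating-point rounding
print(x_norm.max() - x_norm.min())  # 1, up to floating-point rounding
```

Note that the values are not forced into a fixed interval such as [0, 1]; only the width of the interval is fixed at 1, and its position depends on where the mean sits between the minimum and the maximum.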

II. Pros:
III. Cons:
IV. Best Use Cases:

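You’ll want to reach for Mean Normalization when your machine learning model is a bit "picky" about where the data is centered, that is, when it works better with features that are centered around zero and share a comparable scale.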

V. When NOT to Use:
  1. You have Sparse Data: If your data is 90% zeros (like a word-count matrix), Mean Normalization will turn all those zeros into a specific negative number (minus the feature's mean divided by its range). Suddenly, your computer has to store millions of tiny nonzero numbers instead of just skipping the zeros, so a matrix that fit comfortably in a sparse format can blow up in memory and crash your program.
  2. You have Extreme Outliers: If you're measuring the wealth of people in a room and Bill Gates walks in, the "Range" becomes so huge that everyone else’s normalized wealth will look almost exactly the same (a slightly negative number, close to zero when the room is large).
  3. The Algorithm doesn't care about centering: Some models, like Decision Trees or Random Forests, don't care about the scale or the mean at all. Using this would just be extra work for no gain!
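
The first two points can be illustrated with a small sketch. Everything in it is made up for illustration: the helper `mean_normalize`, the toy count matrix, and the wealth figures are assumptions, not real data.

```python
import numpy as np

def mean_normalize(X):
    """Column-wise mean normalization: (X - mean) / (max - min)."""
    X = np.asarray(X, dtype=float)
    return (X - X.mean(axis=0)) / (X.max(axis=0) - X.min(axis=0))

# 1. Sparse data: a toy count matrix where only every 10th row is nonzero,
#    so 90% of the entries are zero.
counts = np.zeros((1000, 5))
counts[::10] = [1, 2, 3, 4, 5]
normalized_counts = mean_normalize(counts)

print(np.count_nonzero(counts), "nonzero entries before")
print(np.count_nonzero(normalized_counts), "nonzero entries after")

# 2. Extreme outlier: four ordinary (hypothetical) wealth values and one
#    enormous one in the last row.
wealth = np.array([[40_000.0], [55_000.0], [70_000.0], [90_000.0], [100e9]])
normalized_wealth = mean_normalize(wealth)

# Spread among everyone except the outlier after normalization
others = normalized_wealth[:-1, 0]
print(others.max() - others.min())
```

In the first case every former zero becomes a nonzero value, so the matrix is no longer sparse. In the second, the four ordinary values end up within a tiny fraction of each other, so the differences between them are effectively lost.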
VI. Code Snippet:
import numpy as np
import pandas as pd

# Example dataset: 4 samples (rows), 2 features (columns)
data = np.array([[10, 20], [15, 25], [30, 35], [50, 45]])

# Mean Normalization implementation
# axis=0 computes each statistic per feature (column)
mean = np.mean(data, axis=0)
min_val = np.min(data, axis=0)
max_val = np.max(data, axis=0)
mean_normalized_data = (data - mean) / (max_val - min_val)

# Display the result as a table
pd.DataFrame(mean_normalized_data)
Output:
         0     1
0 -0.40625 -0.45
1 -0.28125 -0.25
2  0.09375  0.15
3  0.59375  0.55

Difference from Standardization: While both center the data at zero, Mean Normalization divides by the Range, whereas Standardization divides by the Standard Deviation.
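
To make the difference concrete, here is a minimal sketch that applies both transformations to the same small example dataset. It assumes the population standard deviation (NumPy's default, `ddof=0`) for Standardization; other tools may use the sample standard deviation instead.

```python
import numpy as np

# Small example dataset: 4 samples (rows), 2 features (columns)
data = np.array([[10, 20], [15, 25], [30, 35], [50, 45]], dtype=float)

# Both methods start by centering each feature at zero
centered = data - data.mean(axis=0)

# Mean Normalization: divide by the range (max - min) of each feature
mean_normalized = centered / (data.max(axis=0) - data.min(axis=0))

# Standardization (z-score): divide by the standard deviation of each feature
# (assumption: population standard deviation, ddof=0)
standardized = centered / data.std(axis=0)

# After Mean Normalization each feature spans a range of 1
print(mean_normalized.max(axis=0) - mean_normalized.min(axis=0))

# After Standardization each feature has a standard deviation of 1
print(standardized.std(axis=0))
```

Both results have a mean of zero per feature; what differs is the quantity fixed to 1: the range for Mean Normalization, the standard deviation for Standardization.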