Quantile Transformer

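A Quantile Transformer is a non-parametric preprocessing tool that maps each feature onto a target distribution, either uniform or normal. Unlike the Power Transformer, which fits a parametric power function (Box-Cox or Yeo-Johnson), the Quantile Transformer relies on the rank of each data point: it estimates the feature's empirical cumulative distribution and maps the resulting quantiles onto the target distribution. Because only ranks matter, the transform is monotonic, so it preserves the order of values but not the distances between them.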
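To make the rank-based idea concrete, here is a minimal sketch of a rank-to-normal mapping. It is a simplified illustration, not scikit-learn's implementation (which estimates quantiles on a grid of reference points and interpolates between them). The helper name `rank_to_normal` and the synthetic exponential sample are assumptions made for the example.

```python
import numpy as np
from scipy import stats


def rank_to_normal(x):
    """Map values to an approximately standard normal shape using only their ranks."""
    x = np.asarray(x, dtype=float)
    # Ranks run from 1 to n; tied values share their average rank.
    ranks = stats.rankdata(x, method="average")
    # Convert ranks to empirical CDF values strictly inside (0, 1).
    u = (ranks - 0.5) / len(x)
    # Push the CDF values through the inverse normal CDF.
    return stats.norm.ppf(u)


# Synthetic, strongly right-skewed sample (illustrative data only).
rng = np.random.default_rng(0)
skewed = rng.exponential(scale=2.0, size=1000)
z = rank_to_normal(skewed)

print("skew before:", stats.skew(skewed))
print("skew after: ", stats.skew(z))
```

The output depends only on the ordering of the inputs, so any monotonic rescaling of `skewed` would produce the same `z`.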

I. Features

II. Best Use Cases

III. When NOT to Use It

IV. Pros

V. Cons

VI. Sample Code

import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
from scipy import stats  # needed for stats.probplot (QQ plots)
from sklearn.datasets import load_wine
from sklearn.preprocessing import QuantileTransformer

# Load a public dataset
data = load_wine()
df = pd.DataFrame(data.data, columns=data.feature_names)
feature = df["alcohol"].to_numpy().reshape(-1, 1)

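# Apply Quantile Transformer
# The wine dataset has only 178 rows, so cap n_quantiles at the sample count;
# the default (1000) exceeds it and triggers a warning.
qt = QuantileTransformer(
    n_quantiles=len(feature),
    output_distribution="normal",
    random_state=0,
)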
transformed = qt.fit_transform(feature)

# Create subplots
fig, axes = plt.subplots(1, 4, figsize=(18, 4))

# Original Data KDE Plot
sns.kdeplot(feature.flatten(), ax=axes[0])
axes[0].set_title('Original Data PDF')

# Original Data QQ Plot
stats.probplot(feature.flatten(), dist='norm', plot=axes[1])
axes[1].set_title('QQ Plot: Original Data')

# QuantileTransformer Data KDE Plot
sns.kdeplot(transformed.flatten(), ax=axes[2])
axes[2].set_title('QuantileTransformer Data PDF')

# QuantileTransformer Data QQ Plot
stats.probplot(transformed.flatten(), dist='norm', plot=axes[3])
axes[3].set_title('QQ Plot: QuantileTransformer Data')

# Adjust layout and display
plt.tight_layout()
plt.show()
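
The sample above targets a normal distribution. As a hedged companion sketch, the same transformer can target a uniform distribution instead. The variable names (`qt_uniform`, `extremes`) and the choice to probe values 5 units outside the training range are assumptions made for illustration.

```python
import numpy as np
from sklearn.datasets import load_wine
from sklearn.preprocessing import QuantileTransformer

# "alcohol" is the first column of the wine dataset; keep it 2-D for scikit-learn.
X = load_wine().data[:, [0]]

qt_uniform = QuantileTransformer(
    n_quantiles=X.shape[0],        # do not request more quantiles than samples
    output_distribution="uniform",
    random_state=0,
)
u = qt_uniform.fit_transform(X)
print("transformed range:", u.min(), u.max())

# Values outside the range seen during fit are mapped to the output bounds.
extremes = np.array([[X.min() - 5.0], [X.max() + 5.0]])
print("out-of-range inputs:", qt_uniform.transform(extremes).ravel())
```

With a uniform target the output lies in [0, 1], which makes the clipping of unseen extreme values easy to observe.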
