LOESS / LOWESS Plot (Locally Weighted Scatterplot Smoothing)

Purpose

Fit a smooth non-parametric curve to data to reveal underlying trends and patterns without assuming a specific functional form.
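
To make "non-parametric" concrete, here is a minimal sketch of the idea behind the smoother: at each point, fit a straight line to the nearest neighbours only, weighting closer points more heavily (tricube weights). The function name loess_sketch and the synthetic data are illustrative assumptions. This is not the statsmodels implementation, and it omits the robustness iterations that real LOWESS adds.

```python
import numpy as np

def loess_sketch(x, y, frac=0.3):
    """Local linear fits with tricube weights (illustrative, no robustness step)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    k = max(3, int(np.ceil(frac * n)))       # neighbours used in each local fit
    fitted = np.empty(n)
    for i in range(n):
        dist = np.abs(x - x[i])
        idx = np.argsort(dist)[:k]           # indices of the k nearest points
        h = dist[idx].max()                  # local bandwidth
        u = dist[idx] / h if h > 0 else np.zeros(k)
        w = (1 - u ** 3) ** 3                # tricube weights in [0, 1]
        # Weighted least squares for a line centred at x[i];
        # the intercept is then the fitted value at x[i].
        X = np.column_stack([np.ones(k), x[idx] - x[i]])
        sw = np.sqrt(w)
        beta, *_ = np.linalg.lstsq(X * sw[:, None], y[idx] * sw, rcond=None)
        fitted[i] = beta[0]
    return fitted

# Assumed synthetic data: a noisy sine wave
rng = np.random.default_rng(0)
x_demo = np.linspace(0, 10, 100)
y_demo = np.sin(x_demo) + rng.normal(scale=0.2, size=x_demo.size)
smooth_demo = loess_sketch(x_demo, y_demo, frac=0.3)
```

Because every local fit is a straight line, the sketch reproduces exactly linear data unchanged, while a curved relationship is followed piece by piece.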

Analysis Type

Bivariate

What to Look For

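1. Linearity: whether the curve stays close to a straight line
2. Trend Direction: whether y rises, falls, or stays flat as x increases
3. Local Behavior: bends, plateaus, or reversals confined to part of the x range
4. Outlier Impact: whether a few extreme points pull the curve locally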

Code Example

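import matplotlib.pyplot as plt
import seaborn as sns
import statsmodels.api as sm

# df is assumed to be an existing pandas DataFrame with numeric
# columns 'x_variable' and 'y_variable'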

# Scatter plot with LOWESS curve
x = df['x_variable']
y = df['y_variable']

# Calculate LOWESS
lowess_result = sm.nonparametric.lowess(y, x, frac=0.3)

# Plot
plt.scatter(x, y, alpha=0.5, label='Data')
plt.plot(lowess_result[:, 0], lowess_result[:, 1], color='red', linewidth=2, label='LOWESS')
plt.title("LOWESS Smoothing")
plt.xlabel("X Variable")
plt.ylabel("Y Variable")
plt.legend()
plt.show()

# Seaborn version (easier, but frac cannot be set: regplot uses the statsmodels default)
sns.regplot(x='x_variable', y='y_variable', data=df,
            lowess=True, scatter_kws={'alpha':0.5},
            line_kws={'color':'red', 'linewidth':2})
plt.title("Scatter Plot with LOWESS Curve")
plt.show()

Pro Tip

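The frac parameter sets the fraction of the data used for each local fit, and so controls smoothness. In statsmodels it defaults to 2/3; the example above sets frac=0.3. Lower values (0.1-0.2) produce wigglier curves that follow the data closely; higher values (0.4-0.6) produce smoother curves. As a rough starting point, try frac=0.2 for large datasets and frac=0.4 for small ones, then adjust by eye. Note that sns.regplot(..., lowess=True) does not expose frac and always uses the statsmodels default. Compare the LOWESS curve to a straight line: if the two are similar, linear regression is probably adequate; if they diverge, consider a non-linear model or a transformation. To assess linearity visually, draw sns.regplot(x=x, y=y, lowess=True) and sns.regplot(x=x, y=y, lowess=False) side by side.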

Figure: ML_AI/_feature_engineering/images/loess-1.png
