Linear vs. Non-Linear Relationships

I. Linear Features

II. Non-Linear Features

How to Identify Them

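Identifying these relationships is a critical step in feature selection, because some algorithms (such as linear regression) assume a linear relationship between the features and the target and fit non-linear data poorly unless the features are transformed.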

A. Visual Inspection (The "Eye Test")

Plot        Linear             Non-Linear
Scatter     Straight trend     Curve
Box         Steady increase    Irregular
Violin      Smooth shift       Uneven shift
Pair plot   Diagonal cloud     Curved cloud
Residual    Random             Pattern
Heatmap     High correlation   Might miss relationship
LOESS       Straight smooth    Curved smooth

1. Scatter Plot
2. Box Plot
3. Violin Plot
4. Pair Plot
5. Residual Plot
6. LOESS
7. Heatmap
8. Histogram Plot
9. KDE Plot

B. Statistical Metrics

You can use mathematical scores to quantify the type of relationship:

1. Pearson Correlation (r)

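Calculate the Pearson correlation coefficient between each numerical feature and the target variable. It measures the strength and direction of a linear relationship on a scale from -1 to 1. A value near 0 means there is no linear relationship, not necessarily that there is no relationship at all.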
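
As a rough illustration of this step, the sketch below computes Pearson's r between each feature and the target with NumPy. The helper pearson_with_target, the feature names, and the synthetic data are all hypothetical and exist only for this example.

```python
import numpy as np

def pearson_with_target(features, target):
    """Return Pearson's r between each named feature and the target."""
    return {name: float(np.corrcoef(values, target)[0, 1])
            for name, values in features.items()}

# Hypothetical synthetic data, for illustration only.
rng = np.random.default_rng(0)
x_linear = rng.uniform(-3, 3, 500)   # enters the target linearly
x_curved = rng.uniform(-3, 3, 500)   # enters the target through its square
target = 2.0 * x_linear + x_curved ** 2 + rng.normal(0, 0.5, 500)

scores = pearson_with_target(
    {"x_linear": x_linear, "x_curved": x_curved}, target
)
for name, r in scores.items():
    print(f"{name}: r = {r:+.2f}")
```

In this setup x_curved genuinely influences the target, yet its r should come out close to zero because the dependence is symmetric rather than linear. That is the case the remaining metrics are meant to catch.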

2. Spearman’s Rank Correlation (Monotonic Relationships)

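If Pearson’s r is low, also compute Spearman’s rank correlation, which detects monotonic (but not necessarily linear) relationships. A Spearman value clearly higher than Pearson’s suggests a monotonic but curved relationship. Spearman can still miss non-monotonic patterns such as a U-shape.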
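
To make the difference between the two coefficients concrete, here is a small sketch that uses scipy.stats.pearsonr and scipy.stats.spearmanr on two synthetic relationships. The data and the results dictionary are assumptions made for the example, not part of any real dataset.

```python
import numpy as np
from scipy.stats import pearsonr, spearmanr

# Hypothetical synthetic data, for illustration only.
rng = np.random.default_rng(0)
x = rng.uniform(0, 5, 500)

relationships = {
    "exp(x)": np.exp(x),             # monotonic but strongly curved
    "(x - 2.5)^2": (x - 2.5) ** 2,   # U-shaped, not monotonic
}

results = {}
for label, y in relationships.items():
    r, _ = pearsonr(x, y)       # strength of the linear relationship
    rho, _ = spearmanr(x, y)    # strength of the monotonic relationship
    results[label] = (r, rho)
    print(f"{label}: Pearson r = {r:.2f}, Spearman rho = {rho:.2f}")
```

For the exponential case Spearman should be at or very near 1 while Pearson stays noticeably lower. For the U-shape both should sit near 0, which shows that a low Spearman value does not rule out a relationship.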

3. Mutual Information
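
Mutual information measures any kind of statistical dependence between a feature and the target, so it can flag relationships that correlation coefficients miss. The sketch below is one possible way to estimate it, assuming scikit-learn's mutual_info_regression is an acceptable estimator for a continuous target. The feature names and synthetic data are hypothetical.

```python
import numpy as np
from sklearn.feature_selection import mutual_info_regression

# Hypothetical synthetic data, for illustration only.
rng = np.random.default_rng(0)
n = 1000
x_u_shape = rng.uniform(-3, 3, n)   # drives the target through x^2
x_noise = rng.uniform(-3, 3, n)     # unrelated to the target
y = x_u_shape ** 2 + rng.normal(0, 0.3, n)

X = np.column_stack([x_u_shape, x_noise])

# Nearest-neighbour based estimate; random_state fixes its internal jitter.
mi = mutual_info_regression(X, y, random_state=0)
pearson = [float(np.corrcoef(X[:, j], y)[0, 1]) for j in range(X.shape[1])]

for name, m, r in zip(["x_u_shape", "x_noise"], mi, pearson):
    print(f"{name}: MI = {m:.2f}, Pearson r = {r:+.2f}")
```

Here the U-shaped feature should get a clearly higher mutual information score than the noise feature even though its Pearson r is close to zero. The scores are estimates with no fixed upper bound, so they are best compared across features rather than read as absolute values.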
4. Polynomial Regression (for Regression)
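
One way to use polynomial regression as a check is to fit a straight line and a higher-degree polynomial to the same feature and compare how well each explains the target. The sketch below does this with numpy.polyfit. The helper r_squared, the choice of degree 2, and the synthetic data are assumptions made for the example.

```python
import numpy as np

def r_squared(x, y, degree):
    """Fit a polynomial of the given degree and return R^2 on the same data."""
    coeffs = np.polyfit(x, y, degree)
    residuals = y - np.polyval(coeffs, x)
    return float(1.0 - residuals.var() / y.var())

# Hypothetical synthetic data, for illustration only.
rng = np.random.default_rng(0)
x = rng.uniform(-3, 3, 300)
y = 1.5 * x ** 2 - x + rng.normal(0, 1.0, 300)

r2_linear = r_squared(x, y, 1)
r2_quadratic = r_squared(x, y, 2)
print(f"R^2 linear:    {r2_linear:.2f}")
print(f"R^2 quadratic: {r2_quadratic:.2f}")
```

A large jump in R^2 from the linear to the polynomial fit suggests a non-linear relationship. Because R^2 on the training data never decreases when the degree increases, small gains are not evidence of curvature, and comparing the fits on held-out data is the safer check.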