Types of Voting
1. Hard Voting (Majority Voting)
For Classification Only
Each model predicts a class label, and the final prediction is the class that receives the most votes.
i. Example
Binary Classification (Fraud Detection):
- Model 1 (Random Forest): Predicts "Fraud"
- Model 2 (Logistic Regression): Predicts "Not Fraud"
- Model 3 (SVM): Predicts "Fraud"
- Model 4 (Neural Network): Predicts "Fraud"
Final Prediction: "Fraud" (3 votes vs. 1 vote)
Multi-Class Classification (Animal Recognition):
- Model 1: Predicts "Cat"
- Model 2: Predicts "Dog"
- Model 3: Predicts "Cat"
- Model 4: Predicts "Cat"
- Model 5: Predicts "Bird"
Final Prediction: "Cat" (3 votes out of 5)
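The vote counting in both examples can be sketched in a few lines of Python. This is a minimal illustration rather than a library implementation: the helper name `hard_vote` is an assumption introduced here, and ties are not handled specially.

```python
from collections import Counter

def hard_vote(predictions):
    """Return (winning label, vote count) for a list of predicted labels."""
    counts = Counter(predictions)
    label, votes = counts.most_common(1)[0]
    return label, votes

# Votes taken from the two examples above
fraud_votes = ["Fraud", "Not Fraud", "Fraud", "Fraud"]
animal_votes = ["Cat", "Dog", "Cat", "Cat", "Bird"]

print(hard_vote(fraud_votes))   # ('Fraud', 3)
print(hard_vote(animal_votes))  # ('Cat', 3)
```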
ii. Tie Breaking
When there's a tie:
- Random selection: Pick randomly among tied classes
- Order-based: Among the tied classes, the one predicted by the earliest model in the list wins
- Confidence-based: Use soft voting as tiebreaker
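The three tie-breaking strategies might look like the sketch below. All function names are hypothetical, the votes and probabilities are made-up illustration data, and the confidence-based variant assumes each model also supplies a `{class: probability}` dict.

```python
import random
from collections import Counter

def tied_classes(votes):
    """Return all classes sharing the highest vote count, in first-seen order."""
    counts = Counter(votes)
    top = max(counts.values())
    return [label for label in counts if counts[label] == top]

def break_tie_random(votes, rng=random):
    """Random selection among the tied classes."""
    return rng.choice(tied_classes(votes))

def break_tie_order(votes):
    """Order-based: the tied class voted for by the earliest model wins."""
    tied = set(tied_classes(votes))
    return next(label for label in votes if label in tied)

def break_tie_confidence(votes, probas):
    """Confidence-based: among tied classes, pick the highest mean probability."""
    tied = tied_classes(votes)
    mean_prob = {c: sum(p.get(c, 0.0) for p in probas) / len(probas) for c in tied}
    return max(mean_prob, key=mean_prob.get)

# Illustration data: a 2-2 tie between "Dog" and "Cat"
votes = ["Dog", "Cat", "Cat", "Dog"]
probas = [
    {"Cat": 0.45, "Dog": 0.55},
    {"Cat": 0.90, "Dog": 0.10},
    {"Cat": 0.80, "Dog": 0.20},
    {"Cat": 0.40, "Dog": 0.60},
]

print(break_tie_order(votes))               # Dog
print(break_tie_confidence(votes, probas))  # Cat
```

Random selection is only reproducible if the random generator is seeded, which is why `break_tie_random` accepts an `rng` argument.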
iii. Characteristics
Advantages:
- Simple and intuitive
- Fast (just counting votes)
- Works with any classifier (doesn't need probability estimates)
- Robust to individual model errors
Disadvantages:
- Treats all models equally by default (votes can be weighted per model, but not by each prediction's confidence)
- Ignores prediction confidence (90% sure vs. 51% sure treated the same)
- Less flexible than soft voting
- Requires all models to share the same set of class labels
2. Soft Voting (Weighted Averaging of Probabilities)
For Classification Only
Each model outputs class probabilities, and the final prediction is based on the averaged probabilities across all models.
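How It Works
Given $M$ models, each producing a probability $P_i(k)$ for every class $k$.
For class $k$, the averaged probability is:
$$\bar{P}(k) = \frac{1}{M} \sum_{i=1}^{M} P_i(k)$$
Where $P_i(k)$ is the probability that model $i$ assigns to class $k$ and $M$ is the number of models.
Final prediction:
$$\hat{y} = \arg\max_{k} \bar{P}(k)$$
That is, choose the class with the highest averaged probability.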
i. Example
Binary Classification (Is this email spam?):
| Model | P(Spam) | P(Not Spam) |
|---|---|---|
| Random Forest | 0.8 | 0.2 |
| Logistic Regression | 0.6 | 0.4 |
| SVM | 0.9 | 0.1 |
Averaged Probabilities:
- P(Spam) = (0.8 + 0.6 + 0.9) / 3 = 0.767
- P(Not Spam) = (0.2 + 0.4 + 0.1) / 3 = 0.233
Final Prediction: Spam (higher averaged probability)
Multi-Class Example (Digit Recognition):
| Model | P(0) | P(1) | P(2) | P(3) | ... |
|---|---|---|---|---|---|
| CNN-1 | 0.1 | 0.7 | 0.1 | 0.05 | ... |
| CNN-2 | 0.15 | 0.6 | 0.15 | 0.05 | ... |
| CNN-3 | 0.2 | 0.5 | 0.2 | 0.05 | ... |
| Average | 0.15 | 0.6 | 0.15 | 0.05 | ... |
Final Prediction: Class 1 (highest averaged probability)
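As a rough sketch, the averaging step can be written in plain Python. The function name `soft_vote` and the dict-based representation of probabilities are assumptions made for this illustration; the numbers come from the spam example above.

```python
def soft_vote(probas):
    """Average per-class probabilities across models and pick the argmax.

    probas: list of {class: probability} dicts, one per model.
    """
    classes = probas[0].keys()
    averaged = {c: sum(p[c] for p in probas) / len(probas) for c in classes}
    winner = max(averaged, key=averaged.get)
    return winner, averaged

spam_probas = [
    {"Spam": 0.8, "Not Spam": 0.2},  # Random Forest
    {"Spam": 0.6, "Not Spam": 0.4},  # Logistic Regression
    {"Spam": 0.9, "Not Spam": 0.1},  # SVM
]

winner, averaged = soft_vote(spam_probas)
print(winner)                      # Spam
print(round(averaged["Spam"], 3))  # 0.767
```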
ii. Weighted Soft Voting
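Assign different weights to models based on their reliability:
$$\bar{P}(k) = \frac{\sum_{i=1}^{M} w_i P_i(k)}{\sum_{i=1}^{M} w_i}$$
Where $w_i$ is the weight of model $i$, $P_i(k)$ is the probability it assigns to class $k$, and $M$ is the number of models.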
Example:
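import numpy as np

# Model 1 has 80% accuracy → weight = 0.8
# Model 2 has 70% accuracy → weight = 0.7
# Model 3 has 90% accuracy → weight = 0.9
weights = [0.8, 0.7, 0.9]
# One row per model, one column per class (values from the spam example above)
all_probs = np.array([[0.8, 0.2], [0.6, 0.4], [0.9, 0.1]])
# np.average normalizes by the sum of the weights
weighted_probs = np.average(all_probs, axis=0, weights=weights)  # ≈ [0.779, 0.221]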
iii. Characteristics
Advantages:
- Accounts for prediction confidence
- Often more accurate than hard voting when the models' probabilities are reasonably well calibrated
- Can weight models by their reliability
- Smooths probability estimates
Disadvantages:
- Requires models that output class probabilities, ideally well calibrated
- Slightly more complex than hard voting
- Assumes probability estimates are meaningful
- Not all models provide good probability estimates
3. Averaging
For Regression Only
Each model predicts a continuous value, and the final prediction is the average (or weighted average) of all predictions.
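Simple Averaging
$$\hat{y} = \frac{1}{M} \sum_{i=1}^{M} \hat{y}_i$$
Weighted Averaging
$$\hat{y} = \frac{\sum_{i=1}^{M} w_i \hat{y}_i}{\sum_{i=1}^{M} w_i}$$
Here $\hat{y}_i$ is the prediction of model $i$, $w_i$ is its weight, and $M$ is the number of models.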
i. Example
House Price Prediction:
- Model 1 (Linear Regression): $350,000
- Model 2 (Random Forest): $370,000
- Model 3 (XGBoost): $365,000
Simple Average: ($350k + $370k + $365k) / 3 = $361,667
Weighted Average (if Model 3 is most reliable):
- Weights: [0.2, 0.3, 0.5]
- Prediction: 0.2×$350k + 0.3×$370k + 0.5×$365k = $363,500
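The two calculations above can be reproduced with a short script. The helper `weighted_average` is an assumption introduced for this sketch; it divides by the sum of the weights, so the weights do not have to add up to 1.

```python
from statistics import mean

def weighted_average(values, weights):
    """Weighted mean, normalized by the sum of the weights."""
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)

# Linear Regression, Random Forest, XGBoost
predictions = [350_000, 370_000, 365_000]
weights = [0.2, 0.3, 0.5]

print(round(mean(predictions)))                       # 361667
print(round(weighted_average(predictions, weights)))  # 363500
```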
ii. Alternative Aggregation Methods
Median (Robust to outliers):
$$\hat{y} = \operatorname{median}(\hat{y}_1, \dots, \hat{y}_M)$$
Trimmed Mean (Remove extremes):
- Sort predictions
- Remove top and bottom 10-20%
- Average remaining predictions
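To see how the median and the trimmed mean react to an outlier, here is a small sketch. The prediction values are hypothetical, and `trimmed_mean` is an illustrative helper that assumes `trim_fraction` is below 0.5 so that at least one value remains.

```python
from statistics import mean, median

def trimmed_mean(values, trim_fraction=0.2):
    """Drop the lowest and highest trim_fraction of values, then average the rest."""
    ordered = sorted(values)
    k = int(len(ordered) * trim_fraction)  # number of values removed from each end
    kept = ordered[k:len(ordered) - k]
    return mean(kept)

# Hypothetical predictions (in $1000s); the last model is far off
predictions = [350, 365, 370, 360, 900]

print(mean(predictions))          # 469
print(median(predictions))        # 365
print(trimmed_mean(predictions))  # 365
```

The plain mean is pulled far above the four consistent predictions, while the median and the trimmed mean stay close to them.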
iii. Characteristics
Advantages:
- Reduces variance (smooths individual predictions)
- Simple and interpretable
- Robust to individual model errors
- Median provides outlier resistance
Disadvantages:
- Doesn't reduce bias (average of biased models is still biased)
- Treats all models equally by default
- May not capture complex interactions
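The variance and bias points can be made precise under a simplifying assumption that the text does not state: each of the $M$ models has error variance $\sigma^2$ and bias $b_i$, and every pair of models has error correlation $\rho$. A sketch of the standard result for the simple average:

```latex
\operatorname{Var}\!\left(\frac{1}{M}\sum_{i=1}^{M}\hat{y}_i\right)
  = \rho\,\sigma^2 + \frac{1-\rho}{M}\,\sigma^2,
\qquad
\operatorname{Bias}\!\left(\frac{1}{M}\sum_{i=1}^{M}\hat{y}_i\right)
  = \frac{1}{M}\sum_{i=1}^{M} b_i
```

With uncorrelated errors ($\rho = 0$) the variance shrinks as $\sigma^2 / M$. With perfectly correlated errors ($\rho = 1$) averaging gives no reduction at all. The bias of the average is the mean of the individual biases, so it does not shrink with $M$ when the models share the same bias.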
Advanced Voting Techniques
1. Dynamic Voting
Adjust weights based on input characteristics:
# Pseudocode
def dynamic_voting(x):
    if feature_A(x) > threshold:
        # Model 1 is better for high feature_A
        weights = [2, 1, 1]
    else:
        # Model 2 is better for low feature_A
        weights = [1, 2, 1]
    return weighted_vote(x, weights)
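A self-contained version of this idea might look as follows. Everything here is an illustrative assumption: inputs are dicts with a `feature_A` key, models are plain callables returning a label, and `weighted_vote` takes the already-computed predictions rather than `x`.

```python
def weighted_vote(predictions, weights):
    """Weighted hard vote: each model's label counts `weight` times."""
    totals = {}
    for label, weight in zip(predictions, weights):
        totals[label] = totals.get(label, 0) + weight
    return max(totals, key=totals.get)

def dynamic_voting(x, models, threshold=0.5):
    """Pick weights from the input, then take a weighted vote."""
    # Assumption: "feature_A" decides which model deserves more trust.
    weights = [2, 1, 1] if x["feature_A"] > threshold else [1, 2, 1]
    predictions = [model(x) for model in models]
    return weighted_vote(predictions, weights)

# Toy stand-in models (hypothetical): each always predicts a fixed label
models = [lambda x: "A", lambda x: "B", lambda x: "C"]

print(dynamic_voting({"feature_A": 0.9}, models))  # A
print(dynamic_voting({"feature_A": 0.1}, models))  # B
```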
2. Confidence-Based Voting
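Only count predictions whose confidence exceeds a threshold:
# Pseudocode
threshold = 0.8
valid_predictions = []
for model in models:
    prob = model.predict_proba(x)
    if max(prob) > threshold:
        valid_predictions.append(model.predict(x))
# Fall back to all models if none is confident enough
if not valid_predictions:
    valid_predictions = [model.predict(x) for model in models]
final = majority_vote(valid_predictions)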
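The filtering step can be tried out with the sketch below. It assumes each model's output is already available as a `{class: probability}` dict, and it falls back to all models when none clears the threshold. That fallback is one possible design choice, not the only one.

```python
from collections import Counter

def confidence_vote(probas, threshold=0.8):
    """Majority vote over models whose top probability exceeds threshold.

    probas: list of {class: probability} dicts, one per model.
    Falls back to all models when none is confident enough.
    """
    top = [max(p.items(), key=lambda kv: kv[1]) for p in probas]  # (label, prob)
    confident = [label for label, prob in top if prob > threshold]
    votes = confident or [label for label, _ in top]
    return Counter(votes).most_common(1)[0][0]

# Illustration data (hypothetical)
probas = [
    {"Spam": 0.95, "Not Spam": 0.05},  # confident, so it is counted
    {"Spam": 0.45, "Not Spam": 0.55},  # below 0.8, ignored
    {"Spam": 0.40, "Not Spam": 0.60},  # below 0.8, ignored
]

print(confidence_vote(probas))                 # Spam
print(confidence_vote(probas, threshold=0.5))  # Not Spam
```

With the stricter threshold, a single confident model overrules two hesitant ones, which is the intended effect of this technique.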
3. Hierarchical Voting
Multi-stage voting:
# Pseudocode
# Stage 1: Fast models vote
if unanimous_agreement(fast_models):
    return quick_prediction
else:
    # Stage 2: Add slow but accurate models
    return full_ensemble_prediction
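A runnable sketch of the two stages is shown below. The models are toy callables invented for illustration, and the function returns the stage number alongside the prediction so it is visible when the slow models were actually used.

```python
from collections import Counter

def majority_vote(votes):
    """Return the most common label."""
    return Counter(votes).most_common(1)[0][0]

def hierarchical_vote(x, fast_models, slow_models):
    """Return (prediction, stage). Slow models run only if fast ones disagree."""
    fast_votes = [model(x) for model in fast_models]
    if len(set(fast_votes)) == 1:  # Stage 1: unanimous agreement
        return fast_votes[0], 1
    slow_votes = [model(x) for model in slow_models]
    return majority_vote(fast_votes + slow_votes), 2  # Stage 2: full ensemble

# Toy stand-in models (hypothetical): the label depends on a single number
fast_models = [lambda x: "pos" if x > 0 else "neg",
               lambda x: "pos" if x > 2 else "neg"]
slow_models = [lambda x: "pos" if x > 1 else "neg"]

print(hierarchical_vote(5, fast_models, slow_models))    # ('pos', 1)
print(hierarchical_vote(1.5, fast_models, slow_models))  # ('pos', 2)
```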
4. Selective Voting
Choose different subsets of models for different samples:
# Pseudocode
# For easy examples, use 3 models
# For hard examples, use all 7 models
difficulty_score = estimate_difficulty(x)
if difficulty_score < threshold:
    models_to_use = [model1, model2, model3]
else:
    models_to_use = all_models
return vote(x, models_to_use)
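The selection logic can be exercised with the following sketch. The seven threshold classifiers and the difficulty estimate (inputs near zero count as hard) are invented purely for illustration. In practice the difficulty score would come from something like model disagreement or distance to a decision boundary.

```python
from collections import Counter

def selective_vote(x, models, estimate_difficulty, threshold=0.5, n_easy=3):
    """Use the first n_easy models for easy inputs and all models otherwise."""
    use = models[:n_easy] if estimate_difficulty(x) < threshold else models
    votes = [model(x) for model in use]
    return Counter(votes).most_common(1)[0][0], len(use)

# Hypothetical setup: 7 threshold classifiers; difficulty is high near 0
models = [lambda x, t=t: "pos" if x > t else "neg"
          for t in (-0.3, -0.2, -0.1, 0.0, 0.1, 0.2, 0.3)]
difficulty = lambda x: 1.0 - min(abs(x), 1.0)

print(selective_vote(2.0, models, difficulty))   # ('pos', 3)
print(selective_vote(0.05, models, difficulty))  # ('pos', 7)
```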
5. Voting with Rejection Option
Reject predictions when models disagree strongly:
# Pseudocode
from collections import Counter

votes = [model.predict(x) for model in models]
# Fraction of models that agree with the most common prediction
agreement = max(Counter(votes).values()) / len(votes)
if agreement < 0.6:
    return "UNCERTAIN - MANUAL REVIEW"
else:
    return majority_vote(votes)
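A minimal runnable version of the rejection rule is sketched below. The function name and the choice to return `None` instead of a message string are assumptions of this sketch. Returning `None` keeps the "rejected" case easy to detect in calling code.

```python
from collections import Counter

def vote_with_rejection(votes, min_agreement=0.6):
    """Return (label or None, agreement); None means the ensemble abstains."""
    label, count = Counter(votes).most_common(1)[0]
    agreement = count / len(votes)
    return (label if agreement >= min_agreement else None), agreement

print(vote_with_rejection(["Cat", "Cat", "Cat", "Dog", "Bird"]))  # ('Cat', 0.6)
print(vote_with_rejection(["Cat", "Cat", "Dog", "Dog", "Bird"]))  # (None, 0.4)
```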