What is a Logit?
A logit is the natural logarithm of the odds ratio. It's a transformation that maps a probability value from [0, 1] to the entire real number line (-∞, +∞).
Mathematical Definition

$$\text{logit}(p) = \ln\left(\frac{p}{1-p}\right)$$

Where:
- Probability $p$: the likelihood of an event happening (e.g., 0.8 for an 80% chance), where $0 < p < 1$
- Odds: the ratio of the probability of an event happening to it not happening, $\text{odds} = \frac{p}{1-p}$
- Logit (log-odds): the natural logarithm of the odds, $\ln\left(\frac{p}{1-p}\right)$
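As a quick sketch of these definitions in plain Python (the function name `logit` is my own choice, not a library API):

```python
import math

def logit(p):
    """Log-odds of a probability p in (0, 1)."""
    return math.log(p / (1 - p))

p = 0.8                 # 80% chance
odds = p / (1 - p)      # 0.8 / 0.2 = 4.0
z = logit(p)            # ln(4) ≈ 1.3863

print(round(odds, 4), round(z, 4))
```

Note that a probability of 0.5 gives odds of 1 and a logit of exactly 0; probabilities above 0.5 give positive logits, and those below give negative ones.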
Analogy: Horse Betting
To understand why this transformation is useful, think about horse betting.
In horse betting, there's a commonly used term called odds. When we say the odds of horse number 5 winning are 3/8, we're saying that out of 11 races, the horse is expected to win 3 of them and lose 8.
Mathematically, odds are expressed as:

$$\text{odds} = \frac{p}{1-p}$$

The odds can take any positive value: as $p$ ranges over $(0, 1)$, the odds range over $(0, +\infty)$.
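The horse-betting arithmetic above can be checked directly (a minimal sketch in plain Python):

```python
# Horse number 5: expected to win 3 of 11 races and lose 8.
wins, losses = 3, 8
p = wins / (wins + losses)   # probability of winning: 3/11
odds = p / (1 - p)           # recovers the quoted odds: 3/8 = 0.375

print(round(odds, 3))
```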
❓Why is this useful?
Linear models (like neural networks before the final activation) produce outputs on the entire real number line. By predicting logits instead of probabilities directly, the model doesn't have to worry about constraining its output to be between 0 and 1. We can then convert the logit back to a probability using the Sigmoid function.
Deriving the Sigmoid Function from Logit
If we set the logit to a variable $z$:

$$z = \ln\left(\frac{p}{1-p}\right)$$

Exponentiating both sides and solving for $p$:

$$e^z = \frac{p}{1-p} \quad\Rightarrow\quad p = \frac{e^z}{1+e^z}$$

By dividing the numerator and denominator by $e^z$:

$$p = \frac{1}{1+e^{-z}} = \sigma(z)$$
This shows that the Sigmoid function is the inverse of the Logit function. It converts a logit back into a probability.
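This inverse relationship can be verified numerically. A minimal sketch in plain Python (the function names are my own):

```python
import math

def logit(p):
    """Log-odds: maps (0, 1) to the real line."""
    return math.log(p / (1 - p))

def sigmoid(z):
    """Inverse of logit: maps the real line back to (0, 1)."""
    return 1 / (1 + math.exp(-z))

# Round trip: sigmoid(logit(p)) should recover p.
for p in [0.1, 0.25, 0.5, 0.9]:
    assert abs(sigmoid(logit(p)) - p) < 1e-12
```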
Context in Machine Learning
- In Neural Networks: the input $z$ is the weighted sum of the last layer, usually represented as:

$$z = \mathbf{w}^\top \mathbf{x} + b$$

- In Softmax: the input is a vector of logits $\mathbf{z} = (z_1, \ldots, z_K)$ for multiple classes, converted to probabilities by:

$$\text{softmax}(\mathbf{z})_i = \frac{e^{z_i}}{\sum_{j=1}^{K} e^{z_j}}$$