What is a Logit?

A logit is the natural logarithm of the odds. It's a transformation that maps a probability value from [0, 1] to the entire real number line (-∞, +∞).

Mathematical Definition
$$\mathrm{logit}(p) = \log\left(\frac{p}{1-p}\right) = z$$

Where:

- $p$ is a probability in $(0, 1)$
- $z$ is the logit, i.e. the corresponding log-odds
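The definition above can be written directly as a small Python function (a minimal sketch using only the standard library; the name `logit` is ours, not a library API):

```python
import math

def logit(p: float) -> float:
    """Log-odds of a probability p in (0, 1)."""
    return math.log(p / (1 - p))

# p = 0.5 means even odds, so the logit is exactly 0.
print(logit(0.5))  # 0.0

# Probabilities above 0.5 give positive logits, below 0.5 negative ones.
print(logit(0.9) > 0, logit(0.1) < 0)
```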

Analogy: Horse Betting

To understand why this transformation is useful, think about horse betting.

In horse betting, there's a commonly used term called odds. When we say the odds of horse number 5 winning are 3/8, we're actually saying that out of 11 races, we expect the horse to win 3 of them and lose 8.

Mathematically, odds are expressed as:

$$\text{odds} = \frac{p(x)}{1 - p(x)}$$

The odds can take any value in $[0, +\infty)$. However, if we take the log of the odds, the range changes to $(-\infty, +\infty)$. This is called the logit function.
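The horse-betting numbers above can be checked directly. This short sketch (plain Python, no libraries beyond the standard `math` module) recovers the bookmaker's 3/8 odds from the win probability and shows the log-odds is negative because the horse is expected to lose more often than win:

```python
import math

wins, losses = 3, 8
p = wins / (wins + losses)   # probability of winning: 3/11
odds = p / (1 - p)           # recovers the bookmaker's odds, 3/8
log_odds = math.log(odds)    # the logit; negative, since p < 0.5
print(odds, log_odds)
```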

Why is this useful?
Linear models (like neural networks before the final activation) produce outputs on the entire real number line $(-\infty, +\infty)$. By predicting logits instead of probabilities directly, the model doesn't have to worry about constraining its output to be between 0 and 1. We can then convert the logit back to a probability using the Sigmoid function.
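To illustrate, here is a minimal sigmoid sketch (standard library only): whatever real number the model outputs, the result always lands strictly between 0 and 1, so the model never has to enforce that constraint itself.

```python
import math

def sigmoid(z: float) -> float:
    """Map a logit z from the real line into (0, 1)."""
    return 1 / (1 + math.exp(-z))

# Any real-valued model output becomes a valid probability:
for z in (-30.0, -2.0, 0.0, 2.0, 30.0):
    p = sigmoid(z)
    assert 0 < p < 1

print(sigmoid(0.0))  # 0.5 -- a logit of 0 means "no preference"
```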

Deriving the Sigmoid Function from Logit

If we set the logit to a variable z (the raw output of a model), we can solve for the probability p:

$$
\begin{aligned}
z &= \log\left(\frac{p}{1-p}\right) \\
e^z &= \frac{p}{1-p} && \text{(exponentiate both sides)} \\
e^z(1-p) &= p && \text{(multiply both sides by } 1-p\text{)} \\
e^z - e^z p &= p \\
e^z &= p + e^z p \\
e^z &= p(1 + e^z) \\
p &= \frac{e^z}{1 + e^z}
\end{aligned}
$$

By dividing the numerator and denominator by ez, we get the familiar sigmoid formula:

$$p = \frac{e^z / e^z}{(1 + e^z) / e^z} = \frac{1}{1/e^z + 1} = \frac{1}{1 + e^{-z}}$$

This shows that the Sigmoid function is the inverse of the Logit function. It converts a logit back into a probability.
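The inverse relationship can be verified numerically. The round-trip below (a small sketch with hypothetical helper names, using only the standard library) applies the logit and then the sigmoid and recovers the original probability up to floating-point error:

```python
import math

def logit(p: float) -> float:
    return math.log(p / (1 - p))

def sigmoid(z: float) -> float:
    return 1 / (1 + math.exp(-z))

# sigmoid(logit(p)) should give back p for any p in (0, 1):
for p in (0.1, 0.25, 0.5, 0.9):
    assert math.isclose(sigmoid(logit(p)), p)
```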

Context in Machine Learning