📊 Odds, Log-Odds, and Odds Ratio

Part I: Foundations

1. Probability vs Odds

Key Question

Why do statisticians use "odds" when probability seems simpler and more intuitive?

This section answers that question by building from basic definitions to the motivation for using odds.

1.1 What is Probability?

Definition: Probability measures the likelihood of an event occurring as a fraction of all possible outcomes.

$$P(\text{event}) = \frac{\text{Number of favorable outcomes}}{\text{Total number of all possible outcomes}}$$

Properties:

Example: Rolling a die

1.2 What are Odds?

Definition: Odds compare the number of times an event occurs to the number of times it does NOT occur.

$$\text{Odds}(\text{event}) = \frac{\text{Number of times event occurs}}{\text{Number of times event does NOT occur}}$$

Properties:

Example: Sports Betting

"The odds in favor of my team winning are 5 to 3"

This means:
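Reading "5 to 3" as 5 wins for every 3 losses, a minimal Python sketch of the two quantities follows. The variable names are illustrative and not part of the original example.

```python
from fractions import Fraction

wins, losses = 5, 3  # "5 to 3" odds in favor

# Odds compare wins to losses; probability compares wins to all games.
odds = Fraction(wins, losses)
probability = Fraction(wins, wins + losses)

print(f"odds in favor = {odds} (about {float(odds):.2f})")
print(f"probability   = {probability} (exactly {float(probability):.3f})")
```

Exact fractions are used here only to make the different denominators (3 versus 8) easy to see.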

1.3 Key Difference: Probability vs Odds

| Aspect | Probability | Odds |
|---|---|---|
| Definition | Favorable / Total | Favorable / Unfavorable |
| Range | $[0, 1]$ | $[0, \infty)$ |
| Example (Win 5, Lose 3) | $\frac{5}{8} = 0.625$ | $\frac{5}{3} \approx 1.67$ |
| Interpretation | "62.5% chance of winning" | "5 to 3 odds in favor" |
| Use Case | General understanding | Betting, logistic regression |

Critical Distinction

🚫 Odds ≠ Probability

  • Probability = $\frac{\text{count of event}}{\text{count of ALL events}}$ (denominator includes event)
  • Odds = $\frac{\text{count of event}}{\text{count of NOT event}}$ (denominator excludes event)

1.4 Converting Between Probability and Odds

From Probability to Odds:

$$\text{Odds} = \frac{p}{1-p}$$

where p is the probability.

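Example: Sports Betting (Contd)

With 5 wins and 3 losses, $p = \frac{5}{8}$ and $1 - p = \frac{3}{8}$, so $\frac{p}{1-p} = \frac{5/8}{3/8} = \frac{5}{3}$, which matches the odds of 5 to 3.

This confirms ➛ $\text{Odds} = \frac{p}{1-p}$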

From Odds to Probability:

$$p = \frac{\text{Odds}}{1 + \text{Odds}}$$

Example Conversions:

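| Probability $p$ | Odds Calculation | Odds Value | Interpretation |
|---|---|---|---|
| 0.5 | $\frac{0.5}{1-0.5} = \frac{0.5}{0.5}$ | 1 | Even odds (1:1) |
| 0.75 | $\frac{0.75}{1-0.75} = \frac{0.75}{0.25}$ | 3 | 3 to 1 in favor |
| 0.25 | $\frac{0.25}{1-0.25} = \frac{0.25}{0.75}$ | 0.33 | 1 to 3 against |
| 0.8 | $\frac{0.8}{1-0.8} = \frac{0.8}{0.2}$ | 4 | 4 to 1 in favor |
| 0.2 | $\frac{0.2}{1-0.2} = \frac{0.2}{0.8}$ | 0.25 | 1 to 4 against |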

Verification Example:
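A round trip through both formulas is one way to check the conversions. The sketch below is a minimal Python version; the function names `prob_to_odds` and `odds_to_prob` are made up for this illustration.

```python
import math

def prob_to_odds(p: float) -> float:
    """Convert a probability with 0 <= p < 1 to odds, p / (1 - p)."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must satisfy 0 <= p < 1")
    return p / (1.0 - p)

def odds_to_prob(odds: float) -> float:
    """Convert non-negative odds back to a probability, odds / (1 + odds)."""
    if odds < 0.0:
        raise ValueError("odds must be non-negative")
    return odds / (1.0 + odds)

for p in (0.5, 0.75, 0.25, 0.8, 0.2):
    odds = prob_to_odds(p)
    # Converting back should recover the original probability.
    assert math.isclose(odds_to_prob(odds), p)
    print(f"p = {p:<4} -> odds = {odds:.2f}")
```

The probability 1 is excluded on purpose: its odds are unbounded, so the division would fail.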

1.5 Why Use Odds Instead of Probability?

Problem with Probability: Probabilities are bounded between 0 and 1, which creates asymmetry:

Advantage of Odds: Odds are symmetric around 1:

However, Odds Still Have Asymmetry:

Solution: Use Log-Odds to achieve perfect symmetry! See Section 2.


2. Log-Odds (Logit Function)

2.1 Definition

Log-Odds (also called logit) is the natural logarithm of the odds.

$$\text{Log-Odds} = \log(\text{Odds}) = \log\left(\frac{p}{1-p}\right)$$

This is called the logit function: $\text{logit}(p) = \log\left(\frac{p}{1-p}\right)$

2.2 Why Log-Odds? The Symmetry Solution

Taking the logarithm of odds creates perfect symmetry:

Example:

Key Insight: The distance from the origin (0) is now exactly the same (1.79) for both scenarios! This symmetry is crucial for statistical modeling.
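The value 1.79 matches the natural log of 6, so a pair of reciprocal odds such as 6 to 1 and 1 to 6 reproduces it. That pair is an assumption made for this sketch, not necessarily the scenario the text had in mind.

```python
import math

odds_for = 6 / 1      # 6 to 1 in favor (assumed example)
odds_against = 1 / 6  # 1 to 6, the reciprocal odds

# On the odds scale the two values sit at very different distances from 1.
print(f"distance from 1: {abs(odds_for - 1):.2f} vs {abs(odds_against - 1):.2f}")

# On the log scale they are mirror images around 0.
log_for = math.log(odds_for)
log_against = math.log(odds_against)
print(f"log-odds: {log_for:.2f} vs {log_against:.2f}")
```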

2.3 Properties of Log-Odds

| Property | Description |
|---|---|
| Range | $(-\infty, +\infty)$ (unbounded in both directions) |
| Neutral Point | 0 (when Odds = 1, meaning $p = 0.5$) |
| Symmetry | $\log(\text{Odds}_1) = -\log(\text{Odds}_2)$ when odds are reciprocals |
| Additivity | Log-odds differences correspond to odds ratios |
| Linearity | Perfect for linear modeling (logistic regression) |

Explain "Symmetry" of Log-Odds

Property: Reciprocal odds have log-odds of equal magnitude and opposite sign, because $\log\left(\frac{1}{\text{Odds}}\right) = -\log(\text{Odds})$.

This property means that switching your perspective from "winning" to "losing" changes the sign of the log-odds, but the absolute value stays exactly the same.

Explain "Additivity" of Log-Odds

Property: Log-odds differences correspond to multiplication (odds ratios).

Logarithms turn multiplication into addition, and division into subtraction. In statistics, when you want to compare the odds of two different groups, you look at the Odds Ratio. Because of additivity, subtracting two log-odds values is mathematically identical to taking the logarithm of their odds ratio.

Let's introduce a second team to see this in action:

Using the natural logarithm, as in Section 2.1:

$$\text{Odds}_A = \frac{5}{3} \quad\Rightarrow\quad \text{Log-Odds}_A = \log\left(\frac{5}{3}\right) = 0.5108$$

$$\text{Odds}_B = \frac{7}{1} \quad\Rightarrow\quad \text{Log-Odds}_B = \log(7) = 1.9459$$

If we want to know how much better Team B is compared to Team A, we can do it two ways:

  1. Using Odds (Division): Find the Odds Ratio.
$$\text{Odds Ratio} = \frac{\text{Odds}_B}{\text{Odds}_A} = \frac{7}{5/3} = 4.2$$

Team B's odds are 4.2 times Team A's odds.

  2. Using Log-odds (Subtraction): Subtract Team A's log-odds from Team B's log-odds.
$$\log(\text{Odds Ratio}) = \log\left(\frac{\text{Odds}_B}{\text{Odds}_A}\right) = \underbrace{\text{Log-Odds}_B - \text{Log-Odds}_A}_{\text{log quotient rule}} = 1.9459 - 0.5108 = 1.4351$$

To see the additivity connection, take the logarithm of the Odds Ratio (4.2) directly; you get the same number: $$\log(4.2) = 1.4351$$

Why this matters:
In logistic regression models, this property means that adding a coefficient to the log-odds (e.g., adding 1.4351) multiplies the real-world odds of the event (e.g., by 4.2).
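The two-team comparison can be checked numerically. The quotient rule holds for a logarithm of any base, so this sketch verifies it with both the natural log and base 10; only the natural-log difference can be exponentiated with `math.exp` to get back the odds ratio.

```python
import math

odds_a = 5 / 3  # Team A: 5 wins to 3 losses
odds_b = 7 / 1  # Team B: 7 wins to 1 loss

odds_ratio = odds_b / odds_a

# Subtracting log-odds equals taking the log of the odds ratio, in any base.
for name, log in (("ln", math.log), ("log10", math.log10)):
    difference = log(odds_b) - log(odds_a)
    assert math.isclose(difference, log(odds_ratio))
    print(f"{name}: log-odds difference = {difference:.4f}")

# Exponentiating the natural-log difference recovers the odds ratio.
recovered = math.exp(math.log(odds_b) - math.log(odds_a))
print(f"odds ratio = {odds_ratio:.1f}, recovered = {recovered:.1f}")
```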

2.4 Comprehensive Conversion Table

| Probability $p$ | Odds | Log-Odds (logit) | Interpretation |
|---|---|---|---|
| 0.01 | 0.0101 | -4.60 | Extremely unlikely |
| 0.1 | 0.111 | -2.20 | Very unlikely |
| 0.25 | 0.333 | -1.10 | Unlikely |
| 0.5 | 1.0 | 0 | Neutral (50-50) |
| 0.75 | 3.0 | 1.10 | Likely |
| 0.9 | 9.0 | 2.20 | Very likely |
| 0.99 | 99.0 | 4.60 | Extremely likely |

Visual Pattern: Notice how log-odds are symmetric around 0:
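One way to see the pattern is to regenerate the table and compare each probability with its complement. This is a small Python sketch; `logit` is a helper defined here, not a library call.

```python
import math

def logit(p: float) -> float:
    """Natural log of the odds p / (1 - p), for 0 < p < 1."""
    return math.log(p / (1.0 - p))

print(f"{'p':>5} {'odds':>8} {'log-odds':>9}")
for p in (0.01, 0.1, 0.25, 0.5, 0.75, 0.9, 0.99):
    print(f"{p:>5} {p / (1 - p):>8.3f} {logit(p):>9.2f}")

# Complementary probabilities p and 1 - p give log-odds of opposite sign.
for p in (0.01, 0.1, 0.25):
    assert math.isclose(logit(p), -logit(1 - p))
```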

2.5 Calculating Log-Odds (Example)

Given: Team wins 5 times, loses 3 times (from earlier example)

Step 1: Calculate probability

$$p = \frac{5}{5+3} = \frac{5}{8} = 0.625$$

Step 2: Calculate odds

$$\text{Odds} = \frac{5}{3} \approx 1.67$$

Or alternatively:

$$\text{Odds} = \frac{p}{1-p} = \frac{0.625}{1-0.625} = \frac{0.625}{0.375} \approx 1.67$$

Step 3: Calculate log-odds

$$\text{Log-Odds} = \log(1.67) \approx 0.51$$

Interpretation:

2.6 The Logit Function in Context

Mathematical Overview of Logit functions

$$\text{logit}(p) = \log\left(\frac{p}{1-p}\right) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_n x_n$$

Why this matters:

Inverse (Sigmoid Function):

$$p = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x_1 + \cdots)}} = \frac{e^{\text{log-odds}}}{1 + e^{\text{log-odds}}}$$

This is why the sigmoid function appears in logistic regression!
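The inverse relationship can be sketched in a few lines of Python. The coefficients `beta0`, `beta1` and the feature value `x1` below are hypothetical numbers chosen only for illustration; they do not come from a fitted model.

```python
import math

def logit(p: float) -> float:
    """Log-odds of a probability, for 0 < p < 1."""
    return math.log(p / (1.0 - p))

def sigmoid(z: float) -> float:
    """Inverse of logit: maps a real-valued log-odds to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical coefficients and feature value (illustration only).
beta0, beta1 = -1.0, 0.5
x1 = 3.0

log_odds = beta0 + beta1 * x1  # linear predictor on the log-odds scale
p = sigmoid(log_odds)          # mapped back to the probability scale

print(f"log-odds = {log_odds:.2f}, p = {p:.3f}")

# Applying logit to the result returns the linear predictor.
assert math.isclose(logit(p), log_odds)
```

The model is linear on the log-odds scale, and the sigmoid only translates the result back into a probability between 0 and 1.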


3. Odds Ratio

Transition to Comparisons

Now that we understand what odds are and why log-odds are useful, we need a way to compare odds between two groups. This is where the odds ratio comes in.

3.1 Definition

Odds Ratio (OR) compares the odds of an event occurring in two different groups.

$$\text{Odds Ratio} = \frac{\text{Odds in Group 1}}{\text{Odds in Group 2}}$$

Purpose: Measure the strength of association between an exposure (e.g., treatment, risk factor) and an outcome (e.g., disease, success).

3.2 Interpreting Odds Ratios

| Odds Ratio | Interpretation |
|---|---|
| OR = 1 | No association (exposure doesn't affect outcome) |
| OR > 1 | Positive association (exposure increases odds of outcome) |
| OR < 1 | Negative association (exposure decreases odds of outcome) |
| OR = 2 | Exposure doubles the odds of outcome |
| OR = 0.5 | Exposure halves the odds of outcome |
| OR = 3 | Exposure triples the odds of outcome |

3.3 Example: Smoking and Lung Cancer

Scenario: Study of 1000 people

| | Lung Cancer | No Lung Cancer | Total |
|---|---|---|---|
| Smokers | 80 | 120 | 200 |
| Non-smokers | 20 | 780 | 800 |
| Total | 100 | 900 | 1000 |

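Step 1: Calculate odds for each group

$$\text{Odds}_{\text{smokers}} = \frac{80}{120} \approx 0.667$$

$$\text{Odds}_{\text{non-smokers}} = \frac{20}{780} \approx 0.0256$$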

Step 2: Calculate Odds Ratio

$$\text{OR} = \frac{\text{Odds}_{\text{smokers}}}{\text{Odds}_{\text{non-smokers}}} = \frac{0.667}{0.0256} \approx 26.0$$

Interpretation: In this study, smokers have 26 times the odds of developing lung cancer compared to non-smokers.
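The same calculation can be sketched in Python directly from the four cell counts of the table. The variable names are illustrative; the cross-product form is an algebraically equivalent shortcut.

```python
# Counts from the 2x2 table above.
smokers_cancer, smokers_healthy = 80, 120
nonsmokers_cancer, nonsmokers_healthy = 20, 780

odds_smokers = smokers_cancer / smokers_healthy           # 80 / 120
odds_nonsmokers = nonsmokers_cancer / nonsmokers_healthy  # 20 / 780
odds_ratio = odds_smokers / odds_nonsmokers

# Equivalent cross-product form: (a * d) / (b * c).
cross_product = (smokers_cancer * nonsmokers_healthy) / (
    smokers_healthy * nonsmokers_cancer
)

print(f"odds (smokers)     = {odds_smokers:.3f}")
print(f"odds (non-smokers) = {odds_nonsmokers:.4f}")
print(f"odds ratio         = {odds_ratio:.1f}")
```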

3.4 Odds Ratio in Case-Control Studies

Odds ratios are particularly useful in retrospective studies (case-control studies) where you:

  1. Start with cases (people with disease) and controls (people without)
  2. Look backwards to see exposure rates

Why not use Relative Risk?

4. Log-Odds Ratio

Final Transformation

Just as we transformed odds to log-odds for better mathematical properties, we do the same for odds ratios.

4.1 Definition

Log-Odds Ratio (also called log OR) is the natural logarithm of the odds ratio.

$$\text{Log-Odds Ratio} = \log(\text{OR}) = \log\left(\frac{\text{Odds}_1}{\text{Odds}_2}\right)$$

Equivalently:

$$\text{Log-OR} = \log(\text{Odds}_1) - \log(\text{Odds}_2) = \text{Log-Odds}_1 - \text{Log-Odds}_2$$

4.2 Why Use Log-Odds Ratio?

Advantages over regular Odds Ratio

  1. Symmetry:

    • OR = 2 means doubling the odds (Case: odds of 2 wins to 1 loss)
    • OR = 0.5 means halving the odds (Case: odds of 1 win to 2 losses)
    • But 2 and 0.5 are NOT symmetric around 1
    • log(2) = 0.69 and log(0.5) = -0.69 ARE symmetric around 0!
  2. Additivity:

    • Combining multiple effects: add log-odds ratios
    • Example: Effect A (log-OR = 0.5) + Effect B (log-OR = 0.3) = Combined (log-OR = 0.8), assuming the two effects do not interact
  3. Statistical Properties:

    • Log-OR is approximately normally distributed (good for statistical tests)
    • Confidence intervals are symmetric on log scale
    • Easier to work with in regression models
  4. Effect Size Interpretation:

    • Positive log-OR: increased odds
    • Negative log-OR: decreased odds
    • Zero log-OR: no effect
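The symmetry and additivity points in the list above can be checked with a short Python sketch. The log-OR values 0.5 and 0.3 are the illustrative numbers from the list, and the combination assumes the two effects do not interact.

```python
import math

# Symmetry: reciprocal odds ratios have log-ORs of opposite sign.
print(f"log(2) = {math.log(2):.2f}, log(0.5) = {math.log(0.5):.2f}")

# Additivity: adding log-ORs corresponds to multiplying the odds ratios
# (assuming the two effects do not interact).
log_or_a, log_or_b = 0.5, 0.3
combined_log_or = log_or_a + log_or_b
combined_or = math.exp(combined_log_or)

assert math.isclose(combined_or, math.exp(log_or_a) * math.exp(log_or_b))
print(f"combined log-OR = {combined_log_or:.1f}, combined OR = {combined_or:.2f}")
```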

4.3 Interpreting Log-Odds Ratios

| OR | Log-OR | Interpretation |
|---|---|---|
| 0 | $-\infty$ | Impossible in exposed group |
| 0.14 | -2.0 | 86% reduction in odds |
| 0.37 | -1.0 | 63% reduction in odds |
| 0.5 | -0.69 | Halves the odds |
| 1 | 0 | No effect |
| 2 | 0.69 | Doubles the odds |
| 2.72 | 1.0 | 172% increase in odds |
| 7.39 | 2.0 | 639% increase in odds |
| $+\infty$ | $+\infty$ | Certain in exposed group |
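The percentage figures in the table follow from exponentiating the log-OR and comparing the result with 1. A minimal Python sketch, where `percent_change_in_odds` is a helper name invented for this example:

```python
import math

def percent_change_in_odds(log_or: float) -> float:
    """Percent change in odds implied by a log-odds ratio: (OR - 1) * 100."""
    return (math.exp(log_or) - 1.0) * 100.0

for log_or in (-2.0, -1.0, 0.0, 1.0, 2.0):
    print(
        f"log-OR = {log_or:+.1f} -> OR = {math.exp(log_or):.2f}, "
        f"change in odds = {percent_change_in_odds(log_or):+.0f}%"
    )
```

Equal steps on the log-OR scale give unequal percentage changes, which is why effects are usually compared on the log scale.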

4.4 Example Calculation

Using our smoking example where OR = 26:

$$\text{Log-OR} = \log(26) \approx 3.26$$

Interpretation:
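Section 4.2 notes that confidence intervals are symmetric on the log scale. As a rough sketch for the smoking data, the code below uses the standard large-sample (Woolf) approximation for the standard error of a log-OR and a 1.96 multiplier for a 95% interval. Both choices are assumptions brought in for illustration; they are not part of the original example.

```python
import math

# Cell counts from the smoking table in Section 3.3.
a, b = 80, 120   # smokers: cancer, no cancer
c, d = 20, 780   # non-smokers: cancer, no cancer

log_or = math.log((a * d) / (b * c))

# Woolf approximation (assumed here): SE = sqrt(1/a + 1/b + 1/c + 1/d).
se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)

# The interval is symmetric around log_or on the log scale.
lower, upper = log_or - 1.96 * se, log_or + 1.96 * se

# Back on the OR scale it is no longer symmetric around 26.
lower_or, upper_or = math.exp(lower), math.exp(upper)

print(f"log-OR = {log_or:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
print(f"OR     = {math.exp(log_or):.1f}, 95% CI = ({lower_or:.1f}, {upper_or:.1f})")
```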

Summary

Key Takeaways

  1. Probability vs Odds: Different ways to express likelihood

    • Probability: fraction of total
    • Odds: ratio of favorable to unfavorable
  2. Log-Odds (Logit): Transforms probabilities in (0,1) to (-∞,+∞)

    • Creates symmetry
    • Foundation of logistic regression
    • Makes effects additive
  3. Odds Ratio: Measures association strength

    • OR = 1: No association
    • OR > 1: Positive association
    • OR < 1: Negative association