Chapter 9: Logistic Distribution
1. What is the logistic distribution really?
The logistic distribution is a continuous symmetric probability distribution that looks very similar to the normal distribution — but with heavier tails.
Its cumulative distribution function (CDF) has a characteristic S-shape, which is why the distribution is so important for modeling growth processes and probability transitions.
Key visual features:
- Bell-shaped density (symmetric around the mean)
- Heavier tails than normal → more probability mass in the extremes
- The CDF is the famous logistic function (also called sigmoid)
The logistic function traces a smooth S-curve: flat near 0 on the far left, rising steeply through the middle, and flattening out near 1 on the far right.
This S-shape appears everywhere when something transitions from “almost 0” to “almost 1” (or low → high, failure → success, etc.).
2. Two main parameterizations you will see
1) Location-scale form (most common in statistics / NumPy / SciPy)
- μ (location) = mean = center of the distribution
- s (scale) = controls the width / steepness
- Standard logistic: μ = 0, s = 1

2) Logistic regression form (very common in machine learning)
- Often written with scale = 1 (or implicitly s = 1)
- The CDF becomes exactly the sigmoid function used in logistic regression:

P(Y=1) = 1 / (1 + exp(-(x − μ)/s))
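As a quick sanity check, the sigmoid formula written out by hand matches SciPy's logistic CDF for any location/scale (the values of μ and s below are just illustrative):

```python
import numpy as np
from scipy.stats import logistic

mu, s = 2.0, 1.5
x = np.linspace(-5, 10, 7)

# Sigmoid formula written out by hand
sigmoid = 1.0 / (1.0 + np.exp(-(x - mu) / s))

# SciPy's logistic CDF with the same location and scale
cdf = logistic.cdf(x, loc=mu, scale=s)

print(np.allclose(sigmoid, cdf))  # True
```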
3. Generating logistic random numbers in NumPy
NumPy does ship a built-in sampler, np.random.logistic(), but scipy.stats.logistic is often more convenient because it also exposes the PDF, CDF, and other distribution methods.
```python
from scipy.stats import logistic

# Standard logistic (μ=0, s=1)
standard_logistic = logistic.rvs(size=100000)

# Custom location & scale
mu = 50
s = 8
custom = logistic.rvs(loc=mu, scale=s, size=80000)
```
Alternative way using NumPy (transform uniform):
```python
import numpy as np

mu, s = 50, 8

# Logistic = inverse CDF (quantile function) applied to a uniform sample
u = np.random.rand(100000)
logistic_transformed = mu + s * np.log(u / (1 - u))
```
Both draw from the same distribution (the individual samples differ, but their distributions match).
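A quick way to convince yourself the two samplers agree in distribution is to compare their sample moments against the theoretical mean μ and standard deviation πs/√3 (a sketch; the seed is arbitrary):

```python
import numpy as np
from scipy.stats import logistic

rng = np.random.default_rng(42)
mu, s = 50, 8

# Method 1: SciPy's sampler
scipy_samples = logistic.rvs(loc=mu, scale=s, size=80000, random_state=rng)

# Method 2: inverse-CDF transform of uniforms
u = rng.random(80000)
transformed = mu + s * np.log(u / (1 - u))

theoretical_sd = np.pi * s / np.sqrt(3)
print(scipy_samples.mean(), transformed.mean())  # both ≈ 50
print(scipy_samples.std(), transformed.std())    # both ≈ 14.51
```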
4. Visual comparison: Logistic vs Normal
This is the most important picture to understand the difference.
```python
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
from scipy import stats
from scipy.stats import logistic

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(14, 5.5))

# Same mean & matched spread
mu, s_log = 0, 0.55   # s = √3/π ≈ 0.551 gives variance ≈ 1
sigma_norm = 1

# Generate data
log_data = logistic.rvs(loc=mu, scale=s_log, size=80000)
norm_data = np.random.normal(loc=mu, scale=sigma_norm, size=80000)

# Plot densities
sns.kdeplot(log_data, label=f"Logistic (s={s_log})", color="teal", lw=2.5, ax=ax1)
sns.kdeplot(norm_data, label="Normal (σ=1)", color="coral", lw=2.5,
            linestyle="--", ax=ax1)
ax1.set_title("Density comparison – Logistic vs Normal", fontsize=13)
ax1.set_xlim(-6, 6)
ax1.legend(fontsize=11)

# Cumulative distribution (CDF) — the famous sigmoid
x = np.linspace(-6, 6, 1000)
ax2.plot(x, logistic.cdf(x, loc=mu, scale=s_log),
         color="teal", lw=2.5, label="Logistic CDF")
ax2.plot(x, stats.norm.cdf(x, loc=mu, scale=sigma_norm),
         color="coral", lw=2.5, linestyle="--", label="Normal CDF")
ax2.set_title("Cumulative distribution (CDF) – the S-curve", fontsize=13)
ax2.legend(fontsize=11)

plt.tight_layout()
plt.show()
```
What you should notice:
- Logistic has heavier tails → more probability far from the mean
- Logistic CDF rises more slowly near the tails → wider transition zone
- With matched variance, the logistic is actually slightly more peaked at the center (density 1/(4s) ≈ 0.454 vs ≈ 0.399 for the normal), so its CDF is a bit steeper right at the mean
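To put numbers on the tail claim, here is a quick check using SciPy's survival function (sf = 1 − CDF), with the logistic scale s = √3/π chosen so its variance matches the standard normal's:

```python
import numpy as np
from scipy.stats import logistic, norm

# s = √3/π gives the logistic variance π²s²/3 = 1, matching N(0, 1)
s = np.sqrt(3) / np.pi

for x in [2, 3, 4]:
    log_tail = logistic.sf(x, scale=s)   # P(X > x), logistic
    norm_tail = norm.sf(x)               # P(X > x), standard normal
    print(f"x={x}: logistic {log_tail:.2e} vs normal {norm_tail:.2e}")
```

The gap widens rapidly: at x = 4 the logistic tail probability is more than twenty times the normal's, which is exactly what "heavier tails" means in practice.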
5. Real-world situations where logistic appears naturally
| Field | Typical use of logistic distribution / function |
|---|---|
| Logistic regression | The sigmoid = CDF of logistic is the link function |
| Growth models | Population growth, technology adoption, spread of ideas |
| Credit scoring / risk modeling | Probability of default, churn probability |
| Bioassay / dose-response | Probability of response as function of dose |
| Psychometrics | Item response theory (Rasch model, 2PL model) |
| Market penetration | Fraction of market that adopted a product |
| Survival analysis | Log-logistic distribution (accelerated failure time) |
| Neural networks | Sigmoid activation (historically), still used in some contexts |
6. Realistic code patterns you will actually write
Pattern 1 – Simulate probability of event
```python
import numpy as np
from scipy.stats import logistic

# Temperature → probability of air conditioner being on
temp = np.random.normal(28, 4, 10000)          # summer temperatures
prob_on = logistic.cdf(temp, loc=26, scale=3)  # 50% chance at 26°C
ac_on = np.random.rand(10000) < prob_on

print(f"Average AC usage: {ac_on.mean():.1%}")
```
Pattern 2 – Compare logistic vs normal tails
```python
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats
from scipy.stats import logistic

extreme = np.linspace(-8, 8, 10000)
logistic_tail = 1 - logistic.cdf(extreme, loc=0, scale=1)
normal_tail = 1 - stats.norm.cdf(extreme)

plt.semilogy(extreme, logistic_tail, label="Logistic tail", lw=2.5)
plt.semilogy(extreme, normal_tail, label="Normal tail", lw=2.5, linestyle="--")
plt.title("Log-scale tail probability – Logistic has heavier tails")
plt.ylabel("1 − CDF (log scale)")
plt.legend()
plt.grid(True, which="both", ls="--", alpha=0.5)
plt.show()
```
Pattern 3 – Logistic CDF as smooth step function
```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import logistic

x = np.linspace(-6, 6, 1000)
for scale in [0.5, 1.0, 2.0, 4.0]:
    plt.plot(x, logistic.cdf(x, loc=0, scale=scale),
             label=f"s = {scale}", lw=2.2)

plt.title("Logistic CDF – different scale parameters")
plt.xlabel("x")
plt.ylabel("Probability / proportion")
plt.legend(title="Scale parameter")
plt.show()
```
Summary – Logistic Distribution Quick Reference
| Property | Value / Formula / Behavior |
|---|---|
| Shape | Symmetric bell (heavier tails than normal) |
| Defined by | location μ, scale s |
| Mean | μ |
| Variance | π² s² / 3 ≈ 3.2899 s² |
| Standard deviation | π s / √3 ≈ 1.8138 s |
| CDF (most important part) | 1 / (1 + exp(-(x − μ)/s)) |
| NumPy / SciPy | np.random.logistic(loc, scale, size) or scipy.stats.logistic.rvs(loc=μ, scale=s, size=…) |
| Heaviest tails among common | Normal < Logistic < Cauchy |
| Most famous appearance | Sigmoid / logistic function |
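The mean and variance rows of the table can be checked directly against SciPy's closed-form moments (a quick verification with illustrative parameter values):

```python
import numpy as np
from scipy.stats import logistic

mu, s = 5.0, 2.0
mean, var = logistic.stats(loc=mu, scale=s, moments="mv")

print(float(mean))  # 5.0  (= μ)
print(float(var))   # ≈ 13.16  (= π² s² / 3 ≈ 3.2899 s²)
print(np.isclose(var, np.pi**2 * s**2 / 3))  # True
```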
Final teacher messages
- Whenever you see an S-shaped curve modeling a transition from 0 → 1 (or low → high), think logistic CDF.
- Logistic distribution ≈ normal but with heavier tails — useful when extreme values are more likely than normal would predict.
- In machine learning, you meet the logistic CDF (sigmoid) constantly — even if you rarely generate random logistic numbers directly.
Would you like to continue with any of these topics?
- Logistic vs normal vs Cauchy tails in depth
- How logistic regression uses the logistic CDF
- Log-logistic distribution (survival analysis)
- Realistic mini-project: simulate adoption curve / dose-response
- Comparing several symmetric distributions side-by-side
Just tell me what you would like to explore next! 😊
