NumPy Random
This is a hands-on walkthrough of NumPy's random module, written as if we are sitting together at a table: I explain every important concept, show realistic use cases, warn about common traps, and give you plenty of runnable code examples.
Let’s go step by step — just like a real class.
```python
import numpy as np
```
Why do we need NumPy’s random module?
numpy.random is the standard way to generate random numbers in almost all scientific, machine learning, data analysis, simulation, and testing work done in Python.
Why not use Python’s built-in random module?
- np.random is much faster for generating large arrays
- It produces ndarrays directly (ready for math, slicing, reshaping…)
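- It offers a wider range of distributions (Poisson, multinomial, Dirichlet, chi-square…), each able to fill a whole array in one call
- It supports seeding for reproducibility (Python's random can be seeded too, but it still returns one value per call)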
- It works beautifully with vectorization and broadcasting
Golden rule #1 (write this down):
Whenever you need random numbers for data science, ML, simulations, or testing → use numpy.random — not random.
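To make the speed and ndarray points concrete, here is a minimal sketch that draws one million uniform numbers both ways. The variable names and the sample size are illustrative choices, and the timings depend on your machine, so no expected numbers are shown.

```python
import random
import time

import numpy as np

n = 1_000_000  # illustrative sample size

# Built-in module: one Python-level call per number, result is a list
start = time.perf_counter()
py_values = [random.random() for _ in range(n)]
py_seconds = time.perf_counter() - start

# NumPy: a single vectorized call, result is an ndarray
start = time.perf_counter()
np_values = np.random.rand(n)
np_seconds = time.perf_counter() - start

print(f"random:       {py_seconds:.4f} s, type = {type(py_values).__name__}")
print(f"numpy.random: {np_seconds:.4f} s, type = {type(np_values).__name__}")

# The ndarray is ready for vectorized math without any conversion
print(np_values.mean(), (np_values * 10).max())
```

Run it a few times: the gap usually grows with n, because the NumPy call avoids a Python-level loop.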
Step 1 – The most important habit: Setting the random seed
```python
np.random.seed(42)   # ← this line makes all random results reproducible
```
Every time you run your code with the same seed → you get exactly the same random numbers.
This is extremely important when:
- Debugging
- Comparing experiments
- Sharing code
- Writing tests
- Teaching / tutorials
Example – without seed vs with seed
```python
# Without seed → different every time
print(np.random.rand(5))

# With seed → always the same
np.random.seed(42)
print(np.random.rand(5))
# [0.37454012 0.95071431 0.73199394 0.59865848 0.15601864]
```
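Teacher tip: Put np.random.seed(42) (or any fixed number) at the top of your notebooks/scripts when you start learning or experimenting. Later, remove it (or change it) when you want different results on every run. The numbers are still pseudo-random either way; the seed only decides where the sequence starts.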
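One thing worth knowing early: np.random.seed sets a single global state shared by all code in the process. NumPy's documentation recommends the newer Generator interface for new code, created with np.random.default_rng. The sketch below shows the same reproducibility idea with a local generator. The names rng_a and rng_b are just illustrative, and the numbers a Generator produces differ from the np.random.seed(42) stream used in this lesson.

```python
import numpy as np

# Two independent generators built from the same seed
rng_a = np.random.default_rng(42)
rng_b = np.random.default_rng(42)

first = rng_a.random(5)    # uniform in [0, 1), like np.random.rand(5)
second = rng_b.random(5)
print(first)
print(np.array_equal(first, second))   # True: same seed, same sequence

# Generator counterparts of functions covered in this lesson
rolls = rng_a.integers(1, 7, size=10)     # like randint: high is exclusive
heights = rng_a.normal(170, 10, size=5)   # like np.random.normal
print(rolls)
print(heights)
```

The rest of this lesson keeps using np.random.seed and the classic functions, which still work; the Generator is simply the option to reach for when you want each part of a program to own its random state.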
Step 2 – Most commonly used random functions
1. np.random.rand() – Uniform random numbers in [0, 1)
```python
# Single value
print(np.random.rand())          # e.g. 0.3745401188473625

# 1D array
print(np.random.rand(8))

# 2D array (very common)
print(np.random.rand(4, 5))

# 4D array (e.g. batch of images)
print(np.random.rand(32, 64, 64, 3).shape)   # (32, 64, 64, 3)
```
Realistic use case – creating synthetic features
```python
# 1000 samples, 20 features
X_synthetic = np.random.rand(1000, 20)
```
2. np.random.randn() – Standard normal (Gaussian) distribution
Mean = 0, standard deviation = 1
```python
print(np.random.randn(10))
# example: [-0.49671415 0.7680537 0.0884925 ...]

# Very common shape in ML
weights = np.random.randn(784, 10)   # weights matrix for neural network layer
```
Quick comparison – rand vs randn
```python
uniform = np.random.rand(10000)
normal = np.random.randn(10000)

# uniform → flat between 0 and 1
# normal → bell curve around 0
```
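If you want to see the "flat" versus "bell curve" difference in numbers rather than take it on faith, a quick check like the following can help. It is only a sketch: the seed, the bin count, and the variable names uniform_counts and within_one_sigma are my own choices.

```python
import numpy as np

np.random.seed(42)
uniform = np.random.rand(10000)
normal = np.random.randn(10000)

print("uniform: min=%.3f max=%.3f mean=%.3f"
      % (uniform.min(), uniform.max(), uniform.mean()))
print("normal:  mean=%.3f std=%.3f" % (normal.mean(), normal.std()))

# 10 equal-width bins over [0, 1): counts should all be near 1000 (flat)
uniform_counts, _ = np.histogram(uniform, bins=10, range=(0, 1))
print(uniform_counts)

# Fraction of normal samples within one standard deviation of 0
# (theory says about 0.68 for a bell curve)
within_one_sigma = np.mean(np.abs(normal) < 1)
print(within_one_sigma)
```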
3. np.random.randint() – Random integers
```python
# Single integer between low (inclusive) and high (exclusive)
print(np.random.randint(1, 7))   # dice roll: 1–6

# Array of dice rolls
dice = np.random.randint(1, 7, size=20)
print(dice)

# 2D – like scores or pixel values
scores = np.random.randint(40, 101, size=(50, 4))   # 50 students × 4 subjects
```
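The exclusive upper bound is a classic off-by-one trap, so it is worth checking once for yourself. A small sketch, with an arbitrary seed and sample size:

```python
import numpy as np

np.random.seed(0)
dice = np.random.randint(1, 7, size=6000)

# Which faces actually occurred, and how often?
values, counts = np.unique(dice, return_counts=True)
print(values)   # [1 2 3 4 5 6]  (7 never appears)
print(counts)   # each count should be near 1000

# To include the upper value, pass high = max + 1
percent = np.random.randint(0, 101, size=1000)   # integers 0..100 inclusive
print(percent.min(), percent.max())
```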
4. np.random.uniform() – Uniform with custom range
```python
# Uniform between -5 and 5
print(np.random.uniform(-5, 5, size=10))

# Temperature simulation (18–32 °C)
temps = np.random.uniform(18, 32, size=365)
```
5. np.random.normal() – Normal with custom mean & std
```python
# IQ scores: mean=100, std=15
iq_scores = np.random.normal(100, 15, size=1000)

# Noisy measurements
true_value = 50.0
noisy = true_value + np.random.normal(0, 0.8, size=200)
```
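It can help to see that normal(loc, scale) is just a shifted and stretched standard normal, that is, loc + scale * randn(). The sketch below resets the seed between the two draws so both use the same underlying numbers. With the classic np.random functions used in this lesson the two results should agree, and the final line checks that instead of assuming it. The seed value and variable names are arbitrary.

```python
import numpy as np

np.random.seed(7)
direct = np.random.normal(100, 15, size=5)

np.random.seed(7)   # reset so the second draw starts from the same state
manual = 100 + 15 * np.random.randn(5)

print(direct)
print(manual)
print(np.allclose(direct, manual))
```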
6. np.random.choice() – Sampling from existing array
```python
fruits = np.array(['apple', 'banana', 'orange', 'mango', 'grapes'])

# Single choice
print(np.random.choice(fruits))

# 20 choices with replacement
print(np.random.choice(fruits, size=20))

# Without replacement (unique)
print(np.random.choice(fruits, size=3, replace=False))

# With probabilities
probs = [0.5, 0.2, 0.1, 0.1, 0.1]
biased = np.random.choice(fruits, size=100, p=probs)
```
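Two details of choice are easy to get wrong: the probabilities must sum to 1, and replace=False cannot return more items than the array holds. Here is a small sketch that checks the biased sampling against its probabilities and triggers the second trap on purpose. The larger sample size and the seed are my own choices.

```python
import numpy as np

np.random.seed(1)
fruits = np.array(['apple', 'banana', 'orange', 'mango', 'grapes'])
probs = [0.5, 0.2, 0.1, 0.1, 0.1]   # must sum to 1

biased = np.random.choice(fruits, size=10000, p=probs)

# Observed frequency of each fruit (np.unique returns names sorted alphabetically)
names, counts = np.unique(biased, return_counts=True)
for name, count in zip(names, counts):
    print(f"{name:7s} {count / biased.size:.3f}")

# Asking for 6 unique items from 5 raises an error
try:
    np.random.choice(fruits, size=6, replace=False)
except ValueError as err:
    print("ValueError:", err)
```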
7. np.random.shuffle() – Shuffle in place
```python
deck = np.arange(1, 53)   # cards 1 to 52
np.random.shuffle(deck)
print(deck[:10])          # first 10 cards after shuffle
```
Note: shuffle modifies the array in place — no return value.
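Because shuffle returns nothing, assigning its result is a common bug. When you want a shuffled copy and need to keep the original order, np.random.permutation is the usual alternative. A short sketch with illustrative variable names:

```python
import numpy as np

np.random.seed(3)
deck = np.arange(1, 53)

result = np.random.shuffle(deck)   # shuffles deck itself
print(result)                      # None

# Trap: writing "deck = np.random.shuffle(deck)" would replace the array with None.

# permutation returns a shuffled copy and leaves its input unchanged
ordered = np.arange(1, 53)
shuffled_copy = np.random.permutation(ordered)
print(ordered[:5])          # [1 2 3 4 5]
print(shuffled_copy[:5])
```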
Step 3 – Realistic & common use cases (you will write these often)
Use case 1 – Creating synthetic training data
```python
np.random.seed(42)

n_samples = 1000
X = np.random.randn(n_samples, 5) * 2 + 10      # centered around 10
noise = np.random.randn(n_samples) * 0.5
y = 3 * X[:, 0] + 1.5 * X[:, 1] - 2 * X[:, 2] + noise   # linear relation + noise
```
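A nice sanity check for synthetic data is to fit it and see whether you get back the coefficients you put in. The sketch below uses np.linalg.lstsq for an ordinary least-squares fit; choosing that fitting method, and fitting without an intercept, are my assumptions for illustration rather than part of the use case itself.

```python
import numpy as np

np.random.seed(42)
n_samples = 1000
X = np.random.randn(n_samples, 5) * 2 + 10
noise = np.random.randn(n_samples) * 0.5
y = 3 * X[:, 0] + 1.5 * X[:, 1] - 2 * X[:, 2] + noise

# Least-squares fit without an intercept, matching how y was built
coef, residuals, rank, singular_values = np.linalg.lstsq(X, y, rcond=None)

# Should be close to the true coefficients [3, 1.5, -2, 0, 0]
print(np.round(coef, 2))
```

The last two coefficients come out near zero because columns 3 and 4 of X were never used to build y.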
Use case 2 – Train / validation / test split (manual)
```python
data = np.random.randn(10000, 30)
np.random.shuffle(data)   # important!

train = data[:7000]
val = data[7000:8500]
test = data[8500:]
```
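When features and labels live in separate arrays, shuffling each one on its own breaks the pairing between rows. One common way to keep them aligned, sketched here with made-up arrays X and y, is to shuffle a single index array and use it for both.

```python
import numpy as np

np.random.seed(42)
n = 10000
X = np.random.randn(n, 30)
y = np.arange(n)   # stand-in labels: y[i] is simply the original row number

idx = np.random.permutation(n)   # one shuffled index array for both X and y
train_idx, val_idx, test_idx = idx[:7000], idx[7000:8500], idx[8500:]

X_train, y_train = X[train_idx], y[train_idx]
X_val, y_val = X[val_idx], y[val_idx]
X_test, y_test = X[test_idx], y[test_idx]

print(X_train.shape, X_val.shape, X_test.shape)   # (7000, 30) (1500, 30) (1500, 30)

# Each label still points at its own row
print(np.array_equal(X_train, X[y_train]))        # True
```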
Use case 3 – Random missing values simulation
```python
measurements = np.random.uniform(20, 80, 500)
mask = np.random.random(500) < 0.1   # roughly 10% missing
measurements[mask] = np.nan
```
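Note that the mask marks each value as missing with probability 0.1, so the actual share of NaNs is close to 10% but rarely exactly 10%. The sketch below counts them and also shows the trap that follows: ordinary aggregations return nan once a single value is missing. The seed is an arbitrary choice.

```python
import numpy as np

np.random.seed(42)
measurements = np.random.uniform(20, 80, 500)
mask = np.random.random(500) < 0.1
measurements[mask] = np.nan

print(mask.sum(), "of", mask.size, "values are missing")
print(np.isnan(measurements).mean())   # fraction missing, near 0.1

print(np.mean(measurements))      # nan: one NaN is enough to poison the mean
print(np.nanmean(measurements))   # mean of the non-missing values only
```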
Use case 4 – Random image-like noise
```python
img = np.ones((200, 300, 3)) * 128   # gray image
noise = np.random.normal(0, 25, img.shape)
noisy_img = np.clip(img + noise, 0, 255).astype(np.uint8)
```
Summary – Quick Reference Table
| Function | Distribution / Behavior | Typical shape example | Common seed usage |
|---|---|---|---|
| rand() | Uniform [0, 1) | rand(1000, 20) | Yes |
| randn() | Standard normal (μ=0, σ=1) | randn(784, 10) | Yes |
| randint(low, high) | Integers [low, high) | randint(1, 7, size=100) | Yes |
| uniform(low, high) | Uniform [low, high) | uniform(18, 32, 365) | Yes |
| normal(loc, scale) | Normal μ=loc, σ=scale | normal(100, 15, 1000) | Yes |
| choice() | Sample from given array | choice(fruits, 50) | Yes |
| shuffle() | Shuffle array in place | shuffle(deck) | Yes |
Final teacher advice
Always start your notebooks/experiments with:
```python
import numpy as np
np.random.seed(42)   # or any fixed number you like
```
This small habit will save you many hours of confusion later.
Where would you like to go next?
- Random distributions in more depth (binomial, poisson, beta…)
- Randomness in machine learning (data splitting, weight init, dropout…)
- Common bugs when using random numbers
- Mini-project: simulate data + add noise + clean it
- Reproducibility across multiple runs / files
Just tell me what you want to focus on now! 😊
