Chapter 12: NumPy Array Iterating
This chapter covers NumPy array iterating the way a patient teacher sitting next to you would: explaining slowly, showing many realistic examples, pointing out traps, comparing different methods, and telling you which one you should actually use in real life.
Let’s go step by step.
    import numpy as np
1. Why is iterating in NumPy different from normal Python?
In normal Python, we almost always iterate over lists with for loops:
    lst = [10, 20, 30, 40]

    for x in lst:
        print(x * 2)
In NumPy → you should avoid simple Python for-loops whenever possible → because they are very slow compared to vectorized operations: every element has to pass through the Python interpreter instead of NumPy's compiled code.
Golden rule (write this down):
If you are using a Python for loop to do math on every element of a NumPy array → you are almost certainly doing it wrong
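To make the golden rule concrete, here is a minimal timing sketch. The array size, the repeat count, and the helper names `loop_double` and `vectorized_double` are illustrative choices, not part of the chapter, and the absolute timings will differ from machine to machine.

```python
import timeit
import numpy as np

arr = np.arange(100_000, dtype=np.float64)

def loop_double(a):
    # Element-by-element Python loop: every step goes through the interpreter.
    out = np.empty_like(a)
    for i in range(a.size):
        out[i] = a[i] * 2
    return out

def vectorized_double(a):
    # A single call into NumPy's compiled loop.
    return a * 2

# Both approaches must produce the same values.
same_result = np.array_equal(loop_double(arr), vectorized_double(arr))

loop_time = timeit.timeit(lambda: loop_double(arr), number=3)
vec_time = timeit.timeit(lambda: vectorized_double(arr), number=3)

print(f"same result: {same_result}")
print(f"loop: {loop_time:.4f}s, vectorized: {vec_time:.4f}s")
```

The two functions return identical arrays, so the only thing the loop buys you is extra running time.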
But… sometimes you do need to iterate. So NumPy gives you several ways — some good, some acceptable, some very bad.
2. The five main ways to iterate over NumPy arrays
Let’s look at them from worst to best (in terms of when you should use them).
Method 1 – Plain Python for loop (usually the worst choice)
    arr = np.array([10, 20, 30, 40, 50])

    # Very slow & not NumPy style
    for x in arr:
        print(x ** 2)
When is this acceptable? Only when:
- The array is very small (< 100 elements)
- You are doing something complicated that cannot be vectorized
- You are printing/debugging
Never do math like this on large arrays.
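One more trap with the plain loop: on a multi-dimensional array it walks the first axis only, so you get rows, not elements. A small sketch (the array `mat` here is just example data):

```python
import numpy as np

mat = np.array([[1, 2, 3],
                [4, 5, 6]])

# Iterating a 2-D array yields one row (a 1-D array) per step.
rows = [row for row in mat]
print(len(rows))   # number of rows, not number of elements
print(rows[0])

# Reaching individual elements with plain loops needs one loop per axis.
elements = [int(x) for row in mat for x in row]
print(elements)
```

This is the main reason the other iterators in this chapter exist: they reach every element without nested loops.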
Method 2 – Using .flat (iterates over all elements as 1D)
    mat = np.array([[1, 2, 3],
                    [4, 5, 6],
                    [7, 8, 9]])

    for x in mat.flat:
        print(x, end=' ')
    # 1 2 3 4 5 6 7 8 9
→ .flat gives you a 1D iterator over all elements, no matter the shape.
Use case example — very simple element inspection:
    big_array = np.random.randn(200, 200)   # example data

    for val in big_array.flat:
        if val < 0:
            print("Found negative value!")
            break   # stop at the first hit instead of printing once per negative
Still slow for large arrays — but better than nested loops.
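Besides looping, .flat can also be indexed and assigned to, which is easy to confuse with .flatten(). A short sketch of the difference, using example data:

```python
import numpy as np

mat = np.array([[1, 2, 3],
                [4, 5, 6],
                [7, 8, 9]])

# .flat can be indexed like a 1-D array (row-major order).
fifth = int(mat.flat[4])      # same element as mat[1, 1]

# Assigning through .flat writes into the original array.
mat.flat[4] = 50

# .flatten() returns an independent copy instead.
flat_copy = mat.flatten()
flat_copy[0] = -1             # does not touch mat

print(fifth, mat[1, 1], mat[0, 0])
```

So .flat is a view-like iterator over the original data, while .flatten() gives you a separate 1D array.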
Method 3 – np.nditer – the most flexible (and most misunderstood) iterator
    arr = np.array([[10, 20, 30],
                    [40, 50, 60]])

    it = np.nditer(arr)
    for x in it:
        print(x, end=' ')
    # 10 20 30 40 50 60
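A detail that often surprises people: by default nditer visits elements in memory order, not necessarily in the order you would expect from the indices. The sketch below shows this on a transposed view, and how the `order` argument changes it (the variable names are illustrative).

```python
import numpy as np

a = np.arange(6).reshape(2, 3)   # [[0 1 2], [3 4 5]]
t = a.T                          # a view with shape (3, 2): [[0 3], [1 4], [2 5]]

# The default order='K' follows the memory layout, so the transpose
# is visited in the same order as the original array.
memory_order = [int(x) for x in np.nditer(t)]

# order='C' forces row-major order over the array as it is indexed.
c_order = [int(x) for x in np.nditer(t, order='C')]

print(memory_order)
print(c_order)
```

Also note that each `x` is a zero-dimensional array rather than a plain Python number, which is why the sketch converts it with `int(x)`.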
Important modes / flags — this is where nditer becomes powerful
    # Read-only (default)
    for x in np.nditer(arr):
        print(x)

    # Read and write (very useful!)
    for x in np.nditer(arr, op_flags=['readwrite']):
        x[...] = x * 2   # modifies the original array!

    print(arr)
    # [[ 20  40  60]
    #  [ 80 100 120]]
Common realistic use case – modify array in place with condition
    data = np.random.randint(-50, 50, (5, 6))

    for x in np.nditer(data, op_flags=['readwrite']):
        if x < 0:
            x[...] = 0

    print(data)   # all negative values replaced with 0
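When writing through nditer, the NumPy documentation recommends using the iterator as a context manager, so that any pending write-back is finished when the block exits. A minimal sketch with fixed example data, so the result is easy to check:

```python
import numpy as np

data = np.array([[-5, 3, -1],
                 [7, -2, 4]])

with np.nditer(data, op_flags=['readwrite']) as it:
    for x in it:
        if x < 0:
            # x[...] = 0 writes into the array;
            # a plain "x = 0" would only rebind the loop variable.
            x[...] = 0

print(data)
```

Forgetting the `[...]` is probably the most common bug here: the loop runs without errors, but the array stays unchanged.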
Flags cheat sheet (most useful ones)
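| Flag | Passed via | Meaning |
|---|---|---|
| readonly | op_flags | default: the operand cannot be modified |
| readwrite | op_flags | allow reading and writing |
| writeonly | op_flags | only writing (rare) |
| buffered | flags | enable buffering; can improve performance and is needed for some dtype conversions |
| common_dtype | flags | convert all operands to a common dtype |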
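nditer takes two different arguments here: `op_flags` applies per operand (readonly, readwrite, writeonly), while `flags` controls the iterator as a whole. The sketch below uses two other global flags from the NumPy documentation, `multi_index` and `external_loop`, purely as an illustration.

```python
import numpy as np

arr = np.array([[10, 20, 30],
                [40, 50, 60]])

# 'multi_index' makes the iterator track the N-d index of each element.
positions = []
it = np.nditer(arr, flags=['multi_index'])
for x in it:
    positions.append((it.multi_index, int(x)))

# 'external_loop' hands out 1-D chunks instead of single elements,
# so the inner work can stay vectorized.
chunks = [chunk.copy() for chunk in np.nditer(arr, flags=['external_loop'])]

print(positions[0], positions[-1])
print(chunks)
```

For a contiguous array like this one, `external_loop` delivers all six values in a single chunk.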
Method 4 – np.ndenumerate – when you need both index and value
Very useful when you need to know where you are in the array.
    mat = np.array([[10, 20, 30],
                    [40, 50, 60],
                    [70, 80, 90]])

    for index, value in np.ndenumerate(mat):
        print(f"Position {index} → value {value}")
Output:
    Position (0, 0) → value 10
    Position (0, 1) → value 20
    ...
    Position (2, 2) → value 90
Realistic use case — replace values based on position
    grid = np.zeros((5, 5))

    for idx, _ in np.ndenumerate(grid):
        row, col = idx
        if row == col:
            grid[idx] = 1   # put 1 on diagonal

    print(grid)
    # [[1. 0. 0. 0. 0.]
    #  [0. 1. 0. 0. 0.]
    #  [0. 0. 1. 0. 0.]
    #  [0. 0. 0. 1. 0.]
    #  [0. 0. 0. 0. 1.]]
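Position-based rules like this one can often be vectorized too. As a sketch (not the only way to do it), `np.indices` builds arrays of row and column numbers, and a boolean mask then replaces the loop. The names `grid_loop` and `grid_vec` are illustrative.

```python
import numpy as np

# Loop version, as in the example above.
grid_loop = np.zeros((5, 5))
for (row, col), _ in np.ndenumerate(grid_loop):
    if row == col:
        grid_loop[row, col] = 1

# Vectorized version: index arrays plus a boolean mask.
rows, cols = np.indices((5, 5))
grid_vec = np.zeros((5, 5))
grid_vec[rows == cols] = 1

print(np.array_equal(grid_loop, grid_vec))
```

np.ndenumerate remains a good fit when the per-position logic is too irregular to express as a mask.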
Method 5 – Vectorized operations – the NumPy way (best in 95% of cases)
    arr = np.arange(1, 11)

    # DON'T do this:
    # for i in range(len(arr)):
    #     arr[i] = arr[i] ** 2

    # DO this:
    arr = arr ** 2

    print(arr)
    # [  1   4   9  16  25  36  49  64  81 100]
Even with conditions:
    values = np.random.randn(1000)

    # Slow and bad:
    # for i in range(len(values)):
    #     if values[i] < 0:
    #         values[i] = 0

    # Fast and correct:
    values[values < 0] = 0
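Boolean indexing changes the array in place. If you would rather keep the original and get a new array, `np.where` and `np.clip` express the same rule. A small sketch, with a seeded generator used only so the run is repeatable:

```python
import numpy as np

rng = np.random.default_rng(0)   # seed chosen arbitrarily for repeatability
values = rng.standard_normal(1000)

# In place, on a copy, as in the example above.
masked = values.copy()
masked[masked < 0] = 0

# New arrays; 'values' itself is left untouched.
where_result = np.where(values < 0, 0, values)
clipped = np.clip(values, 0, None)   # lower bound 0, no upper bound

print(np.array_equal(masked, where_result), np.array_equal(masked, clipped))
```

All three give the same numbers; which one reads best depends on whether you want to keep the original data.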
With multiple arrays:
    a = np.array([1, 2, 3, 4])
    b = np.array([10, 20, 30, 40])

    # DON'T:
    # result = []
    # for x, y in zip(a, b):
    #     result.append(x + y * 2)

    # DO:
    result = a + b * 2
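If you really do need an element-by-element loop over several arrays, np.nditer accepts a list of arrays and broadcasts them against each other, which zip does not do. The sketch below uses example shapes (3,) and (2, 3) and checks the loop against the vectorized expression.

```python
import numpy as np

a = np.array([1, 2, 3])            # shape (3,)
b = np.array([[10, 20, 30],
              [40, 50, 60]])       # shape (2, 3)

# nditer broadcasts a against b, so a is reused for every row of b.
pairs = [(int(x), int(y)) for x, y in np.nditer([a, b])]

# The same computation, vectorized and looped.
vectorized = a + b * 2
looped = np.array([x + y * 2 for x, y in pairs]).reshape(b.shape)

print(pairs)
print(np.array_equal(vectorized, looped))
```

The vectorized line is still the one to prefer; the nditer loop is only there for logic that cannot be written as array operations.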
Summary – Which iteration method should you use?
| Situation | Recommended method |
|---|---|
| Doing math / logical operations on elements | Vectorized operations (best) |
| Need index and value | np.ndenumerate(…) |
| Need to modify array in place with complex logic | np.nditer(…, op_flags=['readwrite']) |
| Just want to look at / count / print all values | .flat or nditer (read-only) |
| Array is tiny (< 100 elements) & logic is weird | plain Python for is acceptable |
| You are doing serious data processing | avoid loops — use vectorization, ufuncs, masking |
Quick Decision Flowchart
- Can I write this with vectorized operations (+, *, **, np.where, boolean indexing, etc.)? → Yes → do it (fastest, cleanest)
- Do I need both value and position (index)? → Yes → np.ndenumerate
- Do I need to modify the array in place with somewhat complex logic? → Yes → np.nditer with readwrite
- Do I just want to inspect/read every element? → Yes → .flat or simple nditer
- Everything else / very small array? → plain Python for loop
Would you like to go deeper into any of these?
- More realistic examples with np.nditer flags
- How to iterate over multiple arrays at once (broadcast_arrays)
- Speed comparison (vectorized vs loops)
- Common bugs when using nditer for writing
- Mini-exercise: clean/modify an array using different methods
Just tell me what feels most useful right now! 😊
