{"id":2501,"date":"2026-02-02T09:23:12","date_gmt":"2026-02-02T09:23:12","guid":{"rendered":"https:\/\/demo.materiamedica.net\/demo6\/?p=2501"},"modified":"2026-02-02T09:23:51","modified_gmt":"2026-02-02T09:23:51","slug":"chapter-2-random-permutations","status":"publish","type":"post","link":"https:\/\/demo.materiamedica.net\/demo6\/chapter-2-random-permutations\/","title":{"rendered":"Chapter 3: Random Permutations"},"content":{"rendered":"<h3 dir=\"auto\">What is a permutation? (quick honest explanation)<\/h3>\n<p dir=\"auto\">A <strong>permutation<\/strong> is simply a <strong>rearrangement<\/strong> of the elements of a sequence.<\/p>\n<p dir=\"auto\">Examples:<\/p>\n<ul dir=\"auto\">\n<li>Original: [A, B, C, D]<\/li>\n<li>One permutation: [B, D, A, C]<\/li>\n<li>Another: [D, A, C, B]<\/li>\n<li>etc.<\/li>\n<\/ul>\n<p dir=\"auto\">There are <strong>n!<\/strong> possible permutations of n distinct items.<\/p>\n<p dir=\"auto\">NumPy gives us two main ways to create random permutations:<\/p>\n<div>\n<div dir=\"auto\">\n<table dir=\"auto\">\n<thead>\n<tr>\n<th data-col-size=\"lg\">Method<\/th>\n<th data-col-size=\"xl\">What it does<\/th>\n<th data-col-size=\"xs\">Modifies original?<\/th>\n<th data-col-size=\"md\">Returns<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td data-col-size=\"lg\">np.random.permutation()<\/td>\n<td data-col-size=\"xl\">Returns a <strong>new<\/strong> randomly shuffled copy<\/td>\n<td data-col-size=\"xs\"><strong>No<\/strong><\/td>\n<td data-col-size=\"md\">new array<\/td>\n<\/tr>\n<tr>\n<td data-col-size=\"lg\">np.random.shuffle()<\/td>\n<td data-col-size=\"xl\"><strong>Shuffles in place<\/strong> (modifies original)<\/td>\n<td data-col-size=\"xs\"><strong>Yes<\/strong><\/td>\n<td data-col-size=\"md\">None<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<div><\/div>\n<\/div>\n<\/div>\n<h3 dir=\"auto\">1. 
np.random.permutation() \u2014 most commonly used<\/h3>\n<p dir=\"auto\">Creates a <strong>new shuffled copy<\/strong> \u2014 original stays unchanged.<\/p>\n<div dir=\"auto\">\n<div data-testid=\"code-block\">\n<div>\n<div>Python<\/div>\n<div>\n<pre tabindex=\"0\"><code>import numpy as np\r\n\r\n# Simple 1D example\r\nnumbers = np.arange(10)          # [0 1 2 3 4 5 6 7 8 9]\r\n\r\nshuffled = np.random.permutation(numbers)\r\nprint(\"Original:\", numbers)\r\nprint(\"Shuffled :\", shuffled)<\/code><\/pre>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<p dir=\"auto\"><strong>Important observation:<\/strong> every run produces a different order unless you seed the generator first.<\/p>\n<div dir=\"auto\">\n<div data-testid=\"code-block\">\n<div>\n<div>Python<\/div>\n<div>\n<pre tabindex=\"0\"><code>np.random.seed(42)\r\nprint(np.random.permutation(numbers))\r\n# Prints the same order on every run with this seed<\/code><\/pre>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<p dir=\"auto\"><strong>Very common pattern \u2014 permuting indices<\/strong><\/p>\n<div dir=\"auto\">\n<div data-testid=\"code-block\">\n<div>\n<div>Python<\/div>\n<div>\n<pre tabindex=\"0\"><code># 1000 samples \u2014 we want to shuffle the order randomly\r\nindices = np.arange(1000)\r\n\r\nrandom_order = np.random.permutation(indices)\r\n\r\n# Now we can use this to shuffle data\r\nX = np.random.randn(1000, 20)          # features\r\ny = np.random.randint(0, 2, 1000)      # labels\r\n\r\nX_shuffled = X[random_order]\r\ny_shuffled = y[random_order]<\/code><\/pre>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<p dir=\"auto\"><strong>This is essentially what scikit-learn\u2019s train_test_split does behind the scenes when shuffle=True (the default).<\/strong><\/p>\n<h3 dir=\"auto\">2. 
np.random.shuffle() \u2014 shuffles in place<\/h3>\n<p dir=\"auto\"><strong>Modifies the array directly<\/strong> \u2014 does <strong>not<\/strong> return anything.<\/p>\n<div dir=\"auto\">\n<div data-testid=\"code-block\">\n<div>\n<div>Python<\/div>\n<div>\n<pre tabindex=\"0\"><code>deck = np.arange(1, 53)   # cards 1 to 52\r\nprint(\"Before:\", deck[:10])\r\n\r\nnp.random.shuffle(deck)\r\n\r\nprint(\"After :\", deck[:10])<\/code><\/pre>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<p dir=\"auto\"><strong>Very common mistake students make:<\/strong><\/p>\n<div dir=\"auto\">\n<div data-testid=\"code-block\">\n<div>\n<div>Python<\/div>\n<div>\n<pre tabindex=\"0\"><code># WRONG \u2014 shuffle returns None!\r\nwrong = np.random.shuffle(deck)      # wrong = None\r\nprint(wrong)                         # None<\/code><\/pre>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<p dir=\"auto\"><strong>Correct usage:<\/strong><\/p>\n<div dir=\"auto\">\n<div data-testid=\"code-block\">\n<div>\n<div>Python<\/div>\n<div>\n<pre tabindex=\"0\"><code>np.random.shuffle(deck)             # modifies deck directly<\/code><\/pre>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<h3 dir=\"auto\">3. 
Quick comparison table (very useful to remember)<\/h3>\n<div>\n<div dir=\"auto\">\n<table dir=\"auto\">\n<thead>\n<tr>\n<th data-col-size=\"lg\">Property<\/th>\n<th data-col-size=\"lg\">np.random.permutation()<\/th>\n<th data-col-size=\"md\">np.random.shuffle()<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td data-col-size=\"lg\">Returns<\/td>\n<td data-col-size=\"lg\">new shuffled array<\/td>\n<td data-col-size=\"md\">None<\/td>\n<\/tr>\n<tr>\n<td data-col-size=\"lg\">Modifies original?<\/td>\n<td data-col-size=\"lg\">No<\/td>\n<td data-col-size=\"md\"><strong>Yes<\/strong><\/td>\n<\/tr>\n<tr>\n<td data-col-size=\"lg\">Can take integer N?<\/td>\n<td data-col-size=\"lg\"><strong>Yes<\/strong> \u2014 creates 0..N-1 shuffled<\/td>\n<td data-col-size=\"md\"><strong>No<\/strong><\/td>\n<\/tr>\n<tr>\n<td data-col-size=\"lg\">Memory usage<\/td>\n<td data-col-size=\"lg\">creates copy<\/td>\n<td data-col-size=\"md\">no extra memory<\/td>\n<\/tr>\n<tr>\n<td data-col-size=\"lg\">Most common use case<\/td>\n<td data-col-size=\"lg\">creating shuffled indices, new copy<\/td>\n<td data-col-size=\"md\">shuffling existing dataset in place<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<div><\/div>\n<\/div>\n<\/div>\n<h3 dir=\"auto\">4. Special &amp; very useful feature of permutation()<\/h3>\n<p dir=\"auto\">You can pass an <strong>integer<\/strong> instead of an array!<\/p>\n<div dir=\"auto\">\n<div data-testid=\"code-block\">\n<div>\n<div>Python<\/div>\n<div>\n<pre tabindex=\"0\"><code># Create a random permutation of 0, 1, 2, ..., 999\r\nidx = np.random.permutation(1000)\r\n\r\n# Same as:\r\nidx = np.arange(1000)\r\nnp.random.shuffle(idx)     # but this modifies in place<\/code><\/pre>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<p dir=\"auto\"><strong>This is extremely common<\/strong> when you want to shuffle indices without creating the full arange first.<\/p>\n<h3 dir=\"auto\">5. 
Realistic patterns you will use again and again<\/h3>\n<p dir=\"auto\"><strong>Pattern 1 \u2013 Shuffle dataset before training<\/strong><\/p>\n<div dir=\"auto\">\n<div data-testid=\"code-block\">\n<div>\n<div>Python<\/div>\n<div>\n<pre tabindex=\"0\"><code># Full dataset\r\nX = np.random.randn(15000, 35)\r\ny = np.random.randint(0, 3, 15000)\r\n\r\n# Shuffle\r\nperm = np.random.permutation(len(X))\r\n\r\nX = X[perm]\r\ny = y[perm]<\/code><\/pre>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<p dir=\"auto\"><strong>Pattern 2 \u2013 Create k-fold cross-validation indices<\/strong><\/p>\n<div dir=\"auto\">\n<div data-testid=\"code-block\">\n<div>\n<div>Python<\/div>\n<div>\n<pre tabindex=\"0\"><code>n = 12000\r\nindices = np.random.permutation(n)\r\n\r\nfold_size = n \/\/ 5\r\n\r\nfor i in range(5):\r\n    val_start = i * fold_size\r\n    val_end   = (i+1) * fold_size\r\n    \r\n    val_idx   = indices[val_start:val_end]\r\n    train_idx = np.concatenate([indices[:val_start], indices[val_end:]])\r\n    \r\n    print(f\"Fold {i+1}: train={len(train_idx)}, val={len(val_idx)}\")<\/code><\/pre>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<p dir=\"auto\"><strong>Pattern 3 \u2013 Random sampling without replacement<\/strong><\/p>\n<div dir=\"auto\">\n<div data-testid=\"code-block\">\n<div>\n<div>Python<\/div>\n<div>\n<pre tabindex=\"0\"><code>all_customers = np.arange(50000)\r\nselected = np.random.permutation(all_customers)[:500]   # first 500 after shuffle<\/code><\/pre>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<p dir=\"auto\"><strong>Pattern 4 \u2013 Shuffle rows of a matrix<\/strong><\/p>\n<div dir=\"auto\">\n<div data-testid=\"code-block\">\n<div>\n<div>Python<\/div>\n<div>\n<pre tabindex=\"0\"><code>matrix = np.random.randint(0, 100, size=(1000, 6))\r\nnp.random.shuffle(matrix)          # rows are shuffled in place\r\n# or\r\nmatrix = matrix[np.random.permutation(len(matrix))]<\/code><\/pre>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<h3 dir=\"auto\">Summary \u2013 Quick Decision 
Guide<\/h3>\n<div>\n<div dir=\"auto\">\n<table dir=\"auto\">\n<thead>\n<tr>\n<th data-col-size=\"lg\">You want to&#8230;<\/th>\n<th data-col-size=\"md\">Best choice<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td data-col-size=\"lg\">Get a new shuffled version (keep original)<\/td>\n<td data-col-size=\"md\">np.random.permutation(arr)<\/td>\n<\/tr>\n<tr>\n<td data-col-size=\"lg\">Shuffle existing array in place (save memory)<\/td>\n<td data-col-size=\"md\">np.random.shuffle(arr)<\/td>\n<\/tr>\n<tr>\n<td data-col-size=\"lg\">Create random order of 0..n-1<\/td>\n<td data-col-size=\"md\">np.random.permutation(n)<\/td>\n<\/tr>\n<tr>\n<td data-col-size=\"lg\">Shuffle rows of a 2D array<\/td>\n<td data-col-size=\"md\">either \u2014 shuffle() or permutation + indexing<\/td>\n<\/tr>\n<tr>\n<td data-col-size=\"lg\">Need shuffled indices for splitting<\/td>\n<td data-col-size=\"md\">np.random.permutation(len(data))<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<div><\/div>\n<\/div>\n<\/div>\n<h3 dir=\"auto\">Final teacher advice<\/h3>\n<p dir=\"auto\"><strong>Always<\/strong> think about whether you need the original order preserved:<\/p>\n<ul dir=\"auto\">\n<li>Need original later \u2192 use permutation()<\/li>\n<li>Don\u2019t need original + want to save memory \u2192 use shuffle()<\/li>\n<li>Working with indices \u2192 permutation(n) is usually cleanest<\/li>\n<\/ul>\n<p dir=\"auto\"><strong>Always<\/strong> set a seed when you want reproducible shuffling:<\/p>\n<div dir=\"auto\">\n<div data-testid=\"code-block\">\n<div>\n<div>Python<\/div>\n<div>\n<pre tabindex=\"0\"><code>np.random.seed(42)\r\nperm = np.random.permutation(1000)<\/code><\/pre>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>What is a permutation? (quick honest explanation) A permutation is simply a rearrangement of the elements of a sequence. Examples: Original: [A, B, C, D] One permutation: [B, D, A, C] Another: [D, A,&#46;&#46;&#46;<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[75],"tags":[],"class_list":["post-2501","post","type-post","status-publish","format-standard","hentry","category-numpy"],"_links":{"self":[{"href":"https:\/\/demo.materiamedica.net\/demo6\/wp-json\/wp\/v2\/posts\/2501","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/demo.materiamedica.net\/demo6\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/demo.materiamedica.net\/demo6\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/demo.materiamedica.net\/demo6\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/demo.materiamedica.net\/demo6\/wp-json\/wp\/v2\/comments?post=2501"}],"version-history":[{"count":2,"href":"https:\/\/demo.materiamedica.net\/demo6\/wp-json\/wp\/v2\/posts\/2501\/revisions"}],"predecessor-version":[{"id":2503,"href":"https:\/\/demo.materiamedica.net\/demo6\/wp-json\/wp\/v2\/posts\/2501\/revisions\/2503"}],"wp:attachment":[{"href":"https:\/\/demo.materiamedica.net\/demo6\/wp-json\/wp\/v2\/media?parent=2501"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/demo.materiamedica.net\/demo6\/wp-json\/wp\/v2\/categories?post=2501"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/demo.materiamedica.net\/demo6\/wp-json\/wp\/v2\/tags?post=2501"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated
":true}]}}