{"id":2520,"date":"2026-02-02T09:50:23","date_gmt":"2026-02-02T09:50:23","guid":{"rendered":"https:\/\/demo.materiamedica.net\/demo6\/?p=2520"},"modified":"2026-02-02T09:50:23","modified_gmt":"2026-02-02T09:50:23","slug":"chapter-12-chi-square-distribution","status":"publish","type":"post","link":"https:\/\/demo.materiamedica.net\/demo6\/chapter-12-chi-square-distribution\/","title":{"rendered":"Chapter 12: Chi Square Distribution"},"content":{"rendered":"<h3 dir=\"auto\">1. What is the Chi-Square distribution really?<\/h3>\n<p dir=\"auto\">The <strong>Chi-Square distribution<\/strong> (written \u03c7\u00b2) is the distribution of the <strong>sum of squares of k independent standard normal random variables<\/strong>.<\/p>\n<p dir=\"auto\">In simple words:<\/p>\n<blockquote dir=\"auto\">\n<p dir=\"auto\">If you take k independent random numbers from a standard normal distribution (mean=0, sd=1), square each of them, and add them all together \u2192 the result follows a <strong>Chi-Square distribution with k degrees of freedom<\/strong>.<\/p>\n<\/blockquote>\n<p dir=\"auto\"><strong>Mathematical definition<\/strong>:<\/p>\n<p dir=\"auto\">Let Z\u2081, Z\u2082, &#8230;, Z\u2096 be independent N(0,1) random variables. Then: <strong>Q = Z\u2081\u00b2 + Z\u2082\u00b2 + &#8230; + Z\u2096\u00b2 ~ \u03c7\u00b2(k)<\/strong><\/p>\n<p dir=\"auto\"><strong>Key properties<\/strong> (write these down):<\/p>\n<ul dir=\"auto\">\n<li>Only defined for <strong>x \u2265 0<\/strong> (because squares are non-negative)<\/li>\n<li><strong>Degrees of freedom (df or k)<\/strong> is the only parameter<\/li>\n<li>Mean = <strong>k<\/strong><\/li>\n<li>Variance = <strong>2k<\/strong><\/li>\n<li>Shape: <strong>always right-skewed<\/strong> (skewness = \u221a(8\/k)), but becomes more symmetric as k increases<\/li>\n<li>When k \u2265 30 \u2192 looks quite similar to a normal distribution with mean k and variance 2k (Central Limit Theorem, because Q is a sum of k independent terms)<\/li>\n<\/ul>\n<h3 dir=\"auto\">2. 
Visual intuition \u2013 how the shape changes with degrees of freedom<\/h3>\n<div dir=\"auto\">\n<div data-testid=\"code-block\">\n<div>\n<div>Python<\/div>\n<div>\n<pre tabindex=\"0\"><code>import numpy as np\r\nimport matplotlib.pyplot as plt\r\nfrom scipy import stats\r\n\r\n# One shared axes so all densities can be compared directly\r\nfig, ax = plt.subplots(figsize=(10, 6))\r\n\r\ndfs = [1, 2, 5, 10, 20, 50]\r\n\r\n# Start just above 0: the df=1 density is unbounded at x = 0\r\nx = np.linspace(0.01, 80, 1000)\r\n\r\nfor df in dfs:\r\n    y = stats.chi2.pdf(x, df=df)\r\n    ax.plot(x, y, lw=2.2, label=f'df = {df}', alpha=0.9)\r\n\r\nax.set_title(\"Chi-Square density for different degrees of freedom\", fontsize=14, pad=15)\r\nax.set_xlabel(\"Value (\u03c7\u00b2)\", fontsize=12)\r\nax.set_ylabel(\"Density\", fontsize=12)\r\nax.set_xlim(0, 80)\r\nax.set_ylim(0, 0.5)   # clip the df=1 spike so the other curves stay visible\r\nax.legend(title=\"Degrees of freedom (k)\", fontsize=11, title_fontsize=12)\r\nplt.show()<\/code><\/pre>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<p dir=\"auto\"><strong>What you should observe<\/strong>:<\/p>\n<ul dir=\"auto\">\n<li>df = 1 \u2192 very strong right skew; the density grows without bound as x approaches 0<\/li>\n<li>df = 2 \u2192 still skewed; an exponential decay that starts at 0.5 at x = 0<\/li>\n<li>df = 5 \u2192 peak moves right (mode at k \u2212 2 = 3), skew decreases<\/li>\n<li>df = 10 \u2192 starting to look bell-like<\/li>\n<li>df = 20+ \u2192 much closer to symmetric, looks similar to normal<\/li>\n<\/ul>\n<h3 dir=\"auto\">3. Generating Chi-Square random numbers in NumPy \/ SciPy<\/h3>\n<div dir=\"auto\">\n<div data-testid=\"code-block\">\n<div>\n<div>Python<\/div>\n<div>\n<pre tabindex=\"0\"><code>from scipy import stats\r\n\r\n# Chi-Square with 5 degrees of freedom\r\n# (NumPy equivalent: np.random.default_rng().chisquare(df=5, size=50000))\r\nchi5 = stats.chi2.rvs(df=5, size=50000)\r\n\r\n# Several df values at once: df broadcasts across the last axis,\r\n# so column j holds 40000 draws with df = [3, 8, 15, 30][j]\r\nchi_data = stats.chi2.rvs(df=[3, 8, 15, 30], size=(40000, 4))\r\n\r\nprint(\"Mean of df=5 sample:\", chi5.mean().round(2))     # should be \u2248 5\r\nprint(\"Variance of df=5 sample:\", chi5.var().round(2))  # should be \u2248 10<\/code><\/pre>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<h3 dir=\"auto\">4. Where does Chi-Square appear in real life? 
(very important)<\/h3>\n<p dir=\"auto\"><strong>Most common situations you will actually meet<\/strong><\/p>\n<ol dir=\"auto\">\n<li><strong>Variance testing<\/strong>: for a sample of n normal observations with sample variance S\u00b2, (n-1)S\u00b2\/\u03c3\u00b2 ~ \u03c7\u00b2(n-1)<\/li>\n<li><strong>Goodness-of-fit test<\/strong>: comparing observed vs expected frequencies (classic \u03c7\u00b2 test; the statistic is approximately \u03c7\u00b2 when expected counts are large)<\/li>\n<li><strong>Independence test<\/strong>: contingency tables (\u03c7\u00b2 test of independence)<\/li>\n<li><strong>Confidence interval for variance<\/strong>: used in quality control, process capability<\/li>\n<li><strong>F-distribution<\/strong> (very important connection): F = (\u03c7\u2081\u00b2 \/ df\u2081) \/ (\u03c7\u2082\u00b2 \/ df\u2082) for independent \u03c7\u2081\u00b2 and \u03c7\u2082\u00b2 \u2192 used in ANOVA, regression<\/li>\n<li><strong>Multiple linear regression<\/strong>: with normal errors and p fitted coefficients, residual sum of squares \/ \u03c3\u00b2 ~ \u03c7\u00b2(n-p)<\/li>\n<\/ol>\n<h3 dir=\"auto\">5. Realistic examples &amp; code you will actually write<\/h3>\n<p dir=\"auto\"><strong>Example 1 \u2013 Testing variance of measurements<\/strong><\/p>\n<div dir=\"auto\">\n<div data-testid=\"code-block\">\n<div>\n<div>Python<\/div>\n<div>\n<pre tabindex=\"0\"><code>import numpy as np\r\nimport matplotlib.pyplot as plt\r\nimport seaborn as sns\r\nfrom scipy import stats\r\n\r\n# Suppose true process variance \u03c3\u00b2 = 4 (sd = 2)\r\n# We draw a sample of n=25 observations and compute the sample variance S\u00b2\r\nn = 25\r\ndf = n - 1\r\n\r\n# Simulate many such experiments\r\nsample_variances_scaled = np.zeros(20000)\r\n\r\nfor i in range(20000):\r\n    sample = np.random.normal(0, 2, n)          # sd = 2\r\n    S2 = np.var(sample, ddof=1)                 # sample variance\r\n    sample_variances_scaled[i] = (n-1) * S2 \/ 4   # scaled \u2192 should be \u03c7\u00b2(24)\r\n\r\nsns.histplot(sample_variances_scaled, bins=80, stat=\"density\", kde=True,\r\n             color=\"teal\", alpha=0.7)\r\n\r\nx = np.linspace(0, 80, 1000)\r\nplt.plot(x, stats.chi2.pdf(x, df=df), color=\"darkred\", lw=2.8, label=\"Theoretical \u03c7\u00b2(24)\")\r\nplt.title(\"Scaled sample variance ~ \u03c7\u00b2(n-1)\", 
fontsize=14)\r\nplt.legend()\r\nplt.show()<\/code><\/pre>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<p dir=\"auto\"><strong>Example 2 \u2013 Chi-Square goodness-of-fit (classic dice test)<\/strong><\/p>\n<div dir=\"auto\">\n<div data-testid=\"code-block\">\n<div>\n<div>Python<\/div>\n<div>\n<pre tabindex=\"0\"><code>import numpy as np\r\nfrom scipy import stats\r\n\r\n# Simulate rolling a fair die 6000 times (counts per face)\r\nobserved = np.random.multinomial(6000, [1\/6]*6)\r\n\r\n# Expected count per face under the null hypothesis of a fair die\r\nexpected = 6000 \/ 6\r\n\r\nchi2_stat = np.sum((observed - expected)**2 \/ expected)\r\ndf = 6 - 1   # categories - 1 (no parameters estimated from the data)\r\n\r\nprint(\"Chi-square statistic:\", chi2_stat.round(3))\r\nprint(\"p-value:\", stats.chi2.sf(chi2_stat, df).round(5))\r\n\r\n# Cross-check: stats.chisquare(observed) returns the same statistic and p-value<\/code><\/pre>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<h3 dir=\"auto\">6. Summary \u2013 Chi-Square Distribution Quick Reference<\/h3>\n<div>\n<div dir=\"auto\">\n<table dir=\"auto\">\n<thead>\n<tr>\n<th data-col-size=\"md\">Property<\/th>\n<th data-col-size=\"lg\">Value \/ Formula<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td data-col-size=\"md\">Shape<\/td>\n<td data-col-size=\"lg\">Right-skewed (skew decreases as df increases)<\/td>\n<\/tr>\n<tr>\n<td data-col-size=\"md\">Defined by<\/td>\n<td data-col-size=\"lg\">degrees of freedom <strong>k<\/strong> (or <strong>df<\/strong>)<\/td>\n<\/tr>\n<tr>\n<td data-col-size=\"md\">Mean<\/td>\n<td data-col-size=\"lg\"><strong>k<\/strong><\/td>\n<\/tr>\n<tr>\n<td data-col-size=\"md\">Variance<\/td>\n<td data-col-size=\"lg\"><strong>2k<\/strong><\/td>\n<\/tr>\n<tr>\n<td data-col-size=\"md\">Standard deviation<\/td>\n<td data-col-size=\"lg\"><strong>\u221a(2k)<\/strong><\/td>\n<\/tr>\n<tr>\n<td data-col-size=\"md\">Support<\/td>\n<td data-col-size=\"lg\">x \u2265 0<\/td>\n<\/tr>\n<tr>\n<td data-col-size=\"md\">PDF<\/td>\n<td data-col-size=\"lg\">f(x) = x^(k\/2 \u2212 1) e^(\u2212x\/2) \/ (2^(k\/2) \u0393(k\/2)) for x &gt; 0, where \u0393 is the gamma function<\/td>\n<\/tr>\n<tr>\n<td data-col-size=\"md\">Most common use cases<\/td>\n<td data-col-size=\"lg\">variance testing, goodness-of-fit, independence tests, F-distribution, regression 
diagnostics<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<div><\/div>\n<\/div>\n<\/div>\n<h3 dir=\"auto\">Final teacher messages<\/h3>\n<ol dir=\"auto\">\n<li><strong>Whenever you see \u201csum of squares of normal variables\u201d or \u201cscaled variance\u201d<\/strong> \u2192 think Chi-Square.<\/li>\n<li><strong>Chi-Square is the building block<\/strong> for many other important distributions and procedures:\n<ul dir=\"auto\">\n<li>F-distribution<\/li>\n<li>Chi-Square goodness-of-fit \/ independence tests<\/li>\n<li>Confidence intervals for variance<\/li>\n<\/ul>\n<\/li>\n<li><strong>As df increases<\/strong> \u2192 Chi-Square becomes more symmetric \u2192 the normal approximation with mean k and variance 2k becomes good<\/li>\n<\/ol>\n","protected":false},"excerpt":{"rendered":"<p>1. What is the Chi-Square distribution really? The Chi-Square distribution (written \u03c7\u00b2) is the distribution of the sum of squares of k independent standard normal random variables. 
In simple words: If you take k&#46;&#46;&#46;<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[75],"tags":[],"class_list":["post-2520","post","type-post","status-publish","format-standard","hentry","category-numpy"],"_links":{"self":[{"href":"https:\/\/demo.materiamedica.net\/demo6\/wp-json\/wp\/v2\/posts\/2520","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/demo.materiamedica.net\/demo6\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/demo.materiamedica.net\/demo6\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/demo.materiamedica.net\/demo6\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/demo.materiamedica.net\/demo6\/wp-json\/wp\/v2\/comments?post=2520"}],"version-history":[{"count":1,"href":"https:\/\/demo.materiamedica.net\/demo6\/wp-json\/wp\/v2\/posts\/2520\/revisions"}],"predecessor-version":[{"id":2521,"href":"https:\/\/demo.materiamedica.net\/demo6\/wp-json\/wp\/v2\/posts\/2520\/revisions\/2521"}],"wp:attachment":[{"href":"https:\/\/demo.materiamedica.net\/demo6\/wp-json\/wp\/v2\/media?parent=2520"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/demo.materiamedica.net\/demo6\/wp-json\/wp\/v2\/categories?post=2520"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/demo.materiamedica.net\/demo6\/wp-json\/wp\/v2\/tags?post=2520"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}