Week 9 — Lecture Outline · The Normal Distribution
Course: Introduction to Statistics (MATH 11) · Silver Oak University (fictional sample) · Prof. Rivera
Objectives covered: Objective 5 — Use normal distributions to reason about variability.
SLOs touched: A (reason quantitatively from data) · B (communicate results to a non-technical audience)
Meeting pattern: 2 sessions × 75 min = 150 min. Segment minutes below total ~150; scale to your own pattern.
Week at a Glance
| The week's big question | "When a value comes from a bell-shaped world, how unusual is it — and how do we put a number on that?" |
| By the end of the week, students can… | (1) read a density curve and say why the area under it is a proportion; (2) apply the 68–95–99.7 (empirical) rule to a normal variable; (3) standardize any value into a z-score and say what it means in plain words; (4) turn a z-score into an area or percentile using a supplied table; (5) judge whether data are roughly normal and refuse the rule when they aren't. |
| Key vocabulary | density curve, area-as-proportion, normal distribution, bell curve, mean μ, standard deviation σ, the 68–95–99.7 (empirical) rule, standardizing, z-score, standard normal, cumulative area (area to the left), percentile, right-tail area, "between" area, assessing normality, skew, outlier |
| Materials | slides (Deck 9), the week's readings + video links, a spreadsheet (Google Sheets or Excel), one approved chatbot (Gemini / Claude / ChatGPT) for the AI-critique moment and the tutorial, and the small z-table embedded in this outline (we hand students the values — they never recall them) |
| Timing note | 8 segments, ~150 min total. Session 1 = Segments 1–4 (~75). Session 2 = Segments 5–8 (~75). |
The z-table we use all week (cumulative area to the LEFT of z). Every worked example, practice item, quiz item, and assignment problem is engineered to land exactly on one of these friendly values. We supply these numbers; we never ask a student — or a chatbot — to recall or estimate them.
| z | −3.0 | −2.5 | −2.0 | −1.5 | −1.0 | −0.5 | 0 | +0.5 | +1.0 | +1.5 | +2.0 | +2.5 | +3.0 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| area to LEFT | .0013 | .0062 | .0228 | .0668 | .1587 | .3085 | .5000 | .6915 | .8413 | .9332 | .9772 | .9938 | .9987 |
| area to RIGHT | .9987 | .9938 | .9772 | .9332 | .8413 | .6915 | .5000 | .3085 | .1587 | .0668 | .0228 | .0062 | .0013 |

Three moves are all we ever do: area to the left = read the table; area to the right = 1 − (area to the left); area between two z's = (left of the bigger) − (left of the smaller). Left + right always = 1.
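For instructors who want to confirm the supplied values independently, here is a minimal sketch that regenerates the table with Python's standard-library `statistics.NormalDist`. The names `Z_VALUES`, `build_table`, and `TABLE` are illustrative and not part of the course materials; students still work only from the supplied table.

```python
from statistics import NormalDist

# The friendly z values used all week.
Z_VALUES = [-3.0, -2.5, -2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0]

def build_table(z_values):
    """Return {z: (area_left, area_right)}, each rounded to four decimals."""
    standard_normal = NormalDist(mu=0, sigma=1)
    table = {}
    for z in z_values:
        left = standard_normal.cdf(z)          # cumulative area to the LEFT of z
        table[z] = (round(left, 4), round(1 - left, 4))
    return table

TABLE = build_table(Z_VALUES)

for z, (left, right) in TABLE.items():
    print(f"z = {z:+.1f}   left = {left:.4f}   right = {right:.4f}")
```

Each printed row should agree with the matching column of the table, and every left/right pair sums to 1.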
Segment 1 — Hook & the Promise (8 min) · Session 1 opens
Hook. Put one number on the board: a test score of 80. Ask: "Is 80 good?" Let them answer — they can't, not really. "It depends," someone says. On what?
- "It depends on the class. An 80 on a quiz where everyone scored in the 90s is bad news. An 80 where the average was 60 is a triumph. The raw number 80 tells you almost nothing until you know the world it came from — the center and the spread."
- "This week we build the most famous 'world' in statistics — the bell curve — and a single tool, the z-score, that turns any value into one honest sentence: how many standard deviations from average am I, and is that unusual?"
The promise (write it on the board): "By the end of this week you can take any value — a test score, a height, a price, a battery's lifespan — and say exactly how unusual it is, as a percentile, using nothing but the mean, the standard deviation, and a small table."
Why it matters line (memory hook): "A raw number is a stranger. A z-score is an introduction — it tells you where the value stands in its own crowd."
Segment 2 — Density Curves & Area-as-Proportion (18 min)
Plain language first.
- Back in Week 2 we drew histograms — bars over classes, heights = counts. A density curve is the smooth idealized version: imagine shrinking the bars thinner and thinner until the jagged top becomes a smooth curve. It's a model of the shape, not the raw data.
- The one rule that makes a density curve useful: the total area underneath it is exactly 1 (that's 100% of the data). So any area under the curve is a proportion — the share of values that fall in that region. "Area = proportion" is the whole idea; everything else this week is arithmetic on areas.
- The normal distribution is the specific bell-shaped density curve we care about: symmetric, single-peaked, tails that thin out smoothly. Two numbers pin it down completely — its mean μ (where the peak sits) and its standard deviation σ (how wide and flat vs. tall and skinny it is). Notation note (after the idea): we write it N(μ, σ), read "normal with mean μ and standard deviation σ."
Memory hook (put it on a slide):
Area under the curve = the share of data there. The curve is a model; the area is the answer.
One fully worked example (do it out loud, no table yet — just halves).
Setup: Final exam scores in a large section are roughly normal with mean μ = 70 and standard deviation σ = 10. We write this N(70, 10).
- Because the curve is symmetric about the mean, exactly half the area sits on each side of 70. So the share of students scoring above 70 is 0.5 → 50%, and below 70 is also 50%.
- That's an area answer to a proportion question, with no table needed — symmetry alone does it. "Where's the middle? The mean. How much is above it? Half." Everything harder is just this, with the table filling in the non-half pieces.
Land the key idea: the mean splits a normal curve into two equal halves; from here on, the table tells us the size of the off-center pieces.
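The "shrink the bars until the top is smooth" picture can be checked numerically. The sketch below is an optional instructor demo, assuming Python's standard-library `statistics.NormalDist`; the helper name `area_by_thin_bars` and the choice of 10,000 bars are illustrative. It adds up many thin rectangles under the N(70, 10) curve to show that the total area is 1 and that the area above the mean is one half.

```python
from statistics import NormalDist

def area_by_thin_bars(dist, low, high, bars=10_000):
    """Approximate the area under dist's density curve between low and high
    by adding up many thin rectangles (very thin histogram bars)."""
    width = (high - low) / bars
    total = 0.0
    for i in range(bars):
        midpoint = low + (i + 0.5) * width
        total += dist.pdf(midpoint) * width    # bar height times bar width
    return total

exam = NormalDist(mu=70, sigma=10)             # the N(70, 10) exam scores

total_area = area_by_thin_bars(exam, 0, 140)   # 7 SDs on each side of the mean
above_mean = area_by_thin_bars(exam, 70, 140)  # everything above 70

print(round(total_area, 4))   # 1.0
print(round(above_mean, 4))   # 0.5
```

The range 0 to 140 is a stand-in for "the whole curve"; the area beyond 7 standard deviations is far too small to affect four decimals.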
Segment 3 — The 68–95–99.7 (Empirical) Rule (25 min)
Plain language first. For any normal distribution, the proportion of data within 1, 2, and 3 standard deviations of the mean is always the same three numbers. Memorize them once and you can describe any bell curve on sight.
- About 68% of the data fall within 1 standard deviation of the mean (between μ − σ and μ + σ).
- About 95% fall within 2 standard deviations (between μ − 2σ and μ + 2σ).
- About 99.7% fall within 3 standard deviations (between μ − 3σ and μ + 3σ).
Memory hook: 68–95–99.7, the "empirical rule." "One, two, three SDs — sixty-eight, ninety-five, ninety-nine-point-seven."
The tails are where "unusual" lives (say this slowly). Because the curve is symmetric, the leftover area splits evenly into two tails:
- Outside ±1σ → 100% − 68% = 32%, so 16% in each tail.
- Outside ±2σ → 100% − 95% = 5%, so 2.5% in each tail. This is the working definition of "unusual" all week: beyond 2 SDs.
- Outside ±3σ → 100% − 99.7% = 0.3%, so 0.15% in each tail — genuinely rare.
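Students sometimes notice that the rule says 2.5% in each tail beyond 2 SDs while the z-table says .0228. Both are right: 68–95–99.7 is a rounded summary. A small sketch, assuming Python's standard-library `statistics.NormalDist` (the helper names `within_k_sd` and `each_tail` are illustrative), shows the unrounded values side by side.

```python
from statistics import NormalDist

standard_normal = NormalDist()   # mean 0, SD 1

def within_k_sd(k):
    """Share of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

def each_tail(k):
    """Share in ONE tail beyond k SDs; symmetry splits the leftover evenly."""
    return (1 - within_k_sd(k)) / 2

for k in (1, 2, 3):
    print(f"within ±{k} SD: {within_k_sd(k):.4f}   each tail: {each_tail(k):.4f}")
```

The within-band values come out as 0.6827, 0.9545, and 0.9973, and the tails as 0.1587, 0.0228, and 0.0013, which match the z-table entries at z = −1, −2, and −3. Use the rule's round numbers for band-reading and the table whenever a problem names a z-score.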
One fully worked example (use the test scores N(70, 10)):
Mark the ruler under the curve: 40 50 60 70 80 90 100 — these are μ − 3σ up to μ + 3σ in steps of one σ.
- About 68% of scores fall between 60 and 80 (within 1σ).
- About 95% fall between 50 and 90 (within 2σ).
- About 99.7% fall between 40 and 100 (within 3σ).
- So the share scoring above 80 is the upper tail beyond +1σ ≈ 16%. Above 90 (beyond +2σ) ≈ 2.5%. A 40 is 3 SDs below the mean, so only about 0.15% of students score 40 or lower. "A 40 here isn't just bad; it's a once-in-a-section event."
The "draw-the-ruler" habit to give students: every empirical-rule problem starts the same way — write μ in the middle, then step out by σ three times each direction and label the seven tick marks. The percentages are then just bands you read off.
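The ruler habit is mechanical enough to write down as a few lines of code. This is a sketch for generating slide or answer-key values, with hypothetical helper names `draw_ruler` and `empirical_bands`; it only restates the arithmetic above.

```python
def draw_ruler(mu, sigma):
    """Return the seven tick marks from mu - 3*sigma up to mu + 3*sigma."""
    return [mu + k * sigma for k in range(-3, 4)]

def empirical_bands(mu, sigma):
    """Pair each empirical-rule percentage with its (low, high) band."""
    return {
        "about 68%": (mu - sigma, mu + sigma),
        "about 95%": (mu - 2 * sigma, mu + 2 * sigma),
        "about 99.7%": (mu - 3 * sigma, mu + 3 * sigma),
    }

print(draw_ruler(70, 10))   # [40, 50, 60, 70, 80, 90, 100]
for label, (low, high) in empirical_bands(70, 10).items():
    print(f"{label} of scores fall between {low} and {high}")
```

Swapping in another mean and SD (for example the heights with μ = 64, σ = 2.5) produces a fresh ruler for a new practice item.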
Segment 4 — Misconceptions + Quick Interaction (24 min) · Session 1 closes (~75)
Name the misconceptions out loud, then cure each:
- ❌ "The empirical rule works on any data."
✅ Cure: it works only when the data are roughly normal (symmetric, bell-shaped). On skewed data (incomes, home prices, wait times) "95% within 2 SDs" can be flat wrong. First ask "is this even a bell?"; only then reach for 68–95–99.7. (We'll give this its own segment in Session 2: Segment 7.)
- ❌ "A negative z-score means a negative value / a mistake."
✅ Cure: a z-score just says which side of the mean and how far. Negative simply means below the mean. A height of 61.5 inches in a μ = 64, σ = 2.5 world has z = (61.5 − 64) ÷ 2.5 = −1, which is perfectly ordinary, just below average. The sign is a direction, not an error.
- ❌ "Bigger z-score = bigger raw value, always."
✅ Cure: only within one distribution. Across two different distributions, the bigger z is the more unusual value, not the bigger number. A 1450 SAT (z = +2) is no more impressive than a 90 on our exam (also z = +2): the two are equally unusual, because you can't rank the raw numbers, only the z's.
- ❌ "The area under the curve at one exact point is the chance of that value."
✅ Cure: for a smooth density curve, area lives in a range, not at a point. "Exactly 80.000…" has area 0; we always ask about above / below / between. (Keep this light at intro level — just enough that "P(exactly x)" doesn't trip them.)
Interaction — Think-Pair-Share (empirical-rule reading, ~12 min):
Put N(70, 10) and its ruler (40–100) on a slide. Students answer solo (30 sec), compare with a neighbor (1 min), class votes:
1. What % score between 60 and 80? (68%)
2. What % score above 90? (2.5%)
3. A score of 50 is how many SDs below the mean, and is it unusual? (2 SDs below; yes, it sits right at the edge of the bottom 2.5%)
4. Between 50 and 70? (half of 95% = 47.5%)
Debrief #4, which always splits the room: "between 50 and 70" is the lower half of the within-2SD band → 95%/2 = 47.5%, because the mean cuts the 95% band in two.
Segment 5 — Standardizing: the z-score (25 min) · Session 2 opens
Hook back in: "Last session we read whole bands off a bell. But what about a score of 84, which doesn't sit on a nice tick mark? For that we need one universal ruler: the z-score."
Plain language first — what a z-score is.
- A z-score answers one question: how many standard deviations is this value above or below the mean? Positive = above the mean, negative = below, zero = right at it.
- The recipe: z = (value − mean) ÷ standard deviation = (x − μ) ÷ σ. Subtract the mean to recenter at 0; divide by σ to rescale into "SD units."
- Why it's powerful: it strips the units. Inches, dollars, points, hours — all become the same scale, so any two values from any two normal worlds can be compared directly. The world a z-score lives in is the standard normal: N(0, 1), mean 0 and SD 1.
Memory hook: "z = how many SDs from average. Subtract the mean, divide by the SD."
One fully worked example (test scores N(70, 10), pre-computed):
A student scores x = 80. Standardize:
z = (80 − 70) ÷ 10 = 10 ÷ 10 = +1.0.
In words: "80 is exactly one standard deviation above the class mean." By the empirical rule, that already tells us about 84% of the class scored at or below this student: the 50% below the mean plus half of the 68% band (34%).
Now x = 50: z = (50 − 70) ÷ 10 = (−20) ÷ 10 = −2.0 → "two SDs below the mean — down in the unusual tail."
A second, different-units worked example (the punchline of the week — comparing across worlds):
Which is more impressive: a height of 71.5 inches in a population with μ = 64 in, σ = 2.5 in, or a coffee price of $3.80 at shops with μ = $3.00, σ = $0.40?
- Height: z = (71.5 − 64) ÷ 2.5 = 7.5 ÷ 2.5 = +3.0.
- Price: z = (3.80 − 3.00) ÷ 0.40 = 0.80 ÷ 0.40 = +2.0.
- The height (z = +3) is the more extreme value — 3 SDs out vs. 2 — even though "71.5" and "3.80" can't be compared as raw numbers. The z-score is the great equalizer.
The interpretation drill to give students: never stop at the number — say the z in a sentence. "z = +1.5 means this value is one-and-a-half standard deviations above average." Digits aren't the answer; the sentence is.
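The recipe and the "say it in a sentence" drill can be sketched together. The function names `z_score` and `say_it` are illustrative, and the sentence wording is one possible phrasing rather than a required script. Rounding inside `say_it` is there only to hide floating-point noise such as 1.9999999999999996.

```python
def z_score(value, mean, sd):
    """Standardize: how many SDs is value above (+) or below (-) the mean?"""
    return (value - mean) / sd

def say_it(z):
    """Turn a z-score into a plain-words sentence."""
    z = round(z, 2)                       # tidy floating-point noise
    if z == 0:
        return "This value is right at the average."
    side = "above" if z > 0 else "below"
    unit = "standard deviation" if abs(z) == 1 else "standard deviations"
    return f"This value is {abs(z):g} {unit} {side} average."

exam_80 = z_score(80, 70, 10)             # exam score in N(70, 10)
height = z_score(71.5, 64, 2.5)           # inches
coffee = z_score(3.80, 3.00, 0.40)        # dollars

print(say_it(exam_80))   # This value is 1 standard deviation above average.
print(say_it(height))    # This value is 3 standard deviations above average.
print(say_it(coffee))    # This value is 2 standard deviations above average.
print("More extreme:", "height" if abs(height) > abs(coffee) else "coffee price")
```

Comparing absolute values of z is what makes inches and dollars rankable: the last line reports the height, matching the worked example.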
Segment 6 — From a z-score to an Area / Percentile (24 min)
Plain language first. A z-score tells you how far out you are; the table turns that into what share of the world is below you — your percentile — and into any tail or "between" area you need.
The three moves (the entire toolkit — put the small table on the slide):
- Area to the LEFT of z = read it straight off the table. This is the percentile (× 100).
- Area to the RIGHT of z = 1 − (area to the left). The "what fraction beats this?" answer.
- Area BETWEEN two z's = (area left of the bigger) − (area left of the smaller).
We hand students the table (above). Every problem is engineered to land on z = 0, ±0.5, ±1, ±1.5, ±2, ±2.5, ±3, so there is never a value to recall or interpolate. If a problem ever needs an off-table z, the instructor states the area outright — the table is supplied, never guessed.
One fully worked example (test scores N(70, 10), every step shown):
Question: what percentile is a score of 80, and what fraction of the class beat it?
1. Standardize: z = (80 − 70) ÷ 10 = +1.0.
2. Area to the left of z = +1.0 (from the table) = .8413. So 80 is the 84th percentile — about 84% scored at or below it.
3. Area to the right = 1 − .8413 = .1587 → about 15.87% of the class scored higher.
A second worked example — a "between" area (same N(70, 10)):
What share scored between 60 and 90?
- 60 → z = (60 − 70) ÷ 10 = −1.0; 90 → z = (90 − 70) ÷ 10 = +2.0.
- Between = (left of +2.0) − (left of −1.0) = .9772 − .1587 = .8185 → about 81.85% of the class.
A third — percentile back to a value (reverse direction):
What score sits at the 97.72nd percentile? Find the z whose left-area is .9772 → that's z = +2.0. Then un-standardize: value = μ + z·σ = 70 + 2(10) = 90. "A 90 is the 97.72nd-percentile cutoff — beat it and you're in the top ~2.3%."
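The three moves and the reverse lookup can be written as a small table-driven sketch. It deliberately uses only the supplied table values and refuses any off-table z instead of estimating, mirroring the rule that the table is supplied, never guessed. All function names here (`area_left`, `area_right`, `area_between`, `value_at_left_area`) are illustrative.

```python
# Cumulative area to the LEFT of z, copied from the supplied table.
LEFT_AREA = {
    -3.0: 0.0013, -2.5: 0.0062, -2.0: 0.0228, -1.5: 0.0668, -1.0: 0.1587,
    -0.5: 0.3085, 0.0: 0.5000, 0.5: 0.6915, 1.0: 0.8413, 1.5: 0.9332,
    2.0: 0.9772, 2.5: 0.9938, 3.0: 0.9987,
}

def area_left(z):
    """Move 1: read the table. Off-table z values are refused, not guessed."""
    if z not in LEFT_AREA:
        raise ValueError(f"z = {z} is not in the supplied table")
    return LEFT_AREA[z]

def area_right(z):
    """Move 2: 1 - (area to the left)."""
    return round(1 - area_left(z), 4)

def area_between(z_small, z_big):
    """Move 3: (left of the bigger) - (left of the smaller)."""
    return round(area_left(z_big) - area_left(z_small), 4)

def value_at_left_area(area, mean, sd):
    """Reverse: find the table z with this left area, then un-standardize."""
    for z, left in LEFT_AREA.items():
        if left == area:
            return mean + z * sd
    raise ValueError(f"left area {area} is not in the supplied table")

mean, sd = 70, 10
z_80 = (80 - mean) / sd
print(area_left(z_80), area_right(z_80))                  # 0.8413 0.1587
print(area_between((60 - mean) / sd, (90 - mean) / sd))   # 0.8185
print(value_at_left_area(0.9772, mean, sd))               # 90.0
```

The three printed lines reproduce the three worked examples: the percentile of 80, the share between 60 and 90, and the score at the 97.72nd percentile.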
Spreadsheet preview (one line, sets up the tech segment): "Software does this without the table: in Sheets, =NORM.DIST(80, 70, 10, TRUE) returns .8413 directly. We'll run it next — but you must be able to read the table by hand, because that's how you'll catch a wrong machine answer."
Segment 7 — Assessing Normality, Technology & AI-Critique (16 min)
Plain language — don't use the bell when the data aren't one.
- The normal model and the empirical rule are only trustworthy when the data are roughly normal: a histogram that's reasonably symmetric and single-peaked, no big skew, no wild outliers.
- Quick checks (no formulas needed at intro level): (1) sketch the histogram — does it look bell-ish? (2) compare mean and median — far apart signals skew; (3) eyeball whether the 68–95–99.7 bands roughly hold. If it's clearly skewed (incomes, home prices, response times) or lumpy/bimodal, say so and stop — z-scores and the empirical rule will lie.
- Callback to Weeks 2–3: "skew is named for the tail," and "the mean chases the outlier while the median resists." Those same instincts are how you reject a normal model here.
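Quick checks (2) and (3) can be sketched in a few lines for an instructor demo. The helper name `quick_normality_report` is illustrative, the wait-time list is hypothetical made-up data chosen to have one long right tail, and the sketch uses the population SD (`pstdev`) as an assumption. It reports numbers only; no pass/fail cutoff is built in, since judging "roughly normal" stays a human call.

```python
from statistics import mean, median, pstdev

def quick_normality_report(data):
    """Report the mean-vs-median comparison and the observed 1/2/3-SD shares."""
    mu, med, sd = mean(data), median(data), pstdev(data)

    def share_within(k):
        inside = [x for x in data if mu - k * sd <= x <= mu + k * sd]
        return len(inside) / len(data)

    return {
        "mean": mu,
        "median": med,
        "within_1_sd": share_within(1),   # a bell gives roughly 0.68
        "within_2_sd": share_within(2),   # a bell gives roughly 0.95
        "within_3_sd": share_within(3),   # a bell gives roughly 0.997
    }

# Hypothetical wait times in minutes (made up for illustration only).
wait_times = [1, 1, 2, 2, 2, 3, 3, 4, 5, 27]
report = quick_normality_report(wait_times)
print(report)
```

For this sample the mean (5) is double the median (2.5) and 90% of values sit within 1 SD instead of about 68%, so the honest conclusion is "not a bell; stop here."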
Technology workflow — normal areas in a spreadsheet (exact steps):
1. Area to the left of a value: =NORM.DIST(x, mean, sd, TRUE). Example: =NORM.DIST(80, 70, 10, TRUE) → 0.8413 (matches the table at z = +1).
2. Area to the right: =1 - NORM.DIST(x, mean, sd, TRUE). For 80: =1 - NORM.DIST(80, 70, 10, TRUE) → 0.1587.
3. A value from a percentile (inverse): =NORM.INV(0.9772, 70, 10) → 89.99, which rounds to 90 (the 97.72nd-percentile score). The tiny gap appears because .9772 is itself a rounded table value.
4. Standardize directly: =STANDARDIZE(80, 70, 10) → 1 (that's the z-score). Google Sheets and Excel are identical here.
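If a second, independent check on the spreadsheet is wanted, the same four steps can be mirrored in Python's standard-library `statistics.NormalDist`. This is an optional cross-check under that assumption, not part of the student workflow; the variable names are illustrative, and each comment names the spreadsheet formula the line corresponds to.

```python
from statistics import NormalDist

exam = NormalDist(mu=70, sigma=10)

left_of_80 = exam.cdf(80)                # like =NORM.DIST(80, 70, 10, TRUE)
right_of_80 = 1 - exam.cdf(80)           # like =1 - NORM.DIST(80, 70, 10, TRUE)
score_at_9772 = exam.inv_cdf(0.9772)     # like =NORM.INV(0.9772, 70, 10)
z_of_80 = (80 - exam.mean) / exam.stdev  # like =STANDARDIZE(80, 70, 10)

print(round(left_of_80, 4), round(right_of_80, 4))   # 0.8413 0.1587
print(round(score_at_9772, 2))                       # 89.99
print(z_of_80)                                       # 1.0
```

The inverse lands at 89.99 and not exactly 90 because .9772 is a four-decimal rounding of the true area at z = +2; rounded to the nearest point it is the same 90 as in the worked example.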
AI-critique moment (students verify, not consume):
Paste this to an approved chatbot: "A normal distribution has mean 70 and SD 10. What proportion of values are below 80?"
Then check it against the table. The correct answer is .8413 (z = +1). Chatbots can (a) misapply the empirical rule and answer "68%" (the within-1-SD band, not the area below 80) or stop at a loose "84%" with no table value behind it, (b) confuse the left area with the right, or (c) state a made-up table value to four decimals that's simply wrong. Your job all semester: the tool drafts, you judge, and a supplied table beats a recalled one every time. This is exactly how the weekly Lecture Tutorial works: you'll catch the model, not trust it.
Segment 8 — Callback, Hand-off & Tease (10 min) · Session 2 closes (~75)
Callback (tie the week together in three lines):
- Density curve → area is a proportion; the mean splits a normal in half.
- Empirical rule → 68–95–99.7 reads whole bands off a bell on sight.
- z-score + table → turns any value into a percentile and lets you compare across different worlds. "A raw number is a stranger; the z-score introduces it."
Tease next week: "We've described one bell. Next week the bell shows up where you'd least expect it — not in the data, but in the averages of samples. The Central Limit Theorem says sample means go normal even when the data don't, and that's the doorway to everything inferential. Same z-score, brand-new power."
Hand-off (the week's graded work):
- Lecture Tutorial 9 (AI tutor, share-link submission) — density curves, the empirical rule, z-scores, normal areas/percentiles, assessing normality.
- Practice 9 (ungraded reps) — floor-difficulty empirical-rule and z-score drills.
- Quiz 9 (end of week) — 10 items across the empirical rule, z-scores, areas/percentiles, and assessing normality.
- Discussion 9 ("How unusual is this value?" — pick a real test score, height, or price and judge it with z-scores) and Assignment 9 (four coached problems; submit the AI's self-scored report + chat link).
Instructor FAQ — Common Stumbles
| Student says / does | Quick cure |
|---|---|
| "Is 80 a good score?" (raw number, no context) | You can't judge a value until you know μ and σ. Standardize it: z = (80 − 70)/10 = +1 → one SD above average, ~84th percentile. The z-score is the context. |
| Forgets the empirical-rule tails. | Subtract from 100% and split evenly: outside ±1σ is 32% → 16% each tail; outside ±2σ is 5% → 2.5% each tail. Symmetry does the splitting. |
| "A negative z is an error." | No — negative just means below the mean. z = −2 is "two SDs below," a perfectly valid (and unusual-on-the-low-side) value. |
| Reports the left area when the question asks "above." | "Above" = right area = 1 − (left). Drawing the curve and shading the asked-for side prevents this every time. |
| Mixes up "between" with a single tail. | "Between two values" = (left of the bigger z) − (left of the smaller z). Shade the middle strip and it's obvious. |
| Tries to find P(exactly x). | For a smooth curve, a single point has area 0 — always ask above/below/between a value, never "exactly." |
| Applies 68–95–99.7 to skewed data (income, home prices). | First test normality: symmetric and bell-shaped? If it's skewed or has a long tail, the rule doesn't apply — say so and stop. |
| Ranks two values by raw size across different distributions. | You can only compare them via z-scores. The larger z is the more unusual value; the raw numbers aren't comparable. |
| Trusts a chatbot's four-decimal table value. | The student materials supply the table. A recalled table value is a guess — check every machine answer against the embedded table (and NORM.DIST if unsure). |
Scope flag
This outline stays within Objective 5. The point-has-area-zero remark, the NORM.INV/STANDARDIZE spreadsheet functions, and the formal mean-vs-median normality check are added context (not strictly required by the objective) — kept because they prevent the predictable misreadings and set up Week 10. Trim them for a leaner 60-minute version; the empirical rule, the z-score, and the three table moves are the non-negotiable core.
~ Prof. Rivera's edition · Fall 2026 · built with thecoursemaker.com