Week 3 — Lecture Outline · Center & Spread
Course: Introduction to Statistics (MATH 11) · Silver Oak University (fictional sample) · Prof. Rivera
Objectives covered: Objective 2 — Summarize and display univariate data: describe its shape, center, and spread.
SLOs touched: A (reason quantitatively from data) · B (communicate results to a non-technical audience)
Meeting pattern: 2 sessions × 75 min = 150 min. Segment minutes below total ~150; scale to your own pattern.
Week at a Glance
| The week's big question | "When we squeeze a whole pile of numbers down to a single number, which number is honest — and what does it hide?" |
| By the end of the week, students can… | (1) compute and choose between the mean, median, and mode, and say which one fairly represents a dataset; (2) explain why the mean chases outliers while the median holds still, and pick the right one under skew; (3) compute a variance and a standard deviation on a small dataset and say what the SD means in plain words; (4) build the five-number summary (min, Q1, median, Q3, max), compute the IQR, and explain why median + IQR are resistant to outliers while mean + SD are not. |
| Key vocabulary | mean (average), median, mode, measure of center, skew (left/right), outlier, deviation, variance, standard deviation, population vs. sample formula (n vs. n−1), spread/variability, range, quartile (Q1, Q3), median (Q2), five-number summary, interquartile range (IQR), resistant / non-resistant measure |
| Materials | slides (Deck 3), the week's readings + video links, a spreadsheet (Google Sheets or Excel), one approved chatbot (Gemini / Claude / ChatGPT) for the AI-critique moment and the tutorial |
| Timing note | 8 segments, ~150 min total. Session 1 = Segments 1–4 (~75). Session 2 = Segments 5–8 (~75). |
Segment 1 — Hook & the Promise (8 min) · Session 1 opens
Hook. Put one sentence on the board: "The average person has slightly fewer than two arms." Let it land. It's true — a handful of people have lost an arm, nobody has three, so the mean dips just under 2 — and it's useless for picturing a normal person, who has exactly two.
- "A single number stood in for a whole group and quietly lied. That's the danger we're learning to control this week."
- Second jab: "If Jeff Bezos walks into a coffee shop, on average everyone inside is a billionaire." One giant value yanks the average somewhere no real customer lives.
The promise (write it on the board): "By the end of this week you can take any pile of numbers — incomes, test scores, wait times — and report the ONE number that tells the truth about the middle, plus the ONE number that tells the truth about the spread, and defend why you chose them."
Why it matters line (memory hook): "A summary is a promise. The mean and the median promise different things — and under a skew, only one of them keeps it."
Segment 2 — Measures of Center: Mean, Median, Mode (22 min)
Plain language first. Last week we turned a pile of numbers into a picture (a histogram). This week we go one step further and squeeze the pile into a single number — a measure of center, our best one-number answer to "what's typical here?" There are three.
- Mean = the ordinary average. Add up all the values, divide by how many there are. It's the balance point of the data.
- Median = the middle value once the numbers are sorted in order. Half the data sits below it, half above. (If there's an even count, average the two middle values.)
- Mode = the value that shows up most often. It's the only center that works for categorical data (most common major, most common blood type) and the only one that can be reported with words.
Memory hook (put it on a slide):
Mean adds and divides. Median is the middle of the line. Mode is the most. "Me-di-an = mid; Mode = most."
One fully worked example (do every step out loud).
Five quiz scores (out of 10): 7, 8, 8, 9, 10.
- Mean: add them → 7 + 8 + 8 + 9 + 10 = 42. Divide by the count, 5 → 42 ÷ 5 = 8.4.
- Median: already sorted; with 5 values the middle is the 3rd one → 8. (Two below it: 7, 8. Two above: 9, 10.)
- Mode: which value repeats? 8 appears twice; everything else once → mode = 8.
Say each in words: "The average score is 8.4 out of 10; the middle student scored 8; the most common score was 8."
Notation preview (notation comes after the idea): the sample mean is written x̄ ("x-bar"); the formula is x̄ = (Σx) / n — "sigma x" just means "add up all the x's," and n is the count. "x-bar = total over how many."
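The quiz-score example can be double-checked in a few lines of Python. This is a minimal sketch using the standard-library `statistics` module; the variable names (`scores`, `x_bar`, and so on) are illustrative choices, not part of the course materials.

```python
from statistics import median, mode

# Five quiz scores (out of 10) from the worked example
scores = [7, 8, 8, 9, 10]

x_bar = sum(scores) / len(scores)   # x-bar = total over how many -> 8.4
middle = median(scores)             # sorts internally, then takes the middle value -> 8
most_common = mode(scores)          # the value that appears most often -> 8

print(x_bar, middle, most_common)
```

Computing the mean as `sum(...) / len(...)` rather than calling a library function keeps the code line-for-line with the formula x̄ = (Σx) / n.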
Land the key idea: three different honest answers to "what's typical." On a symmetric pile they nearly agree. The drama starts when the pile is lopsided — that's Segment 3.
Segment 3 — Mean vs. Median: Skew, Outliers, and Which One to Trust (25 min)
Plain language first. The mean is a democracy where the loudest voter can buy the election. Every value pulls on the mean, and an extreme value pulls hard. The median doesn't care how far away the extreme is — it only cares about position, the middle of the line. So:
- Outlier = a value that sits far from the rest of the pack.
- A long tail to the right (right-skewed / positively skewed) drags the mean toward the tail, above the median. A long tail to the left does the reverse.
- The rule of thumb: mean follows the tail; median stays near the bulk.
One fully worked example (do every step out loud).
Five household incomes on a block (in thousands of dollars): 30, 35, 40, 45, 250. That last house is a mansion.
- Mean: 30 + 35 + 40 + 45 + 250 = 400; ÷ 5 = 80. The "average income on the block" is $80,000.
- Median: the middle of five sorted values is the 3rd → $40,000.
- Look what happened: four of the five households earn $45k or less, yet the mean says $80k. No real household on the block lives near $80k. The single mansion dragged the average above everyone but itself.
- Which is honest? The median, $40,000: it describes a typical household. This is exactly why news reports usually quote median home price and median household income rather than the mean.
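The income block makes a quick live demo. The sketch below assumes the standard-library `statistics` module; `below_mean` is a helper name introduced here to count how many households fall under the mean.

```python
from statistics import mean, median

# Household incomes on the block, in thousands of dollars
incomes = [30, 35, 40, 45, 250]

avg = mean(incomes)      # 80: dragged toward the mansion
mid = median(incomes)    # 40: the middle household

# How many households actually earn less than the "average"?
below_mean = sum(1 for x in incomes if x < avg)   # 4 of the 5

# The mean-to-median ratio doubles as a rough skew signal
ratio = avg / mid        # 2.0, so the mean is twice the median

print(avg, mid, below_mean, ratio)
```

A ratio well above 1 points to a right skew. How large the gap must be before it matters is a judgment call, not a fixed threshold.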
Memory hook: "The mean chases the outlier; the median ignores it." (Picture the mean on a leash, yanked toward the mansion; the median standing still in the middle of the street.)
Misconceptions + cures.
- ❌ "The mean is always the right 'average.'"
✅ Cure: the mean is right for symmetric data. Under skew or with an outlier, it reports a center where nobody lives. Match the measure to the shape.
- ❌ "The mean and median are basically the same number."
✅ Cure: on the income block they were $80k vs. $40k — a factor of two apart. The gap between them is itself a skew detector: mean ≫ median ⇒ right-skewed.
- ❌ "An outlier is just a mistake to delete."
✅ Cure: sometimes it's a typo, but often it's a real, important value (the mansion is real). Don't silently delete it — report a resistant measure instead, and mention the outlier.
Interaction — Think-Pair-Share (~6 min): Show two real-world quantities and ask, "mean or median — which would you report, and why?": (a) the salaries of 30 employees at a company whose CEO earns 50× everyone else; (b) the heights of 30 students. (Answers: (a) median — one huge salary skews the mean; (b) either is fine, heights are roughly symmetric, so report the mean.) Debrief: the shape decides the measure.
Segment 4 — Variance & Standard Deviation (22 min) · Session 1 closes (~75)
Plain language first — why center isn't enough. Two classes both average 75% on a test. In Class A every score is between 72 and 78. In Class B half scored 95 and half scored 55. Same center, completely different stories. Center tells you where the data sits; spread tells you how tightly it clusters there. The headline measure of spread is the standard deviation (SD) — roughly, the typical distance of a value from the mean.
Build the idea in pieces (chunk it — don't dump the formula).
1. A deviation is how far one value sits from the mean: (value − mean). Some are negative (below), some positive (above).
2. If you just add the deviations, they cancel to zero every time (that's what "balance point" means). So we square each deviation to kill the minus signs and punish big gaps extra.
3. The variance is essentially the average of those squared deviations (for a sample, the divisor is n − 1 rather than n). The standard deviation is the square root of the variance, which un-squares the units and puts us back in the data's own scale (points, dollars, minutes).
One fully worked example (do every step out loud — keep the numbers tiny and friendly).
Five values: 2, 2, 4, 6, 6.
1. Mean: 2 + 2 + 4 + 6 + 6 = 20; ÷ 5 = 4.
2. Deviations (value − 4): −2, −2, 0, +2, +2. (Check: they sum to 0. ✓ Always do this — it catches arithmetic slips.)
3. Square each: 4, 4, 0, 4, 4. Sum of squares = 16.
4. Variance: divide the sum of squares by (n − 1) = 4 → 16 ÷ 4 = 4. (This is the sample variance, written s².)
5. Standard deviation: √4 = 2. So s = 2.
Say it in words: "A typical value sits about 2 units away from the mean of 4." That's the whole meaning of an SD.
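The five steps translate almost line for line into code. This is a sketch rather than a recommended workflow: in practice `statistics.stdev` does all of it in one call, and the variable names here are illustrative.

```python
from math import sqrt

values = [2, 2, 4, 6, 6]
n = len(values)

# Step 1: the mean
mean = sum(values) / n                       # 4.0

# Step 2: deviations from the mean
deviations = [x - mean for x in values]      # [-2.0, -2.0, 0.0, 2.0, 2.0]

# Check: deviations cancel. Exact for these values; with messier
# decimals, compare using math.isclose instead of ==.
assert sum(deviations) == 0

# Step 3: square each deviation and add them up
sum_of_squares = sum(d ** 2 for d in deviations)   # 16.0

# Step 4: sample variance divides by n - 1
sample_variance = sum_of_squares / (n - 1)   # 4.0

# Step 5: the standard deviation is the square root of the variance
sample_sd = sqrt(sample_variance)            # 2.0

print(sample_variance, sample_sd)
```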
The n vs. n−1 note (keep it light). When the data is the whole population, divide by n. When the data is a sample standing in for a bigger population — the usual case — divide by n − 1 (it corrects a slight under-estimate). Intro rule: sample SD divides by n − 1. We'll always tell you which one you have; don't agonize over it.
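In symbols, using the x̄ and n notation from Segment 2, the two versions can be written side by side. The symbols σ (population SD) and μ (population mean) are standard textbook conventions added here for illustration; the outline itself only names s and s².

```latex
\[
s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}, \qquad s = \sqrt{s^2} \qquad \text{(sample)}
\]
\[
\sigma^2 = \frac{\sum (x - \mu)^2}{n}, \qquad \sigma = \sqrt{\sigma^2} \qquad \text{(population)}
\]
```

For the values 2, 2, 4, 6, 6 the sum of squares is 16 either way: the sample version gives 16 ÷ 4 = 4 (SD = 2), while the population version gives 16 ÷ 5 = 3.2 (SD ≈ 1.79).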
Misconception + cure.
- ❌ "Standard deviation tells you the center."
✅ Cure: no — the mean tells you the center; the SD tells you the spread around it. A big SD = scores scattered far; a small SD = scores bunched tight. Two datasets can share a mean and have wildly different SDs.
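To make the "same mean, different SD" point concrete, here is a small sketch. The two score lists are hypothetical, invented to fit the two-class story from the start of this segment (both average 75); they are not real course data.

```python
from statistics import mean, stdev

# Hypothetical scores, invented for illustration only
class_a = [72, 73, 75, 75, 77, 78]   # bunched tight around 75
class_b = [55, 55, 55, 95, 95, 95]   # half low, half high

mean_a, mean_b = mean(class_a), mean(class_b)   # both 75
sd_a = round(stdev(class_a), 2)                 # about 2.28
sd_b = round(stdev(class_b), 2)                 # about 21.91

print(mean_a, mean_b, sd_a, sd_b)
```

The centers are identical, but the second SD is roughly ten times the first. That is the scatter the mean alone cannot show.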
Segment 5 — Quartiles & the Five-Number Summary (25 min) · Session 2 opens
Hook back in: "Last session: the SD measures spread around the mean. But we just learned the mean can lie under skew — so its sidekick the SD can lie too. Today: a spread measure that doesn't flinch at outliers."
Plain language first. Instead of measuring spread from the mean, we can chop the sorted data into four equal-size quarters and report the cut points.
- The median (Q2) cuts the data in half.
- Q1 (first quartile) is the median of the lower half — 25% of the data sits below it.
- Q3 (third quartile) is the median of the upper half — 75% sits below it (25% above).
- Together with the smallest and largest values, these give the five-number summary: Min · Q1 · Median · Q3 · Max. It's the skeleton of the data, and it's exactly what a boxplot draws (the boxplot itself is only teased this week).
One fully worked example (do every step out loud).
Seven students' commute times to campus, in minutes: 10, 12, 15, 18, 20, 22, 35. (Already sorted; the 35-minute commuter is our outlier.)
- Min = 10. Max = 35.
- Median (Q2): 7 values → the 4th is the middle → 18.
- Lower half (everything below the median): 10, 12, 15 → its middle is Q1 = 12.
- Upper half (everything above the median): 20, 22, 35 → its middle is Q3 = 22.
- Five-number summary: 10 · 12 · 18 · 22 · 35.
(Method note for odd counts: leave the overall median out of both halves. We'll always state the dataset so the cut points land cleanly.)
The IQR — spread that ignores the tails.
- Interquartile range (IQR) = Q3 − Q1. It's the width of the middle 50% of the data.
- For the commute data: IQR = 22 − 12 = 10 minutes. The middle half of commuters span a 10-minute window.
- Compare the plain range = Max − Min = 35 − 10 = 25. The range is at the mercy of that one 35-minute outlier; the IQR throws the extremes away and reports the spread of the typical middle.
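The hand method from class (quartiles as medians of the two halves) can be sketched as a short function. `five_number_summary` is a name made up for this example, and the 90-minute commute in the second dataset is a hypothetical value added to show what a longer tail does.

```python
from statistics import median

def five_number_summary(data):
    """Class hand method: Q1 and Q3 are the medians of the lower and upper halves.

    For an odd count, the overall median is left out of both halves.
    """
    xs = sorted(data)
    if len(xs) < 2:
        raise ValueError("need at least two values")
    half = len(xs) // 2
    lower = xs[:half]     # the values before the median position
    upper = xs[-half:]    # the values after the median position
    return min(xs), median(lower), median(xs), median(upper), max(xs)

commutes = [10, 12, 15, 18, 20, 22, 35]
lo, q1, q2, q3, hi = five_number_summary(commutes)
iqr = q3 - q1            # 22 - 12 = 10
data_range = hi - lo     # 35 - 10 = 25

# Hypothetical: stretch the longest commute from 35 to 90 minutes
stretched = [10, 12, 15, 18, 20, 22, 90]
lo2, q1b, _, q3b, hi2 = five_number_summary(stretched)
iqr_stretched = q3b - q1b      # still 10
range_stretched = hi2 - lo2    # jumps to 80

print(iqr, data_range, iqr_stretched, range_stretched)
```

Stretching the tail more than triples the range and leaves the IQR where it was.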
Memory hook: "Q1 and Q3 are the median's two siblings — the median of the bottom half and the median of the top half. The IQR is the distance between the siblings."
Segment 6 — Resistance: Why Median + IQR Beat Mean + SD Under Skew (18 min)
Plain language: a measure is resistant (robust) if a single extreme value can't move it much. This is the punchline that ties the whole week together.
- Mean and standard deviation are non-resistant — both are computed from every value, so one outlier drags them.
- Median and IQR are resistant — both depend only on position (middle, quarter-points), so an outlier at the far end can't budge them.
One fully worked example — watch one number wreck two measures and leave two untouched.
Start: 20, 21, 22, 23, 24. Mean = 22, Median = 22. Calm and symmetric.
Now change the 24 to a wild 99 (a data-entry typo, or a real extreme): 20, 21, 22, 23, 99.
- Mean: 20 + 21 + 22 + 23 + 99 = 185; ÷ 5 = 37. The mean leapt from 22 to 37 — there's no value anywhere near 37.
- Median: the middle of the sorted five is still the 3rd value → 22. The median did not move at all.
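- The same thing happens to spread: the sample SD explodes from about 1.6 to about 34.7, while the IQR is built to shrug off a far-end value. (Caution: five values is too few to show this with our hand method, because Q3 would average 23 and 99. Segment 5's commute data shows it cleanly: Q3 stays at 22 however long the 35-minute commute becomes.)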
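A before-and-after comparison is easy to run in front of the class. This sketch uses the standard-library `statistics` module; the labels `calm` and `wild` are illustrative. The IQR is left out because, on only five values, its result depends on which quartile rule is used.

```python
from statistics import mean, median, stdev

calm = [20, 21, 22, 23, 24]
wild = [20, 21, 22, 23, 99]   # the 24 replaced by 99

# For each dataset: (mean, median, sample SD rounded to 2 places)
summary = {
    label: (mean(data), median(data), round(stdev(data), 2))
    for label, data in (("calm", calm), ("wild", wild))
}

for label, (avg, mid, sd) in summary.items():
    print(label, "mean:", avg, "median:", mid, "SD:", sd)
```

One changed value moves the mean from 22 to 37 and the SD from about 1.58 to about 34.68, while the median stays at 22.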
The lesson, in one line: "For skewed data or data with outliers, report the median and the IQR. For symmetric data, the mean and SD are fine — and a little more informative."
Callback: point back to Segment 3's income block — same phenomenon. And point back to Week 2: a histogram that looked right-skewed is your signal to reach for the median and IQR, not the mean and SD. Shape → choice of summary.
Misconception + cure.
- ❌ "Resistant measures are just 'safer,' so always use the median and IQR."
✅ Cure: not quite — when data is roughly symmetric with no outliers, the mean and SD use every value and carry more information. Resistance is a response to skew, not a blanket rule. Read the shape first.
Segment 7 — Putting It Together: A Full Numerical Summary (20 min)
Plain language first. A complete description of one quantitative variable answers three questions, in this order: shape → center → spread (the same trio that organizes Objective 2).
- Shape: from last week's histogram — symmetric, or skewed left/right? Any outliers?
- Center: symmetric ⇒ report the mean; skewed/outliers ⇒ report the median.
- Spread: report the spread measure that matches the center — SD with the mean, IQR with the median. (Keep the pair together.)
One fully worked example (a small real-feeling dataset, every step shown):
Daily cups of coffee sold at a tiny campus cart over 7 days: 18, 20, 22, 24, 26, 28, 60. (Friday game day spiked to 60.)
1. Shape: sorted, the values rise smoothly to 28, then jump to 60 — a clear right skew with an outlier.
2. Center: because it's skewed, report the median. Middle of 7 = 4th value = 24 cups. (For contrast, the mean is (18 + 20 + 22 + 24 + 26 + 28 + 60) ÷ 7 = 198 ÷ 7 ≈ 28.3, pulled up by game day and above 6 of the 7 days.)
3. Spread: match the median with the IQR. Lower half 18, 20, 22 → Q1 = 20; upper half 26, 28, 60 → Q3 = 28; IQR = 28 − 20 = 8 cups.
Report it like a human (SLO B): "On a typical day the cart sells about 24 cups, and the middle half of days fall within an 8-cup band — though one game day spiked to 60." One sentence, no jargon, completely honest.
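The coffee-cart summary can be assembled in code as well. One assumption to flag: the sketch uses `statistics.quantiles` with `method="exclusive"`, which for these 7 values happens to land on the same cut points as the class hand method. That agreement is not guaranteed for other dataset sizes.

```python
from statistics import mean, median, quantiles

cups = [18, 20, 22, 24, 26, 28, 60]   # daily cups sold over 7 days

# Quartile cut points. method="exclusive" matches the hand method for
# this dataset (an assumption of the sketch, not a general rule).
q1, q2, q3 = quantiles(cups, n=4, method="exclusive")   # 20.0, 24.0, 28.0
iqr = q3 - q1                                            # 8.0

# For contrast: the mean, and how many days fall below it
avg = mean(cups)                                         # about 28.3
days_below_mean = sum(1 for c in cups if c < avg)        # 6 of the 7 days

report = (f"On a typical day the cart sells about {median(cups)} cups, "
          f"and the middle half of days fall within an {iqr:.0f}-cup band.")
print(report)
```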
Memory hook: "Center and spread travel as a couple: mean rides with SD, median rides with IQR. Don't mix the partners."
Segment 8 — Technology Workflow + AI-Critique, Callback & Hand-off (12 min) · Session 2 closes (~75)
Technology workflow — center & spread in a spreadsheet (exact steps):
1. Put your data in column A (say A2:A8 for the 7 coffee values).
2. In separate cells, type:
- =AVERAGE(A2:A8) → the mean (28.29 for the coffee data).
- =MEDIAN(A2:A8) → the median (24).
- =STDEV.S(A2:A8) → the sample standard deviation (use STDEV.P only when the data is the whole population).
- =QUARTILE.INC(A2:A8,1) and =QUARTILE.INC(A2:A8,3) → Q1 and Q3; then =QUARTILE.INC(A2:A8,3)-QUARTILE.INC(A2:A8,1) for the IQR.
- =MIN(A2:A8) and =MAX(A2:A8) complete the five-number summary.
- Google Sheets and Excel use the same function names. (Heads-up: spreadsheets use a slightly different quartile rule than our hand method, so software Q1/Q3 can differ a touch from class — that's expected; we'll flag it.)
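To preview how far software quartiles can drift from the hand method, the sketch below uses Python's `statistics.quantiles` with `method="inclusive"`. Treating that method as a stand-in for a spreadsheet's QUARTILE.INC is an assumption of this example (both interpolate between neighboring sorted values), so confirm it against your own spreadsheet.

```python
from statistics import quantiles

cups = [18, 20, 22, 24, 26, 28, 60]

# "inclusive" interpolates between the two nearest sorted values
# (assumed here to mirror the spreadsheet's QUARTILE.INC rule)
q1_inc, _, q3_inc = quantiles(cups, n=4, method="inclusive")
iqr_inc = q3_inc - q1_inc

# Hand method from class, for comparison: Q1 = 20, Q3 = 28, IQR = 8
print(q1_inc, q3_inc, iqr_inc)   # 21.0 27.0 6.0
```

On the coffee data the interpolated quartiles come out as 21 and 27 (IQR 6) against the hand method's 20 and 28 (IQR 8). That is the size of gap students should expect and report.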
AI-critique moment (students verify, not consume):
Paste this to an approved chatbot: "For the data 30, 35, 40, 45, 250, what is the average income, and is the mean a good summary here?"
Then audit the answer. A chatbot will usually compute the mean correctly (80) — but watch whether it flags the skew. Many will hand you "$80,000" with no warning that four of the five values sit far below it. Your job: catch the missing caveat and demand the median instead. The number isn't wrong; the summary is misleading. The tool drafts; you judge whether it told the truth.
Callback + tease:
- Callback: "Week 2 gave us the shape; this week gave us the two honest numbers — center and spread — and taught us to match them to the shape."
- Tease next week: "We've fully described one variable. Week 4 asks the next question: when two variables move together — study hours and exam scores, price and demand — how do we measure the relationship? Scatterplots and correlation are next."
Hand-off (the week's graded work):
- Lecture Tutorial 3 (AI tutor, share-link submission) — mean/median/mode, mean-vs-median under skew, variance & SD, the five-number summary and IQR.
- Quiz 3, Discussion 3 (adaptive — "Which measure of center fairly represents this dataset?", an AI dialogue you summarize and post), and Assignment 3.
Instructor FAQ — Common Stumbles
| Student says / does | Quick cure |
|---|---|
| "Which 'average' do they mean — mean or median?" | In everyday speech "average" = mean. But the honest one depends on shape: symmetric → mean; skewed/outliers → median. Always ask what does the data look like? first. |
| Forgets to sort before finding the median or quartiles. | The median and quartiles are about position in the sorted line. Sort first, every time — an unsorted "middle value" is meaningless. |
| Adds the deviations and expects a nonzero spread. | The plain deviations always sum to zero (that's the balance point). That's why we square them — to stop the cancellation. Squaring is the whole trick. |
| Divides the sum of squared deviations by n for sample data. | Sample variance/SD divides by n − 1, not n. We'll tell you when you have the whole population (then use n). When in doubt with a sample, use n − 1. |
| Reports the mean with the IQR (or median with SD). | Keep the couple together: mean ↔ SD, median ↔ IQR. Mixing partners sends a confused signal about the data. |
| Thinks a bigger standard deviation means a higher center. | SD is about spread, not location. Two classes can both average 75% — one tight, one scattered. SD measures the scatter, nothing about the average itself. |
| "Just delete the outlier." | Don't silently delete real values. If it's a genuine typo, fix it; if it's a real extreme, keep it and report resistant measures (median, IQR), and mention it. |
| Confuses range and IQR. | Range = Max − Min (uses the two extremes, so an outlier wrecks it). IQR = Q3 − Q1 (the middle 50%, ignores the extremes). IQR is the resistant one. |
| Gets a different Q1/Q3 from the spreadsheet than from class. | Expected — software uses a slightly different quartile interpolation than our hand method. Both are "right"; just be consistent and say which you used. |
Scope flag
This outline stays within Objective 2 (center and spread for one variable). The n vs. n−1 distinction and the resistant/robust vocabulary are introduced lightly as added context — they're not strictly required to compute the week's measures, but they make the "which summary to trust" judgment stick. The boxplot itself is only teased here; it lives with two-variable displays. Cut the n−1 aside for a leaner 60-minute version.
~ Prof. Rivera's edition · Fall 2026