Week 16 — Lecture Outline · Final Review & Exam

Introduction to Statistics · MATH 11 · Fall 2026 · Prof. Rivera · Fictional sample

Course: Introduction to Statistics (MATH 11) · Silver Oak University (fictional sample) · Prof. Rivera
Objectives covered: cumulative — Objectives 1–8 (Weeks 1–15). Obj 1 — populations vs. samples, sampling & design; Obj 2 — summarize & display one variable (shape, center, spread); Obj 3 — relationships between two variables; Obj 4 — probability rules, conditional probability, random variables (binomial & normal models); Obj 5 — the normal distribution & sampling distributions (z-scores, the CLT); Obj 6 — confidence intervals (means & proportions); Obj 7 — hypothesis tests (means & proportions); Obj 8 — simple linear regression & inference for the slope.
SLOs touched: A (reason quantitatively from data) · B (communicate results to a non-technical audience)
Meeting pattern: 2 sessions × 75 min = 150 min. Segment minutes below total ~150; scale to your own pattern.

This is the final review-and-exam week — no new content. It is cumulative over the entire course (Weeks 1–15, Objectives 1–8). Each segment briskly re-teaches one or two objectives with a quick worked example and the single misconception most likely to cost points; the final segment frames the comprehensive Final and how to prepare. Built to be taught from cold as a capstone review: an instructor (or a substitute) can run it without having taught the course, because every definition, number, and cure travels with the segment. This week's only graded item is the Final (30%) — there is no quiz, no discussion, and no assignment; the Final stands in for all of them. The Final pairs with a Study Guide + Exam-Prep Tutorial + Practice Final, built separately and referenced here by name.

Week at a Glance


The week's big question	"Across the whole course — getting data, describing it, relating it, reasoning about chance, and using a sample to make a confident claim about a population — what is the one honest move each topic asks of us, and where does everyone slip?"
By the end of the week, students can…	(1) re-run each objective's core move on demand — classify a variable and a sampling method (Obj 1); summarize a dataset's shape/center/spread and pick the honest number (Obj 2); read a scatterplot/correlation without sliding into causation (Obj 3); apply the probability rules and find an expected value, incl. the binomial setting (Obj 4); turn a z-score into a normal probability and use the Central Limit Theorem for the distribution of x̄ (Obj 5); build and read a confidence interval for a mean or proportion (Obj 6); run a hypothesis test and decide by comparing p to α (Obj 7); read a regression line — slope, intercept, r² — and test whether the slope is real (Obj 8); (2) name and avoid the highest-cost misconception in each theme; (3) walk into the Final knowing its coverage, its weight (30%), and a concrete plan built around the Study Guide, the Exam-Prep Tutorial, and the Practice Final.
Key vocabulary (all review)	population/sample, parameter/statistic, NOIR, SRS / stratified / cluster / systematic, bias, observational vs. experiment, confounding; histogram, shape/skew, outlier, mean/median/mode, SD, five-number summary, IQR, resistant measure; scatterplot, correlation r, lurking variable; sample space, probability rules (complement, addition, multiplication), conditional probability P(A | B), independence, random variable, expected value E(X), binomial B(n, p) & BINS, normal model & 68–95–99.7; z-score, density curve, sampling distribution, Central Limit Theorem (CLT), standard error; confidence interval, margin of error, confidence level, t-distribution; hypothesis test, H₀ / Hₐ, p-value, significance level α, Type I / II error; least-squares line ŷ = b₀ + b₁x, slope, intercept, residual, r², inference for the slope
Materials	slides (Deck 16 — the final-review deck), the Study Guide, the Exam-Prep Tutorial (AI), the Practice Final, a spreadsheet (Google Sheets or Excel), one approved chatbot (Gemini / Claude / ChatGPT) for the audit-the-AI review moment
Timing note	8 segments, ~150 min total. Session 1 (Tue) = Segments 1–4 (~75): the map + Objectives 1–4 (describe → relate → chance). Session 2 (Thu) = Segments 5–8 (~75): Objectives 5–8 (distributions → intervals → tests → regression) + the Final frame. Scale to your own pattern.

Segment 1 — Hook & the Map of the Whole Course (10 min) · Session 1 opens

Hook. Put one sentence on the board with no numbers: "A new study of 900 students claims a tutoring app raises exam scores." Ask the room: "Before you believe one word of that — what do you want to know?" Let them fire questions. Steer the harvest: Who was measured, and how? What was recorded? What's the typical score, and how spread out? Is the link real or could a third thing explain it? How sure are they — is the difference bigger than chance?
- "Every question you just asked is one objective of this course. Sixteen weeks, eight objectives, and they line up into a single story. Today we walk the whole story once, fast, and find the exact spot in each chapter where points get lost. That's the Final."

The promise (write it on the board): "By Thursday you'll be able to take any of the eight skills — get data, describe it, relate it, reason about chance, work the normal curve, build a confidence interval, run a test, fit a line — and on demand state the one honest move it requires and the one mistake that sinks it."

The map (one slide, say it out loud — this is the photograph slide of the week):

DESCRIBE the world: Obj 1 GET the data · Obj 2 summarize ONE variable · Obj 3 RELATE two variables.
MODEL the chance: Obj 4 probability & random variables · Obj 5 the normal & sampling distributions.
INFER beyond the sample: Obj 6 confidence intervals · Obj 7 hypothesis tests · Obj 8 regression & inference for the slope.

Why it matters line (memory hook): "The whole course is one arc — describe what you see, model the chance behind it, then use a sample to make a confident claim about a population you never fully measured."

Segment 2 — Objectives 1 & 2 Review: Get the Data, Describe One Variable (20 min)

Re-teach Obj 1 in plain language. A population is everyone the question is about; a sample is the part we measured. A number describing the population is a parameter (p, μ); the matching sample number is a statistic (p̂, x̄) — "the hat means measured, not true." Two questions decide trust: how were they picked (SRS / stratified / cluster / systematic = trustworthy; convenience / voluntary response = traps) and what kind of variable was recorded (NOIR — nominal, ordinal, interval, ratio).

Re-teach Obj 2 in plain language. Describe one variable three ways: shape (from a histogram — symmetric or skewed, plus outliers), center (mean / median / mode), and spread (SD or IQR). The pairing rule: symmetric, no big outliers → mean + SD; skewed or outliers → median + IQR (both resistant to the outlier).

One quick worked example (do every step out loud):

Quiz scores: 2, 4, 4, 5, 5, 5, 6, 7, 9, 53 (the 53 is a typo for ~5 — leave it in to make the point).
- Mean = 100/10 = 10.0; median = (5 + 5)/2 = 5.0. The lone 53 drags the mean to 10 while the median sits calmly at 5.
- Shape is right-skewed by an outlier → report median + IQR. Five-number summary: min 2, Q1 = 4, median 5, Q3 = 7, max 53 → IQR = 3.
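For instructors who want to check the numbers live, here is a minimal Python sketch of the same computation. It assumes the "median of each half" rule for quartiles, which reproduces the Q1 = 4 and Q3 = 7 used above; spreadsheet quartile functions use other conventions and may give slightly different values. The helper name five_number_summary is ours, not part of any course material.

```python
from statistics import mean, median

# Quiz scores from the worked example (the 53 is the deliberate typo/outlier).
scores = [2, 4, 4, 5, 5, 5, 6, 7, 9, 53]


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves rule.

    Assumption: Q1 and Q3 are the medians of the lower and upper halves,
    with the middle value excluded from both halves when n is odd.
    """
    data = sorted(values)
    n = len(data)
    lower = data[: n // 2]         # lower half of the sorted data
    upper = data[(n + 1) // 2:]    # upper half of the sorted data
    return data[0], median(lower), median(data), median(upper), data[-1]


low, q1, med, q3, high = five_number_summary(scores)
iqr = q3 - q1

print("mean:", mean(scores))      # pulled upward by the 53
print("median:", med)             # unaffected by the 53
print("five-number summary:", (low, q1, med, q3, high))
print("IQR:", iqr)
```

Replacing the 53 with a 5 and re-running is a quick way to show the mean moving while the median stays put.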

Highest-cost misconception + cure:
- ❌ "If it's a number, it's quantitative," and "always report the average."
✅ Cure: a zip code or jersey number is a nominal label — you can't average it (the test is "does arithmetic mean anything?"). And under skew or an outlier the mean is the dishonest one — switch to the median. "The mean chases the outlier; the median ignores it." (And size never fixes bias: Literary Digest 1936 had 2.4M responses and called the wrong winner — method beats size.)

Segment 3 — Objective 3 Review: Relating Two Variables (16 min)

Hook back in: "We've described one variable. Now two at once — and the single most expensive mistake in statistics lives here."

Re-teach in plain language. For two quantitative variables, a scatterplot shows the relationship and the correlation r measures the strength and direction of the straight-line part. r runs from −1 to +1: sign = direction, magnitude = strength; r ≈ 0 means no linear pattern (a clean U-shape can have r ≈ 0 and still be a strong curved relationship). For two categorical variables, use a two-way table. The headline: a strong r is a link, not a push — only a randomized experiment can claim cause; an observational correlation can always hide a lurking variable.
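The "clean U-shape can have r ≈ 0" claim is easy to demonstrate. The sketch below computes Pearson's r by hand for two made-up datasets (illustrative values, not course data): a perfect parabola and a perfect straight line over the same x values.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient computed from its definition."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical data: symmetric x values so the U-shape is perfectly balanced.
xs = [-3, -2, -1, 0, 1, 2, 3]
u_shape = [x ** 2 for x in xs]        # strong curved relationship
straight = [2 * x + 1 for x in xs]    # perfect linear relationship

r_curve = pearson_r(xs, u_shape)
r_line = pearson_r(xs, straight)

print("r for the U-shape:", r_curve)
print("r for the straight line:", r_line)
```

y is completely determined by x in both datasets, yet only the straight line earns a large r, because r measures the straight-line part only.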

One quick worked example (read it, don't compute it):

Headline: "Students who drink more coffee get higher grades — r = 0.6." Observational.
- r = 0.6 is a moderate positive linear link — it does not say coffee raises grades.
- Lurking variable: hours studying could drive both the coffee and the grades; nothing was randomly assigned, so the arrow is unproven. Honest report (SLO B): "They rise together here, but a third factor like study time may explain both."

Highest-cost misconception + cure:
- ❌ "Strong correlation proves X causes Y," and "r = 0 means no relationship at all."
✅ Cure: ask "was anything randomly assigned?" — if no, it's a link, not a cause; and r only sees the straight-line part, so a clear curve can hide under r ≈ 0. "Correlation is a handshake, not a push."

Segment 4 — Objective 4 Review: Probability & Random Variables (29 min) · Session 1 closes (~75)

Re-teach the rules of chance in plain language. Probability is "how surprised should I be?" — always between 0 and 1. Four moves carry the unit:
- Complement: P(not A) = 1 − P(A) (the fast tool for "at least one").
- Addition (OR): P(A or B) = P(A) + P(B) − P(A and B) — subtract the overlap so you don't double-count (overlap is 0 for mutually exclusive events).
- Conditional & multiplication: P(A and B) = P(A) · P(B | A); events are independent when P(B | A) = P(B).

Random variables, binomial & normal. A random variable attaches a number to a chance outcome; discrete = a countable list, continuous = any value in a range. The expected value E(X) = Σ [value × probability] is the long-run average (a weighted average, not the single most likely value). Two working models: the binomial B(n, p) — recognize it by BINS (Binary outcome, Independent trials, fixed N, Same p), with E(X) = n·p and SD = √(n·p·(1−p)); and the normal model, the bell curve summarized by 68–95–99.7.

One quick worked example (compute it live):

A free-throw shooter makes 70%; she takes n = 10, shots independent → B(10, 0.7) (BINS all check).
- E(X) = n·p = 10 × 0.7 = 7 makes; SD = √(10 · 0.7 · 0.3) = √2.1 ≈ 1.45. One-line read: "about 7, give or take ~1.5."
- Complement tie-back: P(makes at least one) = 1 − P(misses all 10) = 1 − 0.3¹⁰ ≈ 0.9999941.
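A short Python sketch of the free-throw example, for anyone who wants to verify the arithmetic. It also recomputes E(X) the long way, as Σ [value × probability] over the binomial probabilities, to show that the shortcut n·p is the same weighted average. The function name binom_pmf is ours.

```python
from math import comb, sqrt

n, p = 10, 0.7   # 10 independent shots, 70% shooter


def binom_pmf(k, n, p):
    """P(X = k) for a binomial B(n, p) random variable."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


expected = n * p                     # shortcut formula
sd = sqrt(n * p * (1 - p))           # binomial standard deviation

# Expected value from the definition: sum of value x probability.
expected_from_pmf = sum(k * binom_pmf(k, n, p) for k in range(n + 1))

# Complement rule: P(at least one make) = 1 - P(misses all 10).
p_at_least_one = 1 - (1 - p) ** n

print("E(X) =", expected)
print("SD   =", round(sd, 2))
print("E(X) from the pmf =", round(expected_from_pmf, 6))
print("P(at least one make) =", round(p_at_least_one, 7))
```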

Highest-cost misconception + cure:
- ❌ "Just add the two probabilities," "expected value is the most likely value," and "any 'how many' is binomial."
✅ Cure: for OR, subtract the overlap; for AND, multiply only if independent (otherwise use P(B | A)). E(X) is a long-run average and can be a value X never takes (E = 7.5 makes is fine). And binomial needs all of BINS — drawing without replacement breaks "constant p / independent."

Quick interaction (think-pair-share, ~6 min): put four one-liners on a slide; students call the move (complement / addition / multiplication / binomial) solo (30 s), neighbor (1 min), vote by fingers. (e.g., "P(at least one defect in 5)?" → complement; "P(king or heart)?" → addition with overlap; "how many heads in 12 flips?" → binomial.)
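An answer key for two of the one-liners can be sketched with exact fractions. The card counts are the standard 52-card deck; the 2% defect rate is a hypothetical value we supply, since the slide prompt gives none, and the defect calculation assumes the five items are independent.

```python
from fractions import Fraction

DECK = 52

# Addition rule with overlap: P(king or heart).
p_king = Fraction(4, DECK)
p_heart = Fraction(13, DECK)
p_king_and_heart = Fraction(1, DECK)          # the king of hearts
p_king_or_heart = p_king + p_heart - p_king_and_heart

# Independence check: does knowing "king" change the chance of "heart"?
p_heart_given_king = p_king_and_heart / p_king
independent = p_heart_given_king == p_heart

# Complement rule: P(at least one defect in 5 items).
# Assumption: each item is defective with probability 1/50, independently.
p_defect = Fraction(1, 50)
p_at_least_one_defect = 1 - (1 - p_defect) ** 5

print("P(king or heart) =", p_king_or_heart)
print("king and heart independent?", independent)
print("P(at least one defect in 5) =", float(p_at_least_one_defect))
```

Adding 4/52 and 13/52 without subtracting the overlap would give 17/52, which counts the king of hearts twice.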

Segment 5 — Objective 5 Review: The Normal & Sampling Distributions (22 min) · Session 2 opens

Hook back in: "Session 1 we got data, described it, and learned the rules of chance. Now the bridge to inference — and it rests on two ideas: the z-score and the Central Limit Theorem."

Re-teach in plain language.
- A z-score says how many standard deviations a value sits from the mean: z = (x − μ) / σ. Positive = above the mean, negative = below. Once a value is a z-score, the normal table (or a spreadsheet) turns it into a probability / percentile — the share of the curve below it.
- The 68–95–99.7 rule is the quick version: about 68% within 1 SD, 95% within 2, 99.7% within 3.
- The Central Limit Theorem (CLT) is the engine of the back half: if you take a large enough sample, the distribution of the sample mean x̄ is approximately normal, centered at the true mean μ, with a standard error = σ / √n — even when the population itself isn't normal. Larger n → tighter sampling distribution.
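The CLT claim can be checked with a small simulation. This sketch draws from a strongly right-skewed population (an exponential distribution with mean 1 and SD 1, chosen by us for illustration) and looks at how 2,000 sample means of size n = 100 behave. The seed and the trial count are arbitrary choices.

```python
import random
from math import sqrt
from statistics import stdev

rng = random.Random(16)      # fixed seed so the run is repeatable

POP_MEAN = 1.0               # exponential(1) has mean 1 ...
POP_SD = 1.0                 # ... and standard deviation 1
N = 100                      # sample size
TRIALS = 2000                # number of repeated samples


def one_sample_mean(n):
    """Draw n values from the skewed population and return their mean."""
    return sum(rng.expovariate(1.0) for _ in range(n)) / n


sample_means = [one_sample_mean(N) for _ in range(TRIALS)]

center = sum(sample_means) / TRIALS
spread = stdev(sample_means)
predicted_se = POP_SD / sqrt(N)

print("center of the sample means:", round(center, 3))
print("spread of the sample means:", round(spread, 3))
print("sigma / sqrt(n) prediction:", predicted_se)
```

The raw draws stay skewed no matter how many are taken; it is the collection of sample means that centers on μ with spread close to σ/√n.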

One quick worked example (compute it live):

Adult resting heart rate is roughly normal, μ = 70, σ = 10 bpm.
- A reading of 85: z = (85 − 70) / 10 = 1.5 — one and a half SDs above average.
- Sample of n = 100 people: the sample mean x̄ is approximately normal with mean 70 and standard error = 10 / √100 = 1.0 — so sample means cluster far more tightly than individuals (SD 10 vs. SE 1). That shrinking spread is exactly what makes a confidence interval narrow.
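The heart-rate example translates directly into Python's standard-library NormalDist. The cutoff of 72 bpm in the last step is a hypothetical value we added to contrast individuals with sample means; everything else comes from the example.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 70, 10                       # population model from the example
individuals = NormalDist(mu, sigma)

# z-score and percentile for a single reading of 85.
z = (85 - mu) / sigma
share_below_85 = individuals.cdf(85)

# Sampling distribution of the mean for n = 100.
n = 100
se = sigma / sqrt(n)
sample_means = NormalDist(mu, se)

# Hypothetical comparison: how often is a value above 72?
p_individual_above_72 = 1 - individuals.cdf(72)
p_mean_above_72 = 1 - sample_means.cdf(72)

print("z for 85 bpm:", z)
print("share of adults below 85:", round(share_below_85, 4))
print("standard error for n = 100:", se)
print("P(individual > 72):", round(p_individual_above_72, 4))
print("P(sample mean > 72):", round(p_mean_above_72, 4))
```

An individual above 72 is unremarkable, while a sample mean above 72 is rare: the same two-point gap is 0.2 SDs for a person but 2 SEs for a mean.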

Highest-cost misconception + cure:
- ❌ "The CLT says the data becomes normal as n grows," and "standard error and standard deviation are the same thing."
✅ Cure: the CLT is about the distribution of the sample mean x̄, not the raw data — the population keeps its own shape. And the standard error = σ/√n is the SD of the sample mean; it shrinks as n grows, while the population SD stays put. "Means are calmer than individuals — and the bigger the sample, the calmer they get."

Segment 6 — Objective 6 Review: Confidence Intervals (20 min)

Re-teach in plain language. A confidence interval (CI) is an honest range for an unknown parameter, built from your sample, in the form estimate ± margin of error. The margin of error = (a critical value) × (the standard error); a 95% confidence level is the usual default.
- For a mean: x̄ ± t* · (s/√n) — we use the t-distribution (slightly wider than normal) because we estimate σ with the sample s.
- For a proportion: p̂ ± z* · √(p̂(1−p̂)/n).
- What "95% confident" actually means (the move the exam tests): if we repeated the whole sampling process many times, about 95% of the intervals we build would capture the true parameter. It is not "a 95% probability the parameter is in this interval" — the parameter is fixed; the interval is what varies.
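As a concrete instance of the proportion formula, here is a minimal sketch with hypothetical survey numbers (60 "yes" answers out of 100; these are ours, not from the course). The critical value z* is taken from the standard normal via NormalDist rather than from a table.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(successes, n, confidence=0.95):
    """Return (p_hat, margin, low, high) for a z-interval on a proportion."""
    p_hat = successes / n
    # Critical value: leaves (1 - confidence) / 2 in each tail.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = sqrt(p_hat * (1 - p_hat) / n)
    margin = z_star * se
    return p_hat, margin, p_hat - margin, p_hat + margin


# Hypothetical data: 60 of 100 sampled students say yes.
p_hat, margin, low, high = proportion_ci(60, 100)

print("p-hat:", p_hat)
print("margin of error:", round(margin, 3))
print("95% CI:", (round(low, 3), round(high, 3)))
```

Calling proportion_ci(60, 100, confidence=0.99) on the same data gives a wider interval, which is the confidence-versus-precision trade-off in miniature.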

One quick worked example (supplied numbers):

A sample of n = 100 students has mean study time x̄ = 14 hours, and the margin of error works out to ±1.0 hour at 95%.
- 95% CI: 14 ± 1.0 = (13.0, 15.0) hours. Read it (SLO B): "We're 95% confident the true average study time for all students is between 13 and 15 hours."
- Width drivers: a bigger n or a lower confidence level → narrower interval; higher confidence (99%) → wider. More confidence costs precision.
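The "confidence is in the method" idea can be shown by simulation. This sketch pretends we know the truth (a true mean of 14 hours) and builds 2,000 intervals from 2,000 fresh samples of n = 100. The population SD of 5 hours is our assumption, chosen so the margin of error lands near the ±1.0 in the example; for simplicity it uses z* = 1.96 with a known σ rather than t* with s.

```python
import random
from math import sqrt

rng = random.Random(2026)    # fixed seed so the run is repeatable

TRUE_MEAN = 14.0             # the parameter: fixed, normally unknown
SIGMA = 5.0                  # assumed population SD (hypothetical)
N = 100
Z_STAR = 1.96                # 95% critical value
TRIALS = 2000

margin = Z_STAR * SIGMA / sqrt(N)


def interval_captures_truth():
    """Draw one sample, build its 95% interval, report whether it covers."""
    sample = [rng.gauss(TRUE_MEAN, SIGMA) for _ in range(N)]
    x_bar = sum(sample) / N
    return x_bar - margin <= TRUE_MEAN <= x_bar + margin


hits = sum(interval_captures_truth() for _ in range(TRIALS))
coverage = hits / TRIALS

print("margin of error:", round(margin, 2))
print("share of intervals that captured the true mean:", coverage)
```

Each individual interval either covers 14 or misses it; the roughly 95% figure describes the long-run success rate of the procedure.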

Highest-cost misconception + cure:
- ❌ "There's a 95% chance the true mean is in this interval," and "a wider interval is always better."
✅ Cure: the confidence is in the method over many samples, not in this one interval — the parameter isn't random. And width is a trade-off: more confidence or a smaller sample widens it; you don't want it as wide as possible, you want it honest and useful. "The interval is a net we cast; 95% of nets like this catch the fish."

Segment 7 — Objectives 7 & 8 Review: Hypothesis Tests & Regression (24 min)

Re-teach Obj 7 in plain language. A hypothesis test weighs evidence against a default claim.
- H₀ (null): the status-quo / "no effect" claim (e.g., μ = 70, or the two groups are equal). Hₐ (alternative): what we suspect instead.
- We compute a test statistic and a p-value = the probability of data at least this extreme if H₀ were true. Then the one rule: compare p to the significance level α (usually 0.05).

p < α → reject H₀ (statistically significant: the data are real evidence against H₀). p ≥ α → fail to reject H₀ (not enough evidence; could be chance).
- Errors: a Type I error = rejecting a true H₀ (a false alarm, rate α); a Type II error = failing to reject a false H₀ (a miss).
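The decision rule can be sketched as a small function. For simplicity this is a two-sided z-test with a known σ, reusing the heart-rate setup (H₀: μ = 70, σ = 10, n = 100); the observed sample means of 72 and 71 are hypothetical values we supply. A t-test on real data would follow the same compare-p-to-α logic.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_z_test(x_bar, mu_0, sigma, n, alpha=0.05):
    """Return (z, p_value, decision) for H0: mu = mu_0 vs. Ha: mu != mu_0."""
    se = sigma / sqrt(n)
    z = (x_bar - mu_0) / se
    # Two-sided p-value: probability of a z at least this far from 0.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    decision = "reject H0" if p_value < alpha else "fail to reject H0"
    return z, p_value, decision


# Hypothetical sample means, tested against H0: mu = 70.
z_a, p_a, decision_a = two_sided_z_test(x_bar=72, mu_0=70, sigma=10, n=100)
z_b, p_b, decision_b = two_sided_z_test(x_bar=71, mu_0=70, sigma=10, n=100)

print("x-bar = 72:", round(z_a, 2), round(p_a, 4), decision_a)
print("x-bar = 71:", round(z_b, 2), round(p_b, 4), decision_b)
```

The second result is a "fail to reject", which says the evidence was too weak, not that μ has been shown to equal 70.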

Re-teach Obj 8 in plain language (it's a hypothesis test in disguise). The least-squares line ŷ = b₀ + b₁x summarizes a linear relationship: slope b₁ = the change in ŷ per one-unit increase in x (with units!), intercept b₀ = ŷ when x = 0, and r² = the share of the variation in y the line explains. Inference for the slope asks: is the slope really different from 0? The null is H₀: slope = 0 (a flat line), tested by the same p vs. α rule.

One quick worked example (output supplied — read it, don't compute it):

Regression of exam score on weekly study hours (data from 1–6 hours) reports ŷ = 50 + 4x, r² = 0.99, slope t = 11.8, p = 0.001; use α = 0.05.
- Slope = 4: "+4 points of score per extra hour studied." Intercept = 50: predicted score at 0 hours (borderline — flag as extrapolation). r² = 0.99: ~99% of score variation is explained by hours.
- Inference: p = 0.001 < 0.05 = α → reject H₀ → the slope is statistically significant (the relationship is real, not noise). Two cautions: it's observational (a link, not a cause) and only valid inside 1–6 hours (don't extrapolate to 40 → ŷ = 210 is nonsense).
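The extrapolation caution can be made mechanical. This sketch hard-codes the supplied line ŷ = 50 + 4x and the observed x-range of 1 to 6 hours, and returns a flag alongside each prediction; the function name predict_score is ours.

```python
B0, B1 = 50, 4          # intercept and slope from the supplied output
X_MIN, X_MAX = 1, 6     # range of study hours actually observed


def predict_score(hours):
    """Return (predicted score, True if hours lies inside the data range)."""
    y_hat = B0 + B1 * hours
    inside_range = X_MIN <= hours <= X_MAX
    return y_hat, inside_range


for hours in (3, 0, 40):
    y_hat, ok = predict_score(hours)
    note = "supported by the data" if ok else "extrapolation, do not trust"
    print(f"{hours} hours -> predicted {y_hat} ({note})")
```

The line happily returns 210 for 40 hours; only the range check reveals that the number has no data behind it.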

Highest-cost misconception + cure:
- ❌ "A small p-value proves H₀ is false / proves causation," "fail to reject means H₀ is true," and "significant means large."
✅ Cure: the p-value assumes H₀ and measures surprise, never proving cause; fail to reject means not enough evidence, not "H₀ proven true"; and significant = "probably not zero," not "big or important." For regression, a significant slope still needs an experiment (or a ruled-out lurking variable) before you claim cause.

Segment 8 — The Final Frame: What's On It & How to Prepare (15 min) · Session 2 closes (~75)

Audit-the-AI review moment (the course's recurring habit, one last time before the exam):

Paste to an approved chatbot: "A 95% confidence interval for the mean came out (13, 15). Is it true that there's a 95% probability the true mean is between 13 and 15? Also, a test gave p = 0.20 with α = 0.05 — does that prove there's no effect?"
Check it against what we taught. Chatbots often (1) endorse the "95% probability this interval" wording — wrong, the confidence is in the method over many samples — and (2) read p = 0.20 as "proves no effect" — wrong, it's failure to reject, i.e. not enough evidence, not proof of H₀. The tool drafts; you judge. Catch both and you're ready.

What's on the Final (state it plainly — put it on the closing slide):
- Coverage: cumulative over the whole course — Weeks 1–15, Objectives 1–8. Getting data; describing one variable; relating two variables; probability & random variables (binomial & normal); the normal & sampling distributions (z-scores, the CLT); confidence intervals (means & proportions); hypothesis tests (means & proportions); and simple linear regression with inference for the slope. Weighted toward the back half (Obj 5–8) since the midterm already covered 1–4 — but the early objectives are tools the later ones use, so they're fair game.
- Weight & logistics: the Final is 30% of the course grade. The window opens Mon Dec 14 and the exam is due Fri Dec 18, 11:59 p.m. (end of finals). (There is no Quiz 16, no Discussion 16, and no Assignment 16 — the Final replaces all of them.)
- Format: a mix of classify / compute / interpret items in the spirit of the worked examples above — choosing the honest summary, reading a correlation, applying a probability rule, computing a z-score, reading a confidence interval, deciding a test by p vs. α, and interpreting a regression line.

The preparation plan (point at each artifact by name):
1. Study Guide — work it first; it's the checklist of every move across the eight objectives.
2. Exam-Prep Tutorial — run it with an approved chatbot (Gemini / Claude / ChatGPT) and submit the share link; it drills your weak spots adaptively.
3. Practice Final — sit it timed, like the real thing, then review every miss against the Study Guide.

Callback + send-off:
- Callback: "Every item on the Final is a move you already made this term — Week 1 you learned to interrogate a number before believing it, and that instinct runs through all eight objectives: get good data, describe it honestly, relate it carefully, model the chance, and use a sample to make a confident, tested claim about a population you never fully measured."
- Send-off: "You don't need to cram everything — you need the eight honest moves and the mistake that sinks each one. Work the Study Guide, run the Exam-Prep Tutorial, take the Practice Final, then sit the Final. You've built every one of these skills. Go show them."

Hand-off (the week's work): review the Study Guide, run the Exam-Prep Tutorial (submit the share link), take the Practice Final, and sit the comprehensive Final (window opens Mon Dec 14; due Fri Dec 18). No quiz, discussion, or assignment this week — the Final is the whole grade for the module.

Instructor FAQ — Common Stumbles (Final-Review Week)

Student says / does	Quick cure
"Which do I report — mean or median?"	Look at the shape first. Skew or an outlier → median + IQR. Roughly symmetric, no outlier → mean + SD. The picture decides.
Calls a strong correlation "proof" of cause.	Ask: was anything randomly assigned? No → a link, not a cause. Hunt the lurking variable. (Same warning returns in regression.)
"The CLT means my data turns normal if n is big."	No — the sample mean x̄ becomes approximately normal; the raw population keeps its shape. And the spread of x̄ is the standard error σ/√n, which shrinks with n.
Confuses standard error with standard deviation.	SD describes the spread of individuals; SE = σ/√n describes the spread of the sample mean and gets smaller as n grows. Means are calmer than individuals.
"There's a 95% chance the true mean is in this interval."	The confidence is in the method over many samples: ~95% of intervals built this way capture the parameter. This interval either does or doesn't — the parameter is fixed.
"A wider confidence interval is better."	It's a trade-off. More confidence (99%) or a smaller sample → wider; bigger n → narrower. Aim for honest and useful, not maximally wide.
"p ≥ α proves H₀ is true."	Fail to reject ≠ proven true. It means not enough evidence to reject — H₀ survives, unproven. Only "reject" is a strong conclusion.
Reads "statistically significant" as "large / important."	Significant = probably not zero (p < α). A tiny effect can be significant with a big sample; a big effect non-significant with a small one.
"r² is the slope."	Different jobs: slope = per-unit change in ŷ (carries units); r² = unitless share of variation explained (0–1). Slope = how much; r² = how well.
Plugs a far-out x into a regression line (40 hrs → 210).	Extrapolation. The line is trustworthy only inside the data's x-range (~1–6 hrs); outside it the prediction is unsupported (and 210 on a 100-pt exam is impossible).
Panics that the Final is "literally everything."	It's the eight honest moves, not a thousand facts. The back half (Obj 5–8) leans heaviest; the early objectives are the tools the later ones use. Work the Study Guide → Exam-Prep Tutorial → Practice Final, in that order.

Scope flag

This outline is pure review of Objectives 1–8 — no new material. The framing extras (the "describe → model → infer" three-act map, the recurring Literary Digest / coffee-and-grades / study-hours callbacks, and the BINS mnemonic) are retained context carried over from the term because they make the cures stick; cut them for a leaner 60-minute review. The Final and its bundle (Study Guide, Exam-Prep Tutorial, Practice Final) are built separately and only referenced here by name. No quiz, discussion, or assignment is built for Week 16 — by the course spine, discussions run every week except W16, and exam weeks replace the quiz and assignment with the exam; the comprehensive Final is the module's only graded item.

~ Prof. Rivera's edition · Fall 2026 · built with thecoursemaker.com