Week 6 — Lecture Outline · Random Variables

Introduction to Statistics · MATH 11 · Fall 2026 · Prof. Rivera · Fictional sample

Course: Introduction to Statistics (MATH 11) · Silver Oak University (fictional sample) · Prof. Rivera
Objectives covered: Objective 4 — Work with random variables: distinguish discrete from continuous, read a probability distribution, and compute the expected value and variance of a discrete random variable.
SLOs touched: A (reason quantitatively from data) · B (communicate results to a non-technical audience)
Meeting pattern: 2 sessions × 75 min = 150 min. Segment minutes below total ~150; scale to your own pattern.


Week at a Glance

The week's big question: "If chance hands you a number (a payout, a count, a wait time), what should you expect on average, and how much will it bounce around?"
By the end of the week, students can: (1) tell a discrete random variable from a continuous one, and give an example of each; (2) read and check a discrete probability distribution (the probabilities are each between 0 and 1 and add to exactly 1); (3) compute the expected value E(X) of a discrete RV and say what it means in plain words; (4) compute the variance and standard deviation of a discrete RV and explain what the SD tells you about the spread of outcomes.
Key vocabulary: random variable, discrete vs. continuous, probability distribution (probability mass function), valid distribution (0 ≤ P ≤ 1, ΣP = 1), expected value E(X) / mean μ of a random variable, weighted average, variance Var(X) / σ², standard deviation σ, E(X²), long-run average, fair game / expected value of a bet
Materials: slides (Deck 6), the week's readings + video links, a spreadsheet (Google Sheets or Excel), one approved chatbot (Gemini / Claude / ChatGPT) for the AI-critique moment and the tutorial
Timing note: 8 segments, ~150 min total. Session 1 = Segments 1–4 (~75 min). Session 2 = Segments 5–8 (~75 min).

Segment 1 — Hook & the Promise (8 min) · Session 1 opens

Hook. Hold up an imaginary scratch-off ticket. "This ticket costs $2. One in ten wins $10; the rest win nothing. Quick gut check — good deal or bad deal?" Take a show of hands. Most rooms split.
- "Here's the thing: 'good deal' isn't an opinion this week — it's a number we can compute. By the end of class you'll be able to put a single dollar figure on a bet, a warranty, or an insurance policy and say, with arithmetic, whether the house is winning."
- "Last week we learned the rules of probability — how likely each thing is. This week we attach a number to each outcome and ask the grown-up question: what should I expect on average, and how wild is the ride?"

The promise (write it on the board): "By the end of this week you can take any situation where chance produces a number — a payout, a count of defects, a number of no-shows — and report two things: the expected value (what it averages to in the long run) and the standard deviation (how much it swings), and use them to make a real decision."

Why it matters line (memory hook): "A random variable is chance wearing a number. Expected value is what you'd average if you played forever; standard deviation is how bumpy the ride is along the way."


Segment 2 — Random Variables: Discrete vs. Continuous (22 min)

Plain language first. A random variable is just a rule that attaches a number to the outcome of a random process. Flip three coins → the number of heads is a random variable. Drive to campus → your commute time in minutes is a random variable. We write it with a capital letter, usually X; a particular value it takes is a lowercase x.
- The whole point of a number (instead of a label like "heads/tails") is that now we can average it, add it, and measure its spread — exactly the Objective-2 machinery from earlier, now aimed at outcomes of chance.

The one split that organizes the whole week — two families:
- Discrete random variable: the possible values are separate, countable — you can list them, often whole numbers. Number of heads in 3 flips (0, 1, 2, 3). Number of defective phones in a box. Number of students who show up. The roll of a die. You count it.
- Continuous random variable: the possible values fill an entire interval — any value in a range, limited only by how precisely you can measure. Exact height. Exact weight. A commute time of 14.37… minutes. The amount of soda a machine pours. You measure it.

Memory hook (put it on a slide):

Discrete = you COUNT it (separate steps, often whole numbers). Continuous = you MEASURE it (any value on a ruler). "Count vs. ruler."

One fully worked example (classify and justify each):

Sort these six random variables into discrete or continuous:
- The number of text messages you send today → discrete (0, 1, 2, …; you count them).
- Your exact body temperature tonight → continuous (98.6, 98.63, …; any value on a range).
- The number of cars that pass a corner in an hour → discrete (a count).
- The time until the next bus arrives → continuous (measured, any value ≥ 0).
- A student's shoe size reported as 8, 8.5, 9, … → discrete (separate listed steps, even with the half-sizes).
- The weight of an apple → continuous (measured).

The "test" to give students: Could you, in principle, list the possible values one by one (even if the list is long)? If yes → discrete. Or do the values fill a whole interval on a ruler with no gaps? If yes → continuous. Are you counting, or measuring?

Scope note for the class: this week we compute with discrete RVs (we can list outcomes and their probabilities). Continuous RVs — heights, the normal curve — get their own machinery in Week 9; today we just learn to recognize them.


Segment 3 — The Probability Distribution of a Discrete RV (25 min)

Plain language first. A probability distribution for a discrete random variable is just a table (or rule) that pairs each value the variable can take with how likely that value is. It's the complete "map" of the random variable — every outcome and its probability.

The two rules that make a distribution valid (the gatekeeper for the whole week):
1. Every probability is between 0 and 1 (inclusive). No negative chances; nothing more than certain.
2. The probabilities add up to exactly 1. Something has to happen, so the chances of all the outcomes must total 100%.

If either rule fails, it is not a probability distribution — full stop. This is the first thing to check before computing anything.

One fully worked example (build and check a distribution).

Let X = the number of heads when you flip a fair coin twice. The equally likely outcomes are HH, HT, TH, TT.
- 0 heads: only TT → P(X=0) = 1/4 = 0.25
- 1 head: HT or TH → P(X=1) = 2/4 = 0.50
- 2 heads: only HH → P(X=2) = 1/4 = 0.25

x (heads) | 0    | 1    | 2
P(X = x)  | 0.25 | 0.50 | 0.25
- Check rule 1: every entry is between 0 and 1. ✓
- Check rule 2: 0.25 + 0.50 + 0.25 = 1.00. ✓ It's a valid distribution.
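
Optional code check. For instructors who like to demo in code alongside the spreadsheet, the two gates can be sketched in a few lines of Python. This is a minimal sketch: the function name is_valid_distribution and the small rounding tolerance are illustrative choices, not part of the course materials.

```python
import math

def is_valid_distribution(probs, tol=1e-9):
    """Return True if every probability is in [0, 1] and they sum to 1."""
    probs = list(probs)
    each_in_range = all(0 <= p <= 1 for p in probs)               # rule 1
    sums_to_one = math.isclose(sum(probs), 1.0, abs_tol=tol)      # rule 2
    return each_in_range and sums_to_one

# x -> P(X = x) for the two-coin example above
two_flips = {0: 0.25, 1: 0.50, 2: 0.25}

print(is_valid_distribution(two_flips.values()))  # True
print(is_valid_distribution([0.2, 0.5, 0.2]))     # False: sums to 0.9
print(is_valid_distribution([1.2, -0.2]))         # False: entries outside [0, 1]
```

The tolerance is there only because decimal probabilities such as 0.1 are stored approximately by the computer; the rule itself is still "adds to exactly 1."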

A "find the missing probability" move (students love this one).

A distribution lists P = 0.2, 0.5, 0.1, and one blank. Since they must total 1, the blank is 1 − (0.2 + 0.5 + 0.1) = 1 − 0.8 = 0.2. "The last probability is whatever makes the column add to 1."
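
The same move as a short Python sketch, assuming exactly one probability is blank. The function name missing_probability is hypothetical, and the rounding is only there to hide floating-point noise.

```python
def missing_probability(known_probs):
    """Return the one probability that makes the column total 1."""
    missing = 1 - sum(known_probs)
    if missing < -1e-9:
        # The listed probabilities already exceed 1, so no value can fix it.
        raise ValueError("known probabilities add to more than 1")
    return round(missing, 10)

blank = missing_probability([0.2, 0.5, 0.1])
print(blank)  # 0.2
```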

Misconceptions + cures.
- ❌ "Any table of numbers is a probability distribution."
Cure: only if it passes both gates — each P in [0, 1] and ΣP = 1. A table with a probability of 1.2, or one that sums to 0.9, is disqualified.
- ❌ "The x-values have to add to 1."
Cure: no — the probabilities (the bottom row) add to 1. The x-values are just the outcomes; they can be anything (−5, 0, 10, …).
- ❌ "A probability of 0 means I made a mistake."
Cure: 0 is a legal probability — it just means that outcome never happens. (And 1 is legal too: a sure thing.)


Segment 4 — Expected Value E(X) (22 min) · Session 1 closes (~75)

Plain language first — what "expected value" really means. The expected value of a discrete random variable, written E(X) (or the mean μ), is the long-run average value of X if you repeated the random process a huge number of times. It is not necessarily a value X can actually take — it's the balance point of the distribution, a weighted average where each outcome is weighted by its probability.
- Plain version: "Multiply each outcome by its chance, then add those products. That's what it averages to in the long run."
- Formula (notation comes after the idea): E(X) = Σ [ x · P(X = x) ] — "for every value, value times its probability, all summed."

One fully worked example (do every step out loud — keep the numbers friendly).

A small distribution. Let X take the values 0, 1, 2, 3 with these probabilities:

x        | 0   | 1   | 2   | 3
P(X = x) | 0.1 | 0.3 | 0.4 | 0.2

(First, the gate: 0.1 + 0.3 + 0.4 + 0.2 = 1.00 ✓ — valid, so we may proceed.)
Multiply each value by its probability, then add:
- 0 × 0.1 = 0.0
- 1 × 0.3 = 0.3
- 2 × 0.4 = 0.8
- 3 × 0.2 = 0.6
- E(X) = 0.0 + 0.3 + 0.8 + 0.6 = 1.7

Say it in words: "Over many, many repetitions, X averages 1.7." Notice 1.7 isn't one of the listed values — and that's fine. Expected value is a long-run average, not a prediction of the next single outcome. (We reuse this exact distribution in Segment 5 to find the variance — keep it on the board.)
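
A minimal Python sketch of the same computation, for anyone who wants to see "value times probability, all summed" as code. The dictionary layout (x mapped to P(X = x)) and the function name expected_value are illustrative choices.

```python
def expected_value(dist):
    """E(X) = sum of x * P(X = x) over every value x."""
    return sum(x * p for x, p in dist.items())

# x -> P(X = x), the distribution from the worked example
dist = {0: 0.1, 1: 0.3, 2: 0.4, 3: 0.2}

for x, p in dist.items():
    print(f"{x} x {p} = {x * p:.1f}")              # the four products
print(f"E(X) = {expected_value(dist):.1f}")        # E(X) = 1.7
```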

The fair-die sanity check (do it fast).

Roll one fair die: X = 1, 2, 3, 4, 5, 6, each with probability 1/6.
E(X) = (1 + 2 + 3 + 4 + 5 + 6) × (1/6) = 21 × (1/6) = 21/6 = 3.5.
"You can never roll a 3.5 — yet 3.5 is exactly the long-run average. That's expected value in one breath."

Memory hook: "Expected value = each outcome times its chance, all added up. It's where the distribution balances — not what you'll get next time."
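
To make "long-run average" concrete for the fair-die check, here is a small Python sketch that computes E(X) exactly with fractions and then simulates many rolls. The seed and the number of rolls are arbitrary choices for a repeatable demo.

```python
import random
from fractions import Fraction

faces = [1, 2, 3, 4, 5, 6]

# Exact expected value: each face times its probability 1/6, all added.
exact = sum(Fraction(face, 6) for face in faces)
print(exact)  # 7/2, which is 3.5

# Simulated long-run average over many rolls.
rng = random.Random(11)      # fixed seed so the demo repeats
n_rolls = 100_000
average = sum(rng.choice(faces) for _ in range(n_rolls)) / n_rolls
print(round(average, 2))     # lands close to 3.5, though no single roll is 3.5
```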

Misconception + cure.
- ❌ "The expected value is the most likely outcome."
Cure: no — that's the mode. E(X) is the weighted average. On the die, every face is equally likely yet E(X) = 3.5, which can't even occur. Most-likely ≠ expected.


Segment 5 — Variance & Standard Deviation of a Random Variable (25 min) · Session 2 opens

Hook back in: "Last session: E(X) tells you what X averages to. But two bets can share the same expected value and feel completely different — one steady, one a roller coaster. Today: the number that measures the roller coaster."

Plain language first. Just like a dataset (Week 3), a random variable has a spread around its center. The variance Var(X) (also σ²) is the probability-weighted average of the squared distances from the mean; the standard deviation σ is its square root, which puts us back in the variable's own units. Big σ = outcomes swing far from E(X); small σ = outcomes hug the mean.

The compute-friendly route (chunk it — don't dump a scary formula). There are two equivalent ways; we'll use the shortcut that's easiest by hand:
1. Find E(X) (we already have it).
2. Find E(X²) — the same weighted-average move, but square each x first: E(X²) = Σ [ x² · P(X = x) ].
3. Variance = E(X²) − [E(X)]². (In words: "the mean of the squares minus the square of the mean.")
4. Standard deviation = √Variance.

One fully worked example (reuse Segment 4's distribution — every step shown).

Same X: values 0, 1, 2, 3 with P = 0.1, 0.3, 0.4, 0.2, and we already found E(X) = 1.7.
Step 2 — E(X²): square each value, weight by its probability, add.
- 0² × 0.1 = 0 × 0.1 = 0.0
- 1² × 0.3 = 1 × 0.3 = 0.3
- 2² × 0.4 = 4 × 0.4 = 1.6
- 3² × 0.2 = 9 × 0.2 = 1.8
- E(X²) = 0.0 + 0.3 + 1.6 + 1.8 = 3.7

Step 3 — Variance: Var(X) = E(X²) − [E(X)]² = 3.7 − (1.7)² = 3.7 − 2.89 = 0.81.
Step 4 — Standard deviation: σ = √0.81 = 0.9.
Say it in words: "X averages 1.7, and a typical outcome sits about 0.9 away from that average." That sentence — center and typical swing — is the whole point of the week.

The alternative "definition" route (mention, don't drill). You can also compute variance directly as Σ [ (x − μ)² · P(X = x) ] — the average squared distance from the mean. It gives the same 0.81 here; the E(X²) − [E(X)]² shortcut is just less arithmetic. (Worth showing once on the board for the curious: deviations −1.7, −0.7, +0.3, +1.3, squared and weighted, also total 0.81.)
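
Both routes can be checked side by side in a short Python sketch. The variable names are illustrative, and the rounding only tidies floating-point noise.

```python
import math

# x -> P(X = x), the Segment 4 distribution
dist = {0: 0.1, 1: 0.3, 2: 0.4, 3: 0.2}

mean = sum(x * p for x, p in dist.items())                 # E(X)
mean_of_squares = sum(x**2 * p for x, p in dist.items())   # E(X^2)

# Shortcut route: mean of the squares minus the square of the mean.
var_shortcut = mean_of_squares - mean**2
# Definition route: probability-weighted average squared distance from the mean.
var_definition = sum((x - mean) ** 2 * p for x, p in dist.items())
sd = math.sqrt(var_shortcut)

print(round(mean_of_squares, 4))   # 3.7
print(round(var_shortcut, 4))      # 0.81
print(round(var_definition, 4))    # 0.81
print(round(sd, 4))                # 0.9
```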

Memory hook: "Mean of the squares minus the square of the mean. Then take the square root to get back to real units."

Misconception + cure.
- ❌ "Standard deviation tells you the center of a random variable."
Cure: no — E(X) is the center; σ is the spread around it. Two random variables can have the same E(X) and wildly different σ (a $5-or-nothing bet vs. a $500-or-nothing bet can share an expected value but not a standard deviation).
- ❌ "Forgot to subtract [E(X)]²." (The most common arithmetic slip.)
Cure: E(X²) alone is not the variance — you must subtract the square of the mean. Write μ² on the board before computing, so it's waiting to be subtracted.


Segment 6 — Expected Value as a Decision Tool: Is It a Good Deal? (18 min)

Plain language: here's where the week pays for itself. Whenever a choice has a random dollar outcome — a bet, a lottery ticket, an insurance policy, an extended warranty — you decide whether it's "worth it" by computing the expected value of the net gain. Positive E = it favors you on average; negative E = it favors the other side. Casinos, lotteries, and warranty sellers all stay in business because the expected value is on their side.

One fully worked example — back to the hook (the scratch-off).

The ticket costs $2. With probability 0.1 you win $10; with probability 0.9 you win $0. What's the expected value to you?
Work in net dollars (what you walk away with after paying the $2):
- Win: net = $10 − $2 = +$8, with probability 0.1.
- Lose: net = $0 − $2 = −$2, with probability 0.9.

Net outcome ($) | +8  | −2
Probability     | 0.1 | 0.9

E(net) = (8)(0.1) + (−2)(0.9) = 0.8 − 1.8 = −$1.00.
Plain-language verdict (SLO B): "On average you lose about a dollar every time you play. Fun for a buck, maybe — but as a money decision, it's a bad deal. Play it 1,000 times and you'd expect to be down about $1,000."
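
The scratch-off arithmetic as a minimal Python sketch. The function name expected_net and its parameters are hypothetical; it simply subtracts the ticket price from each payout before weighting by probability.

```python
def expected_net(price, prizes):
    """prizes maps each payout to its probability; returns E(payout - price)."""
    return sum((payout - price) * p for payout, p in prizes.items())

scratch_off = {10: 0.1, 0: 0.9}    # payout in dollars -> probability
e_net = expected_net(price=2, prizes=scratch_off)

print(f"E(net) per ticket: {e_net:+.2f}")                  # -1.00
print(f"Expected total over 1,000 plays: {1000 * e_net:+.0f}")  # -1000
```

Changing the price or the prize table lets students test their own "good deal?" guesses against the arithmetic.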

The mirror image — why insurance can still be "worth it."

Insurance almost always has a negative expected value for you (that's the company's profit). Yet people rightly buy it — because a rare, catastrophic loss (a totaled car, a hospital bill) carries a standard deviation you can't survive. Expected value isn't the only thing that matters; sometimes you pay a small expected loss to crush a huge variance. This is exactly why E(X) and σ are a pair — the average and the risk.

Memory hook: "To judge a bet, compute the expected value of the net. Negative means the house wins on average — and it almost always does."

Quick mini-debate (genuinely arguable, ~4 min): "An extended warranty on a $400 phone costs $60 and pays out $400 only if the phone dies in two years (say a 1-in-10 chance). Good deal?" Have students compute E(net) = (340)(0.1) + (−60)(0.9) = 34 − 54 = −$20 and then argue: is the peace of mind worth a $20 expected loss? Surface that the answer depends on how much a $400 hit would hurt you — variance, not just expectation.
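
To seed the debate with numbers, the sketch below computes the mean and standard deviation for the warranty bet and for the buyer's overall position. Two simplifying assumptions are added here for illustration and are not in the scenario as stated: without the warranty a dead phone costs you the full $400, and with it you pay $60 and nothing more.

```python
import math

def mean_and_sd(dist):
    """Return (E(X), sigma) for a dict mapping each outcome to its probability."""
    mean = sum(x * p for x, p in dist.items())
    var = sum(x**2 * p for x, p in dist.items()) - mean**2
    return mean, math.sqrt(max(var, 0.0))   # guard against tiny negative rounding

scenarios = {
    "warranty as a bet": {340: 0.1, -60: 0.9},   # net outcomes from the mini-debate
    "no warranty": {-400: 0.1, 0: 0.9},          # assumption: you replace the phone yourself
    "with warranty": {-60: 1.0},                 # assumption: $60 paid, nothing else at risk
}

results = {label: mean_and_sd(dist) for label, dist in scenarios.items()}
for label, (mean, sd) in results.items():
    print(f"{label}: mean {mean:+.0f}, SD {sd:.0f}")
```

Under these assumptions the warranty costs $20 more on average (an expected −$60 versus −$40) but takes the swing from about $120 down to zero, which is the trade the debate is about.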


Segment 7 — Putting It Together: A Full Random-Variable Summary (20 min)

Plain language first. A complete description of a discrete random variable answers, in order: Is it discrete? → Is the distribution valid? → What's the center E(X)? → What's the spread σ? — the same describe-it discipline as Objective 2, now for chance.

One fully worked example (a small real-feeling scenario, every step shown):

A campus food cart sells a "mystery box." The number of items, X, in a box and its probabilities:

x (items) | 1   | 2   | 3
P(X = x)  | 0.5 | 0.3 | 0.2
1. Discrete? Yes: you count items (1, 2, 3). ✓
2. Valid distribution? Each P in [0, 1], and 0.5 + 0.3 + 0.2 = 1.00. ✓
3. Center, E(X): (1)(0.5) + (2)(0.3) + (3)(0.2) = 0.5 + 0.6 + 0.6 = 1.7 items.
4. Spread, σ: E(X²) = (1)(0.5) + (4)(0.3) + (9)(0.2) = 0.5 + 1.2 + 1.8 = 3.5; Var = 3.5 − (1.7)² = 3.5 − 2.89 = 0.61; σ = √0.61 ≈ 0.78 items.
Report it like a human (SLO B): "A mystery box holds about 1.7 items on average, give or take roughly 0.8, so every box has one to three items and the long-run average sits a bit under two." One honest sentence: center and swing, in plain words.
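
The valid, center, and spread beats can be bundled into one small Python sketch (the "discrete?" beat is a judgment call, so it stays with the human). The function name summarize is hypothetical, and the input is assumed to be a dictionary mapping each value x to P(X = x).

```python
import math

def summarize(dist):
    """Check validity, then return (E(X), Var(X), sigma) for a discrete RV."""
    probs = list(dist.values())
    valid = all(0 <= p <= 1 for p in probs) and math.isclose(sum(probs), 1.0)
    if not valid:
        raise ValueError("not a valid probability distribution")
    mean = sum(x * p for x, p in dist.items())
    var = sum(x**2 * p for x, p in dist.items()) - mean**2
    return mean, var, math.sqrt(var)

mystery_box = {1: 0.5, 2: 0.3, 3: 0.2}   # items -> probability
mean, var, sd = summarize(mystery_box)
print(f"E(X) = {mean:.2f}, Var(X) = {var:.2f}, SD = {sd:.2f}")
# E(X) = 1.70, Var(X) = 0.61, SD = 0.78
```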

Memory hook: "Describe a random variable in four beats: discrete?, valid?, E(X)?, σ? — center and spread travel together, just like Week 3."

Callback: point back to Week 3 — E(X) is the mean and σ the standard deviation, the same two ideas as for a dataset, now computed from probabilities instead of a list of data. And back to Week 5: those probabilities came from the rules we learned last week. The course is stacking.


Segment 8 — Technology Workflow + AI-Critique, Callback & Hand-off (12 min) · Session 2 closes (~75)

Technology workflow — E(X), Var(X), and σ in a spreadsheet (exact steps):
1. Put the values x in column A (say A2:A5) and the probabilities P in column B (B2:B5).
2. In a spare cell, check the distribution: =SUM(B2:B5) — it must equal 1. (If it doesn't, stop — it's not a valid distribution.)
3. Expected value: =SUMPRODUCT(A2:A5, B2:B5) → E(X). (SUMPRODUCT multiplies each x by its P and adds: the by-hand formula in one cell.)
4. E(X²): =SUMPRODUCT(A2:A5^2, B2:B5) (in Google Sheets you may need =SUMPRODUCT(A2:A5*A2:A5, B2:B5)) → the mean of the squares.
5. Variance: subtract the square of the mean from the mean of the squares. For example, if E(X) is in D2 and E(X²) is in D3 (any spare cells will do), enter =D3-D2^2. Standard deviation: =SQRT( the variance cell ).
- Google Sheets and Excel use the same function names. For the Segment-4 distribution you should get E(X) = 1.7, E(X²) = 3.7, Var = 0.81, σ = 0.9 — verify the cells match the board.

AI-critique moment (students verify, not consume):

Paste this to an approved chatbot: "X takes values 0, 1, 2, 3 with probabilities 0.1, 0.3, 0.4, 0.2. Find the expected value and the variance."
Then audit the answer. Chatbots usually nail E(X) = 1.7, but the variance is where they slip: watch for one that reports 3.7 (that's E(X²) — it forgot to subtract [E(X)]²), or that divides by n as if it were a data list (there is no n here — outcomes are weighted by probability, not counted). The correct variance is 3.7 − 2.89 = 0.81, σ = 0.9. Your job all term: the tool drafts, you check the step it loves to skip.
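
For the audit, it helps to know what each slip looks like numerically. The Python sketch below reproduces the correct variance and the two wrong answers named above; statistics.pvariance is the standard-library "divide by n" variance for a data list, and the variable names are illustrative.

```python
import statistics

values = [0, 1, 2, 3]
probs = [0.1, 0.3, 0.4, 0.2]

mean = sum(x * p for x, p in zip(values, probs))               # E(X)
mean_of_squares = sum(x**2 * p for x, p in zip(values, probs)) # E(X^2)

correct = mean_of_squares - mean**2              # the variance of X
forgot_to_subtract = mean_of_squares             # slip 1: reports E(X^2)
treated_as_data = statistics.pvariance(values)   # slip 2: ignores the probabilities

print(round(correct, 4))             # 0.81
print(round(forgot_to_subtract, 4))  # 3.7
print(treated_as_data)               # 1.25
```

If a chatbot's variance matches the second or third number, students can name the exact step it skipped.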

Callback + tease:
- Callback: "Week 5 gave us the probability rules; this week we attached a number to chance and learned to report its average (E(X)) and its swing (σ) — and to price a bet."
- Tease next week: "We've done random variables in general. Week 7 meets the two that run the rest of the course: the binomial (counts of successes — number of heads, number of defects) and the normal model. Today's E(X) and σ are about to get famous shortcuts."

Hand-off (the week's graded work):
- Lecture Tutorial 6 (AI tutor, share-link submission) — discrete vs. continuous, valid distributions, E(X), and variance/SD of a random variable.
- Quiz 6; Discussion 6 (adaptive: an AI dialogue you summarize and post, on the prompt "Is this game / lottery / insurance / warranty a good deal? Reason with expected value."); and Assignment 6.


Instructor FAQ — Common Stumbles

Student says / does | Quick cure
"Is shoe size discrete or continuous? It has decimals." | Decimals don't decide it; listability does. Shoe sizes come in separate listed steps (8, 8.5, 9), so you can count them → discrete. A true measurement (exact foot length) fills a whole interval → continuous. Ask: count or ruler?
Treats a table that sums to 0.9 (or has a P of 1.3) as a distribution. | It isn't one. Both gates must pass: every P in [0, 1] and the column sums to exactly 1. Check the two rules before computing anything.
Adds the x-values to 1 instead of the probabilities. | The bottom row (the probabilities) must total 1. The x-values are just the outcomes; they can be any numbers.
Thinks E(X) must be a value X can actually take. | It's a long-run weighted average, not necessarily a possible outcome. A fair die has E(X) = 3.5, which you can never roll. "Expected" means averaged, not predicted.
Confuses expected value with the most likely outcome. | Most-likely = the mode; expected value = the probability-weighted average. They're often different (every die face is equally likely, yet E(X) = 3.5).
Reports E(X²) as the variance. | Variance = E(X²) − [E(X)]². You must subtract the square of the mean. Write μ² on the board first so you don't forget to subtract it.
Divides by n (or n−1) to get the variance, like a data set. | There's no n for a random variable; outcomes are weighted by their probabilities, not counted. Use Σ[x²·P(X = x)] − μ², never "÷ n".
Reports the variance as the final spread (units look wrong). | Variance is in squared units. Take the square root to get the standard deviation, which is in the variable's real units and is what you report.
"Insurance has negative expected value, so nobody should buy it." | Expected value isn't everything: a rare catastrophic loss has a huge standard deviation. People pay a small expected loss to remove a ruinous risk. E(X) and σ are a pair.

Scope flag

This outline stays within Objective 4 for the discrete case — recognizing, validating, and computing E(X), Var(X), and σ for a listable random variable. Continuous random variables are only named and recognized here; their probabilities-as-areas machinery (density curves, the normal model) lives in Weeks 7 and 9. The decision/expected-value-of-a-bet material (Segment 6) and the E(X²) − [E(X)]² shortcut are kept because they make the ideas stick and power the week's discussion and assignment; trim Segment 6 to a single example for a leaner 60-minute version.

~ Prof. Rivera's edition · Fall 2026 · built with thecoursemaker.com