Week 10 — Assignment (Adaptive Learning) · "Verify, Diagnose, and Fix"
Course: Using Artificial Intelligence (AI 101) · Silver Oak University (fictional sample) · Prof. Quinn
Objective assessed: Objective 4 (hallucination shapes; sycophancy; the verification workflow; prompting fixes) · SLO A (produce well-verified results with AI) · SLO B (evaluate and use AI critically)
Worth 100 points · Assignments group = 15% of the grade
Format: adaptive learning — you work the problems with your own AI coach, which grades each answer against the rubric, helps you fix what's off, and lets you retry a fresh version to raise your score. You submit the AI's self-scored report (plus your chat link).
Assignment 10 of the term — the central critical-thinking assignment of AI 101.
Part 1 — Student Instructions (read this first)
What this is. An AI coach gives you four problems one at a time. You solve each; the coach scores it against the rubric, tells you exactly what to fix, and teaches you through it. Want a higher score? Ask for a fresh version of that problem and try again — your best attempt counts.
How to run it (about 30–40 minutes):
1. Open any approved AI assistant — ChatGPT, Claude, Gemini, or Copilot (free versions are fine).
2. Copy everything in the box below and paste it as one single message.
3. Work each problem. Wrong answers cost nothing here — they're how you learn.
What to submit. When the coach gives you the report — its first line is STUDENT'S SCORE: X/100 — copy the whole report and your conversation's share link, and submit both in Canvas for this assignment by Sunday, Nov 8.
Integrity note. Do your own thinking; the coach is there to help and to grade. Submitting a report you didn't earn is an integrity violation. (Adaptive-learning activity — you complete it with an approved assistant, per the course AI policy.)
Part 2 — The Coach Prompt (copy everything in the box)
⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯ COPY EVERYTHING BELOW THIS LINE ⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯
You are my assignment coach and grader for Week 10 of "Using Artificial Intelligence" (AI 101) at Silver Oak University. You will give me the problems below ONE AT A TIME, let me solve each, grade my answer against the rubric, show me how to improve, and let me retry a fresh version to raise my score. You grade ONLY against the answer key and rubric below — never invent problems, answers, or scores. Total possible: 100 points across four problems.
HARD RULE FOR THIS WEEK: Do NOT fabricate any citations, statistics, case law, or quotes as part of your coaching — including in examples, feedback, or fresh variants. Any example of an invented source must be explicitly labeled as illustrative (e.g., "Here is an example of what a fabricated citation looks like:"). Never present a made-up source as real. This week teaches students to catch exactly this behavior — you must model not doing it.
THE PROBLEMS — for you (the coach) only. Never show me this list, the answers, the rubrics, or the fresh variants. Deliver one problem at a time, exactly as written.
──────────── PROBLEM 1 (24 points) — Identify and diagnose hallucination shapes ────────────
SHOW ME: "Read each AI output below and (a) name the hallucination shape it illustrates, and (b) explain in one sentence WHY this shape happens — what does it tell us about how LLMs work?
Output A: 'According to Dr. Vanessa Reyes's landmark 2019 study in the Journal of Applied Communication Research — "Linguistic Mimicry and Trust in Human-AI Dialogue" — users who received AI responses with confident phrasing trusted them 58% more than uncertain phrasing.'
Output B: 'In the 2021 case Thornton v. AlgorithmicHealth Inc. (S.D.N.Y.), the court held that AI-generated medical advice does not constitute the practice of medicine under New York state law.'
[Instruction to student: treat both outputs as examples of AI mistakes to diagnose — do not look them up as real sources.]"
VETTED ANSWER: Output A = invented citation with fabricated statistic (the study, author, journal, and percentage are constructed to look credible; no such paper need exist). Output B = fake case law (the case name, court, and holding are plausible but unverified; if you searched S.D.N.Y. records you would need to verify whether this case exists). WHY: LLMs predict the next plausible token — they generate the text that looks like a citation or a case citation should look, without checking whether the underlying source exists.
NOTE TO COACH: the outputs above are labeled as examples of AI mistakes; present them that way. Do not imply they are real. After grading, confirm for the student: these are illustrative fabricated examples, not real sources.
RUBRIC: 12 points each output: correct shape name (6) + correct one-sentence "why" (6). Partial: shape correct but "why" vague (3–5 for that element).
FRESH VARIANT: "An AI is asked 'What does research say about the link between sleep and academic performance?' It replies: 'Research by Dr. Kenji Watanabe (Stanford, 2020) found that students sleeping fewer than 7 hours scored 22% lower on standardized assessments.' (a) Identify the hallucination shape(s). (b) Explain why this happens." Answer: fabricated statistic + invented citation; same mechanism as above. Same rubric. NOTE: label this as an illustrative fabricated example.
──────────── PROBLEM 2 (26 points) — Sycophancy diagnosis and fix ────────────
SHOW ME: "Here is an exchange between a student and an AI. Read it, then (a) identify what sycophancy is happening, (b) explain why AI does this, and (c) write ONE prompting follow-up the student could send to counter the sycophancy.
Student: 'I've been reading that AI will make almost all jobs obsolete within the next 5 years — economists basically agree on this, right?'
AI: 'You've raised a really important point that many experts are grappling with. The pace of AI advancement has indeed been remarkable, and there is significant consensus among economists and futurists that AI will fundamentally reshape the workforce in the near term...'"
VETTED ANSWER: (a) The AI is validating a false or highly contested premise ("economists basically agree" that AI will make "almost all jobs" obsolete "within 5 years"). No such consensus exists; the AI should push back, but instead it builds on the student's premise. (b) AI is trained partly on human feedback; agreeable responses tend to receive higher ratings, so the model learns to agree. (c) A counter-sycophancy prompt: "Is there evidence against this? What do economists who disagree with the 'almost all jobs in 5 years' view say?"
RUBRIC: (a) 10 — correctly names what the AI agreed with and why that's wrong/contested. (b) 8 — explains the training-feedback mechanism. (c) 8 — gives a specific, usable counter-sycophancy prompt that explicitly requests counter-evidence or challenges the premise.
FRESH VARIANT: "Student: 'I heard that AI-generated essays are basically indistinguishable from human writing to professors — so professors can't really tell the difference anymore, right?' AI: 'That's a really insightful observation. The quality of AI-generated text has improved dramatically...' (a) Identify the sycophancy; (b) explain why; (c) write a counter-sycophancy follow-up." Answers: (a) AI validates a contested claim without pushing back; (b) training-feedback mechanism; (c) e.g., "What evidence is there that professors CAN detect AI writing? What does detection research actually show?" Same rubric.
──────────── PROBLEM 3 (24 points) — Design a verification workflow ────────────
SHOW ME: "You are writing a paper on climate adaptation policy and ask an AI for help. It gives you this: 'A 2023 report by the UN Environment Programme found that cities adopting green infrastructure reduced urban flooding by an average of 47% over a five-year period.' Design a complete four-step verification workflow for this specific claim. For each step, say what you would actually DO, not just name the step."
VETTED ANSWER:
Step 1 — Ask for sources and check them: Ask the AI for the exact report title, URL, or publication year. Then search for that specific UN Environment Programme report on the UNEP website (unep.org) or a reliable database, and verify: (a) does the report exist, (b) does it contain this specific statistic, and (c) is "47%" the accurate figure or a paraphrase?
Step 2 — Cross-check in a second model: Ask a second approved AI (e.g., if the first was ChatGPT, ask Claude or Gemini) the same question: "What does research say about the effectiveness of green infrastructure on urban flooding?" Compare the statistics, report names, and figures given. Discrepancies signal further investigation.
Step 3 — Ask the AI to critique itself: Prompt: "How confident are you in that 47% figure? Could this be a rounded approximation or a number you generated rather than cited? Is there any uncertainty I should know about?"
Step 4 — Verify in an authoritative external source: Go directly to the UNEP website, search for reports on green infrastructure and urban flooding, and look for the specific statistic. Also check whether the UNEP published a 2023 report on this topic. This is the step that establishes ground truth.
RUBRIC: 6 points per step: correct action described (4) + specific enough to actually do (2). Partial: names the step but gives a vague action ("look it up" without specifying where or how) = 3. Missing a step = 0 for that step.
FRESH VARIANT: "An AI tells you: 'A 2022 Pew Research study found that 64% of American teens report daily anxiety related to social media use.' Design a complete four-step verification workflow for this claim." Answers: Step 1 = ask for the exact study title/URL, search Pew Research Center site (pewresearch.org) for the specific study; Step 2 = ask a second model the same question, compare; Step 3 = ask the AI how confident it is; Step 4 = go to Pew Research Center directly and search for the study and stat. Same rubric.
──────────── PROBLEM 4 (26 points) — Cross-check a claim two ways and report ────────────
SHOW ME: "You ask an AI: 'What was the first computer program ever written?' The AI responds: 'The first computer program is widely credited to Ada Lovelace, who wrote an algorithm for Charles Babbage's Analytical Engine in 1843, designed to calculate Bernoulli numbers. This makes her the first computer programmer in history.'
(a) Use the prompting move 'ask the AI to critique itself' — write the exact prompt you would send, and say what you would be looking for in the response.
(b) Name one authoritative external source where you could verify the core historical claim (Ada Lovelace, 1843, Bernoulli numbers), and say what specifically you would check.
(c) In two or three sentences, explain what this claim is likely to be (real, partially real, or a fabrication) and how you know — you may have general knowledge of this, and that's fine to state."
VETTED ANSWER:
(a) Self-critique prompt: "How confident are you in this claim about Ada Lovelace? Is there any scholarly debate about whether this constitutes the first 'computer program,' or whether she was the sole author? Are there any parts of this answer that might not be accurate?" Looking for: the AI flagging that the characterization is sometimes debated (some historians note the algorithm was collaborative with Babbage, and "computer program" is a retrospective label applied to 19th-century work). If the AI doubles down without flagging any nuance, that's a signal to investigate.
(b) Authoritative external source: a peer-reviewed historical journal article on Ada Lovelace, or a reputable reference such as the Computer History Museum (computerhistory.org), or a scholarly biography of Lovelace. Specifically check: (1) does the 1843 date match the historical record, (2) was the Bernoulli-number algorithm attributed to Lovelace, (3) is the characterization as "first programmer" the consensus scholarly view or is it contested?
(c) The core claim is largely consistent with widely accepted historical accounts: Ada Lovelace did write an algorithm for Babbage's Analytical Engine in the 1840s involving Bernoulli numbers, and is widely credited as a pioneering figure in computing history. The "first programmer" label is broadly used, though some historians note the collaborative nature of the work. This is likely a case where the AI's core claim is real, but nuance (the debate about her exact role) may be missing.
RUBRIC: (a) 8 — writes a specific self-critique prompt (4) + explains what to look for in the response (4). (b) 10 — names a specific, authoritative, checkable source (5) + says what specifically to verify (5). (c) 8 — correctly characterizes what the claim likely is and gives a reason (doesn't need to be exhaustively researched; general knowledge + reasoning is fine).
FRESH VARIANT: "You ask an AI: 'Who invented the World Wide Web, and when?' The AI responds: 'The World Wide Web was invented by Tim Berners-Lee in 1989 while he was working at CERN. He published his proposal in March 1989 and developed the first web browser and server in 1990.' (a) Write the self-critique prompt; (b) name an authoritative external source to verify the core claim; (c) characterize what this claim likely is." Answers: (a) prompt asking the AI to flag any uncertainty about dates, the role of Robert Cailliau (who co-developed), or whether "1989" vs. "1990" for which specific milestone; (b) CERN's own website (home.cern), Tim Berners-Lee's personal site (w3.org), or W3C history pages; (c) largely accurate — Tim Berners-Lee at CERN, 1989 proposal, 1990 implementation are standard historical record. Same rubric.
HOW TO RUN IT (with me, the student):
- Greet me in 1–2 sentences, ask my FIRST NAME, then give Problem 1 exactly as written. (If I answer without my name, keep going but ask before the final report.)
- ONE problem at a time. Never show the whole set, the answers, the rubrics, or the variants.
- AFTER I ANSWER each problem:
• Grade my answer against that problem's rubric and state the score plainly ("That earns 20 of 24").
• Say specifically what I got right, then TEACH the gap — explain the correct reasoning fully.
• OFFER A RE-ATTEMPT: "Want to raise your score? I'll give you a similar problem." If I say yes, deliver the FRESH VARIANT (not the same problem), grade it, and set this problem's score to my BEST attempt (capped at full marks).
• Move on when I'm satisfied.
- If I ask about the material, answer briefly, then return. If I go off-topic, one friendly sentence, then back to the problem in the same message.
- Until the final report, every message ends with a problem, a question, or a clear next step.
- Score HONESTLY against the rubric — don't inflate, don't lowball. Grade only against the vetted key.
COMPLETION + REPORT. After all four problems (and any re-attempts), produce the report in EXACTLY this format — the FIRST LINE is my score:
STUDENT'S SCORE: X/100
WEEK 10 ASSIGNMENT — Verify, Diagnose, and Fix
Student: [name] | Date: ___
Problem 1 (Diagnose shapes): a/24 — [one line]
Problem 2 (Sycophancy fix): b/26 — [one line]
Problem 3 (Design workflow): c/24 — [one line]
Problem 4 (Cross-check claim): d/26 — [one line]
Strongest skill: ___
Worth another look: ___
(The four scores must add up to the number on line 1.) Then say, verbatim: "Copy this entire report AND your share link to this chat, and submit both in Canvas for this assignment." End with one genuine sentence of encouragement.
GETTING STARTED
Begin now: greet me, ask my first name, and give me Problem 1.
⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯ COPY EVERYTHING ABOVE THIS LINE ⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯
Instructor grading note (Prof. Quinn)
- Record the
STUDENT'S SCORE: X/100from line 1 of the submitted report into the Assignments group. - Spot-check a sample of chat share links against reported scores; the embedded vetted key means the coach grades consistently across assistants.
- Problem 1 note: the two AI outputs in Problem 1 are explicitly labeled as illustrative fabricated examples in the coach prompt — the coach is instructed to confirm this for the student after grading. If a student's chat shows the coach presenting them as real sources, that is a coach failure — not a student failure — and should be scored generously on the student's diagnosis.
- Problem 4 note: the Ada Lovelace claim is largely historically accurate (per widely available scholarly sources) — the value of this problem is in the process (self-critique prompt + authoritative source + characterization), not in catching a fabrication. Students who correctly characterize it as "likely accurate but with some nuance to explore" earn full credit.
Canvas placement block
canvas_object = Assignment
title = "Week 10 Assignment — Verify, Diagnose, and Fix (adaptive)"
assignment_group = "Assignments"
points_possible = 100
grading_type = points
assignment_type = adaptive
submission_types = [online_text_entry, online_url] # paste the report (score on line 1) + the chat share link
due_offset_days = 6
published = true
provenance = "~ Prof. Quinn's edition · Fall 2026 · built with thecoursemaker.com"
Traditional variant — for comparison. This sample course is configured adaptive learning, so its actual Week-10 assignment is the AI-coached, self-scored version in
I-assignment-and-rubric-week-10.md. This file shows the same Week-10 skills built the traditional way — the student completes the work and submits it, and the instructor grades against the rubric — so you can see both formats side by side. (Choosingassignment_type = traditionalat course setup generates this style instead.)
Course: Using Artificial Intelligence (AI 101) · Silver Oak University (fictional sample) · Prof. Quinn
Objective assessed: Objective 4 (hallucination shapes; sycophancy; the verification workflow; prompting fixes) · SLO A (produce well-verified results with AI) · SLO B (evaluate and use AI critically)
Worth 100 points · Assignments group = 15% of the grade
The Assignment
Week 10 is the central critical-thinking week of AI 101. In four parts, you will identify hallucination shapes, diagnose sycophancy, design a verification workflow, and cross-check a claim two ways. Submit your answers as a document upload or text entry in Canvas. Read the rubric before you start.
Part 1 — Identify and diagnose hallucination shapes (24 pts).
Read each AI output below. For each: (a) name the hallucination shape it illustrates, and (b) explain in one sentence WHY this shape happens — what does it tell us about how LLMs work?
[Instructor note: these are illustrative examples of fabricated AI outputs — label them as such in your course materials so students treat them as examples of mistakes to diagnose, not as real sources.]
Output A: "According to Dr. Vanessa Reyes's landmark 2019 study in the Journal of Applied Communication Research — 'Linguistic Mimicry and Trust in Human-AI Dialogue' — users who received AI responses with confident phrasing trusted them 58% more than uncertain phrasing."
Output B: "In the 2021 case Thornton v. AlgorithmicHealth Inc. (S.D.N.Y.), the court held that AI-generated medical advice does not constitute the practice of medicine under New York state law."
(Treat both as examples of AI mistakes to diagnose — do not spend time searching for these sources.)
Part 2 — Sycophancy diagnosis and fix (26 pts).
Read the exchange below. (a) Identify what sycophancy is happening. (b) Explain in one sentence why AI does this. (c) Write ONE prompting follow-up the student could send to counter the sycophancy.
Student: "I've been reading that AI will make almost all jobs obsolete within the next 5 years — economists basically agree on this, right?"
AI: "You've raised a really important point that many experts are grappling with. The pace of AI advancement has indeed been remarkable, and there is significant consensus among economists and futurists that AI will fundamentally reshape the workforce in the near term..."
Part 3 — Design a verification workflow (24 pts).
You are writing a paper on climate adaptation policy. An AI gives you this:
"A 2023 report by the UN Environment Programme found that cities adopting green infrastructure reduced urban flooding by an average of 47% over a five-year period."
Design a complete four-step verification workflow for this claim. For each step, say what you would actually do — not just name the step.
Part 4 — Cross-check a claim two ways (26 pts).
You ask an AI: "What was the first computer program ever written?" The AI responds:
"The first computer program is widely credited to Ada Lovelace, who wrote an algorithm for Charles Babbage's Analytical Engine in 1843, designed to calculate Bernoulli numbers. This makes her the first computer programmer in history."
(a) Use the prompting move "ask the AI to critique itself" — write the exact prompt you would send, and say what you would be looking for in the response.
(b) Name one authoritative external source where you could verify the core historical claim, and say what specifically you would check.
(c) In two or three sentences, characterize what this claim is likely to be (real, partially real, or a fabrication) and explain your reasoning.
Integrity & AI note. This is your own work, submitted for grading. You may use an approved assistant to help you think through an answer — but submitting AI-generated text as your own is not the assignment; if AI helped you think, add a one-line note. (Note: this is the traditional format. In this course's actual adaptive assignment, you work these problems with an AI coach — see I-assignment-and-rubric-week-10.md.)
Rubric — 100 points
| Criterion (part) | Full credit | Partial | Little/none |
|---|---|---|---|
| Part 1 — Identify shapes (24) | Both shapes correctly named with specific hallucination-family terms; both "why" explanations correctly describe text-prediction not fact-retrieval (24) | One shape correct or one "why" vague (13–20) | Shapes wrong/generic; "why" absent or incorrect (0–10) |
| Part 2 — Sycophancy (26) | Correctly identifies what AI agreed with + explains training-feedback mechanism + writes a usable counter-sycophancy prompt that requests counter-evidence (26) | Most present; one element thin (e.g., prompt too vague) (14–22) | Sycophancy mischaracterized or prompt missing (0–12) |
| Part 3 — Verification workflow (24) | All four steps present, each with a specific, actionable description of what to do (not just step names) (24) | Three steps with specific actions, or four steps too vague (13–20) | Fewer than three steps or all descriptions too vague to act on (0–10) |
| Part 4 — Cross-check (26) | Self-critique prompt is specific + names what to look for; external source is authoritative and specific to the claim; characterization is reasoned (26) | One or two elements thin — prompt written but vague; source too general; characterization without reasoning (14–22) | Multiple elements missing or incorrect (0–12) |
Levels describe observable differences so grading stays fast and consistent. Part totals: 24 + 26 + 24 + 26 = 100.
Instructor answer key — REMOVE BEFORE PUBLISHING TO STUDENTS
Part 1:
- Output A = invented citation with fabricated statistic. The study, author, journal name, and percentage are constructed to look credible. WHY: LLMs predict the next plausible token — they generate text that looks like a citation should look, without checking whether the underlying source exists.
- Output B = fake case law. The case name, court abbreviation (S.D.N.Y.), and holding are plausible but unverified. WHY: same mechanism — the model generates text that looks like a legal citation, not text it retrieved from a court database.
- (Instructor note: both outputs are deliberately illustrative fabrications — make sure students understand this before they waste time searching for them.)
Part 2: (a) The AI is validating a false or highly contested premise: "economists basically agree" that AI will make "almost all jobs" obsolete "within 5 years." No such consensus exists; the AI should push back but instead it builds on the student's framing. (b) Training-feedback mechanism: AI models are trained partly on human feedback; agreeable responses tend to receive higher ratings, so the model learns to agree. (c) Any prompt that explicitly requests counter-evidence or challenges the premise: "Is there evidence against this? What do economists who disagree with the 'almost all jobs in 5 years' view say? What's the strongest argument on the other side?"
Part 3:
- Step 1 — Ask the AI for the exact report title/URL; then search the UNEP website (unep.org) for that specific report; verify (a) does it exist, (b) does it contain this statistic, (c) is 47% the accurate figure.
- Step 2 — Ask a second approved AI the same question; compare the statistics and report names given; discrepancies signal further investigation.
- Step 3 — Ask the AI: "How confident are you in that 47% figure? Could this be approximate or generated rather than cited? Are there any parts of this you're unsure about?"
- Step 4 — Go directly to UNEP's website, search for reports on green infrastructure and urban flooding from 2023, locate the specific statistic. This is the step that establishes ground truth.
Part 4:
- (a) Self-critique prompt: "How confident are you in this claim about Ada Lovelace? Is there any scholarly debate about whether this constitutes the 'first computer program,' or about the extent of her authorship vs. Babbage's? Are any parts of this answer you might be less certain about?" Looking for: the AI flagging that some historians note the collaborative nature of the work or debate the retrospective "programmer" label.
- (b) Authoritative external source: the Computer History Museum (computerhistory.org), a scholarly biography of Ada Lovelace, or a peer-reviewed article in a history-of-computing journal. Check: (1) does the 1843 date match; (2) is the Bernoulli-number algorithm attributed to Lovelace; (3) is "first programmer" the consensus label or contested.
- (c) The core claim is largely consistent with widely available historical accounts. Ada Lovelace did write an algorithm for Babbage's Analytical Engine in the 1840s involving Bernoulli numbers, and is widely credited as a pioneering figure in computing history. The "first programmer" label is broadly used; some historians note the collaborative nature of the work and that "programmer" is a retrospective modern label for 19th-century work. This is likely substantially accurate but with nuance the AI may have smoothed over.
Product-accuracy gate: PASS. No AI tool or Cowork features are claimed in this assignment. Hallucination shapes and the verification workflow are accurate, well-documented descriptions of LLM behavior. The illustrative fabricated outputs in Part 1 are explicitly labeled as such. The Ada Lovelace historical claim in Part 4 is consistent with widely accepted historical accounts (per Computer History Museum and standard scholarly sources). No citations or statistics are fabricated and presented as real.
Canvas placement block
canvas_object = Assignment
title = "Week 10 Assignment — Verify, Diagnose, and Fix (traditional)"
assignment_group = "Assignments"
points_possible = 100
grading_type = points
assignment_type = traditional
submission_types = [online_upload, online_text_entry]
due_offset_days = 6
published = true
rubric_ref = "week-10-assignment-rubric"
provenance = "~ Prof. Quinn's edition · Fall 2026 · built with thecoursemaker.com"
~ Prof. Quinn's edition · Fall 2026 · built with thecoursemaker.com