Part VII: AI Applications
Chapter 28: LLM Applications

Education, Legal & Creative Industries

"I never give the answer. I give the question that makes you discover the answer yourself."

Deploy, Socratically Stubborn AI Agent
Big Picture

LLMs are reshaping how people learn, practice law, create art, and interact with services. In education, AI tutors provide personalized instruction at scale. In law, LLMs analyze contracts and conduct legal research in minutes rather than hours. In creative industries, they serve as co-authors, brainstorming partners, and content generation tools. In customer support, they handle routine queries while escalating complex issues to humans. Each domain illustrates a different facet of human-AI collaboration, from full automation to creative partnership. The conversational AI patterns from Chapter 21 and the agent architectures from Chapter 22 provide the technical foundations for these applications.

Prerequisites

This section builds on the application patterns from earlier sections in this chapter, particularly Section 28.1. Understanding agent architectures from Section 22.1 and API optimization from Section 10.3 provides important context for production deployment patterns.

1. Education and AI Tutoring

AI tutoring systems represent one of the most promising applications of LLMs.

Fun Fact

Benjamin Bloom's 1984 "2 Sigma Problem" showed that one-on-one tutoring raised average students to the 98th percentile. The catch was that hiring a personal tutor for every student was economically impossible. LLMs are the first technology that might actually close that gap at scale.

Khan Academy's Khanmigo and Duolingo Max demonstrate how LLMs can deliver personalized instruction that adapts to each student's pace, knowledge level, and learning style. The key insight from educational research is that the Socratic method (guiding students to discover answers through questions rather than providing direct answers) is more effective for learning, and LLMs can be prompted to tutor in this style. Code Fragment 28.6.1 below puts this into practice.


# Socratic tutoring: the system prompt forbids direct answers and forces
# guiding questions adapted to the student's level
from openai import OpenAI

client = OpenAI()

def socratic_tutor(subject: str, student_question: str, student_level: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": f"""You are a Socratic tutor for {subject}.
Student level: {student_level}
NEVER give direct answers. Instead:
1. Acknowledge what the student understands correctly
2. Ask a guiding question that leads toward the answer
3. If they are stuck, break the problem into smaller steps
4. Celebrate progress and correct misconceptions gently
5. Adapt language complexity to the student's level"""},
            {"role": "user", "content": student_question},
        ],
    )
    return response.choices[0].message.content

reply = socratic_tutor(
    subject="calculus",
    student_question="I don't understand why the derivative of x^2 is 2x",
    student_level="high school",
)
print(reply)

That's a great question! Let's think about it step by step. You already know
that the derivative of x^2 is 2x, which is correct. Now, let's explore *why*
that's true. Do you remember the definition of a derivative? It involves a
limit. Can you write out what happens when we compute:

    lim(h->0) [(x+h)^2 - x^2] / h

Try expanding (x+h)^2 first. What do you get when you multiply that out?

Code Fragment 28.6.1: A Socratic tutor that asks guiding questions instead of giving answers
Note

Duolingo Max uses GPT-4 for two features: "Explain My Answer" (why a response was right or wrong, with grammar explanations) and "Roleplay" (conversational practice with AI characters in realistic scenarios). The system maintains the user's proficiency level, tracks common mistakes, and adapts difficulty. This demonstrates how LLMs enable a capability (open-ended conversation practice) that was previously impossible in a self-study language app.

Tip

If you are building an AI tutoring system, resist the urge to have the LLM give answers directly. The most effective tutoring pattern is Socratic: the LLM asks leading questions, provides hints, and only explains the solution after the student has attempted the problem. Implement this as a system prompt constraint ("Never provide the answer directly. Instead, ask a clarifying question or give a hint.") and monitor for compliance by sampling conversations weekly.
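The weekly compliance sampling suggested above can start as a simple heuristic filter before graduating to an LLM-as-judge. The sketch below is illustrative: the marker phrases, function names, and "no question mark" heuristic are assumptions, not part of any specific product.

```python
# Hypothetical compliance check for a Socratic tutoring system: flags tutor
# replies that appear to give direct answers instead of guiding questions.
import random

DIRECT_ANSWER_MARKERS = ("the answer is", "the solution is", "you should write")

def looks_like_direct_answer(reply: str) -> bool:
    """Flag replies that state answers outright or never ask a question."""
    lowered = reply.lower()
    if any(marker in lowered for marker in DIRECT_ANSWER_MARKERS):
        return True
    return "?" not in reply  # Socratic replies should contain a question

def sample_for_review(replies: list[str], k: int = 5, seed: int = 0) -> list[str]:
    """Weekly audit: sample k tutor replies and keep those that look non-compliant."""
    rng = random.Random(seed)
    sample = rng.sample(replies, min(k, len(replies)))
    return [reply for reply in sample if looks_like_direct_answer(reply)]
```

Flagged replies go to a human reviewer; the heuristic only decides what is worth a look, never what is acceptable.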

2. Legal Applications

Legal work is inherently text-intensive: reading contracts, researching case law, drafting documents, and conducting due diligence. LLMs accelerate all of these tasks while raising important questions about accuracy, liability, and the unauthorized practice of law. Figure 28.6.1 shows the legal AI workflow with its verification steps. Code Fragment 28.6.2 below puts this into practice.

Contract Analysis

This snippet extracts key clauses, obligations, and risk factors from a legal contract using an LLM.


# Extract key clauses, obligations, and risk factors as structured JSON
from openai import OpenAI
import json

client = OpenAI()

def analyze_contract(contract_text: str) -> dict:
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": """Analyze this contract and extract:
- parties (names and roles)
- key_dates (effective, termination, renewal)
- financial_terms (amounts, payment schedule, penalties)
- obligations (each party's key obligations)
- risk_clauses (limitation of liability, indemnification, IP)
- unusual_terms (anything atypical that needs attention)
Return structured JSON. Flag items requiring legal review."""},
            {"role": "user", "content": contract_text},
        ],
        response_format={"type": "json_object"},
    )
    return json.loads(response.choices[0].message.content)

Code Fragment 28.6.2: Structured contract analysis with JSON extraction
Figure 28.6.1: Legal AI workflow. Legal documents (contracts, filings) feed into contract review (clause extraction), legal research (case law, statutes), document drafting (templates plus custom terms), and e-discovery (relevance scoring), with lawyer review (verify and approve) as the final verification step.
Key Insight

The recurring pattern across education, law, and creative work is that LLMs are most dangerous when used as oracles and most valuable when used as collaborators. A student who copies an LLM's answer learns nothing. A student who debates with an LLM tutor that asks Socratic questions learns deeply. A lawyer who trusts LLM-generated citations without verification risks sanctions. A lawyer who uses LLMs for first-pass review and then verifies every citation works faster with higher quality. The technology is the same in both cases; the workflow determines whether it helps or harms. This principle connects directly to the human-in-the-loop patterns from Section 24.5: the goal is graduated autonomy, not full replacement.

Warning

Legal AI systems have been caught generating citations to cases that do not exist. In a widely publicized 2023 incident, a lawyer submitted an LLM-generated brief containing fabricated case citations, resulting in sanctions. This illustrates the critical importance of verification in legal applications. LLMs should be used for drafting and research assistance, never as authoritative legal sources. All citations, quotations, and legal reasoning must be verified by a qualified attorney before submission.
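The verification requirement can be partially automated: extract every citation from a draft and check it against a database of verified authorities before a human ever sees it. This is a minimal sketch; the regex covers only a couple of reporter formats, and the verified set stands in for a real Westlaw or Lexis lookup.

```python
# Hypothetical citation screen: extract reporter-style citations from a draft
# and flag any that are absent from a verified citation database.
import re

# Matches e.g. "410 U.S. 113" or "17 F.3d 209" (illustrative, not exhaustive)
CITATION_RE = re.compile(r"\d+\s+(?:U\.S\.|F\.\d[a-z]{1,2})\s+\d+")

def find_unverified_citations(draft: str, verified_db: set[str]) -> list[str]:
    """Return citations in the draft that could not be verified."""
    cites = CITATION_RE.findall(draft)
    return [c for c in cites if c not in verified_db]
```

Any non-empty result blocks submission until an attorney confirms or removes the citation; the screen catches fabrications, not misstatements of holdings, so attorney review remains mandatory.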

3. Creative Writing and Co-Authorship

LLMs serve as creative collaborators in fiction writing, screenwriting, copywriting, and journalism. The most effective creative workflows use LLMs for brainstorming (generating ideas, plot outlines, character descriptions), drafting (producing initial text that the human refines), and editing (suggesting improvements to human-written text). Professional authors increasingly use LLMs not to write for them but to overcome creative blocks and explore narrative possibilities they might not have considered. Code Fragment 28.6.3 below puts this into practice.

# Creative writing assistant with style control
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": """You are a creative writing assistant.
Help brainstorm, outline, and draft fiction. Match the user's
specified tone and style. Offer alternatives when generating
content. Never produce a single 'correct' version; always
provide options for the author to choose from."""},
        {"role": "user", "content": """I'm writing a mystery novel set in a
1920s speakeasy. I need three possible opening scenes that
establish atmosphere and introduce the detective protagonist.
Tone: noir, atmospheric, slightly sardonic."""},
    ],
)
print(response.choices[0].message.content)

Here are three opening options for your 1920s noir mystery:

**Option A: The Cigarette Smoke Confessional**
The gin at Palazzo's tasted like regret and turpentine, which suited Jack Morrow just fine. He'd been nursing the same glass for an hour, watching the saxophone player sweat through a melody that had no business being that sad...

**Option B: The Uninvited Guest**
Nobody knocked at Palazzo's. You either knew the password or you didn't exist. So when the dead man's wife walked in without saying a word to Tony at the door, Jack Morrow knew his quiet evening had just ended...

**Option C: The Overheard Conversation**
...

Code Fragment 28.6.3: Creative writing assistant with style control

4. Customer Support and Gaming

LLM-powered customer support handles routine queries (order status, FAQ, troubleshooting) while escalating complex issues to human agents. The key architecture uses a retrieval-augmented generation (RAG) pipeline (Chapter 20) over the company's knowledge base combined with an LLM that generates contextual responses. In gaming, LLMs power NPC dialogue that responds dynamically to player actions, creating more immersive and unpredictable narrative experiences. Games like Inworld AI and AI Dungeon demonstrate how LLMs can generate interactive stories in real time.
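The routine-versus-escalation split described above can be sketched as a small router. The intent labels, keyword rules, and handler names below are illustrative assumptions; in production the classifier would be an LLM or trained model, and the RAG pipeline would be the Chapter 20 stack.

```python
# Minimal support triage sketch: routine intents go to the RAG pipeline,
# everything else escalates to a human with a handoff summary.
ROUTINE_INTENTS = {"order_status", "faq", "troubleshooting"}

def classify_intent(query: str) -> str:
    """Stand-in for an LLM intent classifier; keyword rules for illustration."""
    q = query.lower()
    if "where is my order" in q or "tracking" in q:
        return "order_status"
    if "reset" in q or "not working" in q:
        return "troubleshooting"
    if "refund" in q and "policy" in q:
        return "faq"
    return "complex"

def route(query: str, history: list[str]) -> dict:
    intent = classify_intent(query)
    if intent in ROUTINE_INTENTS:
        return {"handler": "rag_pipeline", "intent": intent}
    # Escalate with a summary so the human agent inherits the context
    summary = " | ".join(history[-3:] + [query])
    return {"handler": "human_agent", "intent": intent, "summary": summary}
```

The key design choice is that escalation carries a conversation summary, which is what makes the human handoff smooth rather than forcing the customer to repeat themselves.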

Domain Comparison

Domain    | Key Application       | LLM Role                         | Human Role
Education | Personalized tutoring | Socratic questioning, adaptation | Curriculum design, oversight
Legal     | Contract analysis     | Extraction, first-pass review    | Verification, legal judgment
Creative  | Writing assistance    | Brainstorming, drafting          | Direction, refinement, voice
Support   | Query resolution      | Routine handling, KB search      | Complex issues, empathy
Gaming    | Dynamic NPC dialogue  | Responsive conversation          | World design, narrative arcs
Key Insight

Across education, law, creative work, and customer support, the pattern is remarkably consistent: LLMs are most effective when they augment human expertise rather than replacing it. A tutor who uses LLMs to generate adaptive practice problems is more effective than either the tutor or the LLM alone. A lawyer who uses LLMs for first-pass contract review can analyze more documents with greater thoroughness. A writer who brainstorms with an LLM explores more creative possibilities. The compound effect of human judgment plus AI capability exceeds either in isolation.

5. Style Transfer and Text Simplification

Style transfer adapts text from one register, tone, or reading level to another while preserving the core meaning. Before LLMs, style transfer required parallel corpora and specialized sequence-to-sequence models for each style pair. Today, a single prompted LLM can convert formal prose to conversational language, simplify medical jargon for patients, or adapt marketing copy to match a brand's voice. This capability matters because organizations communicate with diverse audiences: the same product description might need a technical version for engineers, an accessible version for consumers, and a simplified version for screen readers and language learners.

5.1 Prompt-Based Style Transfer

The key insight is that LLMs already understand style implicitly from their training data. Rather than training a separate model for each style pair, you describe the desired transformation in the prompt. This makes style transfer accessible to any team with API access.

# Style transfer: convert between registers, tones, and reading levels
from openai import OpenAI

client = OpenAI()

def transfer_style(
    text: str,
    source_style: str,
    target_style: str,
    context: str = "",
) -> str:
    """Rewrite text from one style to another, preserving meaning."""
    system_prompt = f"""You are an expert editor specializing in style adaptation.
Rewrite the given text from {source_style} to {target_style}.

Rules:
1. Preserve ALL factual content and key details
2. Adjust vocabulary, sentence structure, and tone
3. Do NOT add information that is not in the original
4. Match the target audience's reading level
{f'Additional context: {context}' if context else ''}"""

    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": text},
        ],
        temperature=0.4,
    )
    return response.choices[0].message.content

# Example 1: Technical to accessible
technical = """The patient presents with acute myocardial infarction
characterized by ST-segment elevation in leads V1-V4, consistent
with anterior wall involvement. Troponin I levels are elevated
at 15.2 ng/mL. Emergent percutaneous coronary intervention is
indicated."""

accessible = transfer_style(
    text=technical,
    source_style="clinical medical terminology",
    target_style="plain language for a patient and their family",
    context="Target reading level: 8th grade",
)
print("Simplified:\n", accessible)

# Example 2: Formal to conversational brand voice
formal = """We are pleased to announce the availability of our latest
software release, which incorporates numerous enhancements to
system performance and user interface responsiveness."""

casual = transfer_style(
    text=formal,
    source_style="formal corporate announcement",
    target_style="friendly, conversational brand voice (like Slack or Notion)",
)
print("Brand voice:\n", casual)

Simplified:
You are having a heart attack that is affecting the front part of your heart. A blood test shows high levels of a protein that confirms heart muscle damage. You need an emergency procedure to open the blocked artery in your heart.

Brand voice:
Big news! Our latest update just dropped, and it's faster and smoother than ever. We've tuned up the engine under the hood and polished the interface so everything feels snappier. Go check it out!

Code Fragment 28.6.4: Style transfer: convert between registers, tones, and reading levels

5.2 Applications

Application              | Source Style                    | Target Style                      | Why It Matters
Accessibility compliance | Technical documentation         | Plain language (WCAG guidelines)  | Legal requirements, broader reach
Content localization     | US English, casual              | UK English, formal                | Market-appropriate communication
Brand voice adaptation   | Generic product copy            | Brand-specific tone and vocabulary | Consistent brand identity at scale
Patient communication    | Medical records, clinical notes | Patient-friendly summaries        | Health literacy, informed consent
Academic simplification  | Research papers                 | Blog posts or press releases      | Public engagement with science
Warning

Style transfer can inadvertently change meaning, especially when simplifying technical content. A medical simplification that converts "contraindicated" to "not recommended" loses the severity of the original term. Always have domain experts review style-transferred content in high-stakes contexts (medical, legal, financial), and build validation checks that compare key facts between the original and the rewritten version. The evaluation techniques from Chapter 29 are directly applicable here.
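One cheap validation check is to verify that every number in the original survives into the rewrite, since dropped figures (doses, dates, amounts) are among the most dangerous simplification errors. This is a minimal sketch under that assumption; real fact comparison would also cover named entities and negations.

```python
# Validation sketch for style transfer: detect numbers that were lost
# during rewriting. Illustrative only; covers numeric facts, nothing else.
import re

def extract_numbers(text: str) -> set[str]:
    """Collect all integer and decimal literals in the text."""
    return set(re.findall(r"\d+(?:\.\d+)?", text))

def missing_facts(original: str, rewritten: str) -> set[str]:
    """Numbers present in the original but absent from the rewrite."""
    return extract_numbers(original) - extract_numbers(rewritten)
```

A non-empty result should block automatic publication and route the pair to a domain expert, which is exactly the review loop the warning above calls for.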

6. Grammatical Error Correction

Grammatical Error Correction (GEC) is one of the oldest NLP tasks, with a rich history that illustrates how the field has evolved. The progression from rule-based systems to LLMs mirrors the broader trajectory of NLP itself. Understanding this evolution helps practitioners choose the right approach for their specific use case, whether that is a real-time writing assistant, a language learning tool, or a professional editing pipeline.

6.1 The Evolution of GEC

Era         | Approach                  | Example                       | Strengths                              | Limitations
1980s-2000s | Rule-based                | LanguageTool, custom grammars | Predictable, explainable, fast         | Cannot handle novel errors, brittle
2010s       | Statistical MT            | Phrase-based SMT for GEC      | Learns from error corpora              | Needs large parallel corpora
2018-2021   | Neural (seq2seq, tagging) | GECToR, T5-GEC                | Fast inference, high precision         | Limited to trained error types
2022+       | LLM-based                 | GPT-4, Claude, Gemini         | Handles any error type, explains fixes | Higher latency, may over-correct style
Fun Fact

GECToR (Omelianchuk et al., 2020) frames grammar correction as a sequence tagging problem rather than a sequence-to-sequence problem. Instead of regenerating the entire sentence, it predicts edit operations (keep, delete, insert, replace) for each token. This makes it 10x faster than generative approaches and, crucially, it only changes what needs changing. LLMs, by contrast, sometimes "correct" perfectly valid stylistic choices.

6.2 LLM-Based GEC in Practice

This snippet uses an LLM to perform grammar error correction on input text with few-shot prompting.

# LLM-based grammatical error correction with explanations
from openai import OpenAI
import json

client = OpenAI()

def correct_grammar(text: str, explain: bool = True) -> dict:
    """Correct grammatical errors with optional explanations."""
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": """You are a grammar correction assistant.
Fix grammatical errors in the given text. Return JSON with:
- "corrected": the corrected text
- "changes": a list of objects, each with "original", "corrected", and "rule"
  (the grammar rule that applies, e.g., "subject-verb agreement")

Rules:
1. Fix grammar, spelling, and punctuation errors
2. Do NOT change style, tone, or word choice unless it is a clear error
3. Preserve the author's voice and intent
4. If the text is already correct, return it unchanged with an empty changes list"""},
            {"role": "user", "content": text},
        ],
        response_format={"type": "json_object"},
        temperature=0.0,
    )
    return json.loads(response.choices[0].message.content)

# Example: correct a paragraph with multiple error types
text = """Their going to the store tommorrow, but neither the manager
nor the employees is aware of the scheduile change. The report,
along with its appendicies, were submitted late."""

result = correct_grammar(text)
print("Corrected:", result["corrected"])
for change in result["changes"]:
    print(f"  {change['original']} -> {change['corrected']}")
    print(f"    Rule: {change['rule']}")

Corrected: They're going to the store tomorrow, but neither the manager nor the employees are aware of the schedule change. The report, along with its appendices, was submitted late.
  Their -> They're
    Rule: homophone confusion (possessive vs. contraction)
  tommorrow -> tomorrow
    Rule: spelling
  is -> are
    Rule: subject-verb agreement (neither/nor takes plural verb when nearest subject is plural)
  scheduile -> schedule
    Rule: spelling
  appendicies -> appendices
    Rule: spelling (irregular plural)
  were -> was
    Rule: subject-verb agreement ("report" is the subject, not "appendices")

Code Fragment 28.6.5: LLM-based grammatical error correction with explanations

6.3 Choosing the Right GEC Approach

The best GEC system depends on your constraints. For real-time writing assistance in a text editor, GECToR or a similar tagging model provides the low latency (under 50ms) that users expect. For language learning applications, LLM-based GEC is superior because it can explain each correction and provide grammar lessons. For professional editing workflows that process batch documents, a hybrid approach works well: run a fast tagging model first to catch common errors, then optionally invoke an LLM for complex cases involving nuanced word choice or sentence restructuring. The hybrid pattern from Section 28.1 applies directly.
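The hybrid pattern can be sketched as a two-stage pipeline: a fast first pass handles common errors, and only sentences it cannot fully resolve are queued for the slower, costlier LLM pass. The dictionary and length heuristic below are stand-ins for a real tagging model such as GECToR; all names are illustrative.

```python
# Hybrid GEC sketch: cheap first pass plus an LLM queue for hard cases.
COMMON_FIXES = {"tommorrow": "tomorrow", "recieve": "receive", "teh": "the"}

def fast_correct(sentence: str) -> tuple[str, bool]:
    """First pass: dictionary fixes. Returns (text, needs_llm_pass)."""
    words = sentence.split()
    fixed = [COMMON_FIXES.get(w.lower(), w) for w in words]
    # Crude heuristic: long sentences may need restructuring -> send to LLM
    needs_llm = len(words) > 25
    return " ".join(fixed), needs_llm

def hybrid_gec(sentences: list[str]) -> tuple[list[str], list[str]]:
    """Split a batch into fast-path results and an LLM work queue."""
    corrected, llm_queue = [], []
    for s in sentences:
        fixed, needs_llm = fast_correct(s)
        (llm_queue if needs_llm else corrected).append(fixed)
    return corrected, llm_queue
```

In a batch editing pipeline, the queue would be sent to `correct_grammar`-style LLM calls while the fast-path results return immediately, keeping average latency and cost low.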

Real-World Scenario: AI-Powered Contract Review at a Law Firm

Who: M&A practice group at a mid-tier corporate law firm

Situation: Due diligence for acquisitions required reviewing 500 to 2,000 contracts per deal. Junior associates spent 60 to 80 hours per deal on initial contract review, extracting key terms: change of control clauses, assignment restrictions, termination provisions, and indemnification caps.

Problem: Manual review was slow (limiting the number of deals the team could handle) and error-prone (associates missed 8% of material clauses in audit tests).

Decision: The firm deployed a RAG-based contract analysis system using Claude 3.5 Sonnet with a custom extraction schema for each deal type.

How: Contracts were ingested as PDFs, converted to text with OCR (using Document AI techniques from Section 27.3), and chunked by section. The LLM extracted structured data for each clause type, with citations pointing to specific section numbers and page references. Every extraction included a confidence score; items below 0.85 confidence were flagged for attorney review. The system also compared extracted terms against a "market standard" database to highlight unusual or aggressive provisions.

Result: Contract review time dropped from 80 hours to 12 hours per deal (85% reduction). The missed clause rate fell from 8% to under 1%. The firm increased deal throughput by 40% without hiring additional associates.

Lesson: Legal AI works best with structured extraction schemas that mirror how attorneys think about contract analysis, combined with mandatory human review for low-confidence extractions. The citation and audit trail requirements make RAG essential over pure generation.
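The confidence gate in this scenario is simple to implement and worth making explicit: extractions at or above the threshold flow into the automated report, everything else is routed to attorney review. The extraction dicts and function name below are illustrative, not the firm's actual schema.

```python
# Sketch of the 0.85 confidence gate from the scenario (illustrative names).
THRESHOLD = 0.85

def gate_extractions(extractions: list[dict]) -> tuple[list[dict], list[dict]]:
    """Split extractions into (auto-approved, needs attorney review)."""
    auto, review = [], []
    for e in extractions:
        (auto if e["confidence"] >= THRESHOLD else review).append(e)
    return auto, review
```

The threshold itself should be tuned against audit data: lowering it increases attorney workload, raising it increases the risk that a missed clause slips through unreviewed.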

Production Tip

Recent tools for legal and education AI (2024/2025). For legal applications: Harvey AI provides a purpose-built legal LLM with enterprise security, used by Allen & Overy and other major firms. CoCounsel (by Thomson Reuters/Casetext) integrates with Westlaw for case law research with cited sources. For contract analysis, Luminance and Kira Systems offer pre-trained extractors for common clause types. For education: beyond Khanmigo, Synthesis Tutor uses game-based learning with LLM-powered adaptive difficulty. Quizlet Q-Chat provides AI tutoring across subjects using a Socratic dialogue approach. For custom education tools, the OpenAI Assistants API with file search and code interpreter tools provides a solid foundation for domain-specific tutors that can reference textbooks, solve problems step by step, and maintain student progress across sessions.

Self-Check
Q1: Why is Socratic tutoring more effective than direct answer provision for AI education tools?
Show Answer
Socratic tutoring guides students to discover answers through questioning rather than giving answers directly. This promotes deeper understanding, active engagement, and long-term retention. Students who work through problems with guided hints develop problem-solving skills that transfer to new situations, while students who receive direct answers may memorize solutions without understanding underlying concepts.
Q2: What are the main risks of using LLMs for legal research?
Show Answer
The primary risk is hallucination: LLMs can generate plausible-sounding but fictitious case citations, misstate legal holdings, or invent statutory provisions. Additional risks include: outdated legal knowledge (laws change frequently), jurisdiction confusion (applying law from wrong jurisdiction), and oversimplification of nuanced legal reasoning. All LLM-generated legal content must be verified by a qualified attorney.
Q3: How does Duolingo Max use LLMs to improve language learning?
Show Answer
Duolingo Max uses GPT-4 for "Explain My Answer" (explaining why responses are correct or incorrect with grammar rules) and "Roleplay" (conversational practice with AI characters in realistic scenarios like ordering at a cafe). These features enable open-ended conversation practice and detailed grammatical explanations that were previously impossible in a self-study app.
Q4: What makes LLM-powered customer support different from traditional chatbots?
Show Answer
Traditional chatbots follow predefined decision trees and can only handle anticipated queries. LLM-powered support understands natural language nuance, handles unexpected phrasings, retrieves relevant information from knowledge bases, and generates contextual responses. It can also know when it cannot help and escalate to human agents with a summary of the conversation, creating a smoother handoff.
Q5: How do LLMs enhance NPC interactions in gaming?
Show Answer
LLMs enable NPCs to engage in dynamic, contextual conversations that respond to specific player actions and choices rather than selecting from pre-written dialogue trees. This creates more immersive experiences where NPCs remember previous interactions, react to the game state, and can discuss topics the game designers did not explicitly script, making each playthrough feel more unique and responsive.
Real-World Scenario: Socratic AI Tutor for University Computer Science

Who: Computer science department at a large state university with 2,000 students in introductory programming courses

Situation: Office hours were overwhelmed, with students waiting 90+ minutes for help. Teaching assistants could not scale to meet demand, especially during assignment deadlines.

Problem: Students needed personalized debugging help and conceptual explanations at all hours, not just during scheduled office hours. A simple Q&A bot gave away answers, which undermined learning.

Dilemma: Providing direct answers helped students complete assignments faster but reduced learning outcomes. Refusing to help frustrated students and drove them to external AI tools with no pedagogical guardrails.

Decision: The department deployed a Socratic AI tutor that guided students through problem-solving with leading questions, hints, and conceptual explanations without ever providing complete solutions.

How: The system was prompted to analyze student code, identify misconceptions, and ask targeted questions that led students to discover errors themselves. It tracked conversation history to avoid repeating hints and escalated to a human TA when students remained stuck after three hint cycles. All interactions were logged for instructors to review.

Result: Office hour wait times dropped to under 10 minutes. Students who used the Socratic tutor scored 12% higher on exams than those who did not, indicating genuine learning rather than dependency. TA satisfaction improved because they handled fewer repetitive questions and focused on complex conceptual discussions.

Lesson: AI tutoring systems that use Socratic questioning instead of direct answers produce measurably better learning outcomes while scaling to serve thousands of students simultaneously.
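The escalation rule from this scenario (hand off to a human TA after three hint cycles without progress) reduces to a small piece of state. This is a minimal sketch; the class name and action strings are assumptions for illustration.

```python
# Hint-cycle tracker sketch: escalate to a TA after repeated stalled hints.
class HintTracker:
    def __init__(self, max_cycles: int = 3):
        self.max_cycles = max_cycles
        self.cycles = 0

    def record_hint(self, student_progressed: bool) -> str:
        """Record one hint cycle and return the next action."""
        if student_progressed:
            self.cycles = 0  # progress resets the counter
            return "continue"
        self.cycles += 1
        return "escalate_to_ta" if self.cycles >= self.max_cycles else "continue"
```

Resetting the counter on progress matters: a student who is slowly working through a problem should not be escalated just because the total conversation is long.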

7. Data-to-Text Generation

Data-to-text generation converts structured data (tables, databases, statistical records, JSON objects) into fluent natural language narratives. This task has a long history in NLP, with early systems producing weather forecasts and sports recaps from structured databases using template-based approaches. LLMs have transformed this space by enabling flexible, context-aware generation that adapts tone, detail level, and narrative structure to the target audience without handcrafted templates.

Classic data-to-text domains include sports journalism (generating game recaps from box scores), financial reporting (converting quarterly earnings tables into analyst-readable summaries), weather narratives (translating meteorological data into forecasts), and business intelligence (turning dashboard metrics into executive summaries). In each case, the core challenge is the same: faithfully representing the data while producing text that reads naturally and highlights the most important patterns.

7.1 LLM Approach: Serialize and Generate

The standard LLM approach to data-to-text generation follows a two-step pattern. First, serialize the structured data into a text format the model can process (Markdown tables, CSV rows, or JSON objects). Second, prompt the model with instructions specifying the desired output format, audience, and which aspects of the data to emphasize. The serialization format matters: Markdown tables tend to produce the most reliable results because LLMs encounter them frequently in training data. Code Fragment 28.6.6 below demonstrates this pattern.

# Data-to-text generation: convert a statistics table to narrative text
# Serializes structured data as Markdown, then generates a narrative
from openai import OpenAI

client = OpenAI()

def stats_to_narrative(
    data: list[dict],
    context: str,
    audience: str = "general",
    tone: str = "professional",
) -> str:
    """Convert structured data into a natural language narrative."""
    # Step 1: serialize data as a Markdown table
    if not data:
        return "No data available."
    headers = list(data[0].keys())
    md_table = "| " + " | ".join(headers) + " |\n"
    md_table += "| " + " | ".join(["---"] * len(headers)) + " |\n"
    for row in data:
        md_table += "| " + " | ".join(str(row[h]) for h in headers) + " |\n"

    # Step 2: generate narrative from serialized data
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": f"""You are a data analyst who
writes clear {tone} narratives for a {audience} audience. Convert the
provided data table into a coherent paragraph that:
1. States the key finding or trend first
2. Supports it with specific numbers from the data
3. Notes any notable outliers or changes
4. Keeps the narrative concise (3 to 5 sentences)
NEVER fabricate numbers. Only reference values present in the data."""},
            {"role": "user", "content": f"Context: {context}\n\n{md_table}"},
        ],
        temperature=0.3,
    )
    return response.choices[0].message.content

# Example: quarterly sales data
sales_data = [
    {"Quarter": "Q1 2024", "Revenue ($M)": 142, "Growth (%)": 8.2,
     "Top Region": "North America"},
    {"Quarter": "Q2 2024", "Revenue ($M)": 158, "Growth (%)": 11.3,
     "Top Region": "Europe"},
    {"Quarter": "Q3 2024", "Revenue ($M)": 151, "Growth (%)": 6.4,
     "Top Region": "North America"},
    {"Quarter": "Q4 2024", "Revenue ($M)": 189, "Growth (%)": 25.2,
     "Top Region": "Asia Pacific"},
]

narrative = stats_to_narrative(
    sales_data,
    context="Annual revenue performance for CloudTech Inc.",
    audience="executive leadership",
    tone="concise and data-driven",
)
print(narrative)

CloudTech Inc. delivered strong revenue growth in 2024, reaching $189M in Q4 with a standout 25.2% growth rate that far exceeded earlier quarters. Full-year revenue totaled $640M across all four quarters, with Q2 posting the second-highest growth at 11.3% driven by European expansion. Notably, Asia Pacific emerged as the top-performing region in Q4, signaling a geographic shift from the North America dominance seen in Q1 and Q3. The Q3 deceleration to 6.4% growth represents the only soft spot in an otherwise accelerating trajectory.

Code Fragment 28.6.6: Data-to-text generation: convert a statistics table to narrative text
Key Insight

The biggest risk in data-to-text generation is hallucinated statistics. LLMs may invent plausible-sounding numbers that do not appear in the source data, or perform arithmetic incorrectly when computing aggregates. Production systems should include a verification step that cross-checks every number in the generated text against the original data. For critical applications (financial reports, medical summaries), consider a two-pass approach: generate the narrative first, then use a separate validation prompt that compares every factual claim against the source table. The evaluation techniques from Chapter 29 are directly applicable here.
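The cross-check described above can be made mechanical for the numeric case: collect every number in the source rows (plus column sums, the one aggregate this sketch allows), then flag any number the narrative cites that is not in that set. The function name and the naive regex are illustrative assumptions; real systems need smarter number spotting (units, percentages, labels like "Q4").

```python
# Verification sketch for data-to-text: flag numbers the narrative cites
# that are absent from the source data. Assumes rows share the same keys.
import re

def verify_numbers(narrative: str, rows: list[dict]) -> list[str]:
    """Return cited numbers not found in the source rows or column sums."""
    source: set[str] = set()
    for row in rows:
        for value in row.values():
            if isinstance(value, (int, float)):
                source.add(str(value))
    # Allow column sums as a known-safe aggregate
    numeric_keys = {k for row in rows for k, v in row.items()
                    if isinstance(v, (int, float))}
    for key in numeric_keys:
        source.add(str(sum(row[key] for row in rows)))
    cited = re.findall(r"\d+(?:\.\d+)?", narrative)
    return [n for n in cited if n not in source]
```

Any flagged number triggers either regeneration or human review, which is the two-pass pattern the insight above recommends for financial and medical summaries.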

Key Takeaways
Research Frontier

Adaptive learning systems are moving beyond simple Socratic prompting toward models that maintain detailed student cognitive models, predicting which misconceptions a student holds and selecting optimal teaching strategies in real time. In legal AI, researchers are building systems that can reason about legal precedent chains and identify relevant case law with verifiable citations, addressing the hallucination problem that has plagued legal LLM applications.

Creative AI is exploring models that can maintain consistent style, voice, and narrative arc across novel-length works, moving beyond short-form generation into sustained creative collaboration.

Exercises

Exercise 28.6.1: AI Tutoring System Design Conceptual

Design an AI tutoring system that adapts to the student's knowledge level. How should the system assess understanding, select difficulty, and provide feedback?

Answer Sketch

Assessment: start with diagnostic questions to gauge baseline knowledge. Track correct/incorrect responses and time-to-answer. Difficulty selection: use a mastery model (move to harder content after N consecutive correct answers). Feedback: for incorrect answers, do not just give the correct answer; ask guiding questions that lead the student to discover the error. For correct answers, reinforce understanding with a brief explanation of why the answer is right.
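The mastery model in the sketch above can be made concrete with a small difficulty tracker. This is a toy illustration under assumed thresholds (promote after three consecutive correct answers, demote after two consecutive misses); the class name and level labels are ours, not part of any particular tutoring product.

```python
class MasteryTracker:
    """Toy mastery model: advance difficulty after N consecutive
    correct answers, drop back after M consecutive misses."""

    def __init__(self, levels=("easy", "medium", "hard"),
                 promote_after=3, demote_after=2):
        self.levels = levels
        self.level = 0  # index into levels
        self.promote_after = promote_after
        self.demote_after = demote_after
        self.correct_streak = 0
        self.wrong_streak = 0

    def record(self, correct: bool) -> str:
        """Record one answer and return the difficulty for the next question."""
        if correct:
            self.correct_streak += 1
            self.wrong_streak = 0
            if (self.correct_streak >= self.promote_after
                    and self.level < len(self.levels) - 1):
                self.level += 1
                self.correct_streak = 0
        else:
            self.wrong_streak += 1
            self.correct_streak = 0
            if self.wrong_streak >= self.demote_after and self.level > 0:
                self.level -= 1
                self.wrong_streak = 0
        return self.levels[self.level]
```

In a real system the returned level would parameterize the question-selection prompt, and time-to-answer would feed into the streak logic as well.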

Exercise 28.6.2: Legal Document Analysis Coding

Write a prompt that takes a contract clause and identifies: the parties involved, the obligations of each party, any conditions or exceptions, and potential risks. Return structured JSON.

Answer Sketch

The prompt should instruct the model to parse the clause systematically: identify named parties, extract each party's obligations (must/shall language), find conditions (if/when/unless clauses), identify exceptions and limitations, and flag ambiguous language that could be disputed. Output: {parties: [], obligations: [{party, obligation, condition}], risks: [{description, severity}]}. Include a disclaimer that this is not legal advice.
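One way to realize this sketch is a prompt template that embeds the target JSON schema literally. This is an illustrative assembly, not a vetted legal prompt; the template text and the helper name `build_clause_prompt` are our assumptions. Note the use of `str.replace` rather than `str.format`, since the schema itself contains curly braces.

```python
CLAUSE_ANALYSIS_PROMPT = """You are a contract-analysis assistant, not a lawyer.
Analyze the clause below and return ONLY valid JSON with this shape:
{
  "parties": ["..."],
  "obligations": [{"party": "...", "obligation": "...", "condition": "..."}],
  "risks": [{"description": "...", "severity": "low|medium|high"}],
  "disclaimer": "This analysis is not legal advice."
}
Identify obligations from must/shall language, conditions from
if/when/unless clauses, and flag ambiguous wording as a risk.

Clause:
{clause}"""

def build_clause_prompt(clause: str) -> str:
    # str.format would choke on the literal JSON braces in the template,
    # so substitute the single {clause} placeholder directly
    return CLAUSE_ANALYSIS_PROMPT.replace("{clause}", clause)
```

The model's reply should then be parsed with `json.loads` inside a try/except, retrying or falling back to human review on invalid output.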

Exercise 28.6.3: Creative Co-Authorship Conceptual

Discuss the role of LLMs as creative co-authors. How should the collaboration between human and AI be structured to produce the best creative output? What are the copyright implications?

Answer Sketch

Best structure: the human provides creative direction, themes, and key decisions; the AI generates drafts, alternatives, and expansions; the human curates, edits, and makes final choices. This keeps human creative intent at the center. Copyright implications vary by jurisdiction: in the US, AI-generated content without significant human authorship may not be copyrightable. The key is demonstrating substantial human creative contribution to the final work.

Exercise 28.6.4: Style Transfer Implementation Coding

Write a function that uses an LLM to transfer the writing style of one text to another. Given a source text and a target style description (e.g., 'formal academic'), produce a rewritten version.

Answer Sketch

Provide the LLM with: the source text, the target style description, and examples of the target style (few-shot). The prompt should instruct the model to preserve the original meaning while changing vocabulary, sentence structure, and tone to match the target style. Test with: informal email to formal business letter, technical documentation to layperson explanation.

Exercise 28.6.5: Customer Support Automation Conceptual

Design a customer support system that uses an LLM to handle Tier 1 inquiries and escalate complex issues to human agents. What criteria should trigger escalation?

Answer Sketch

The LLM handles: FAQ responses, order status lookups, password resets, and simple troubleshooting. Escalation triggers: customer expresses frustration or anger (sentiment detection), the issue involves financial disputes, the LLM's confidence is below a threshold, the same issue has been raised multiple times without resolution, or the customer explicitly requests a human agent. Always inform the customer when they are talking to an AI.
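The escalation triggers listed above can be encoded as an explicit rule check that runs before the LLM answers. This is a deliberately naive sketch: keyword matching stands in for real sentiment and intent classifiers, and the thresholds, word lists, and function name `should_escalate` are our assumptions.

```python
def should_escalate(message: str, confidence: float,
                    prior_attempts: int) -> tuple[bool, str]:
    """Rule-based escalation check. `confidence` is the assistant's
    self-reported answer confidence in [0, 1]; `prior_attempts` counts
    earlier unresolved tries on the same issue."""
    lowered = message.lower()
    # Keyword lists are placeholders for sentiment/intent classifiers
    frustration = {"ridiculous", "unacceptable", "angry", "furious"}
    financial = {"refund", "chargeback", "dispute", "billing error"}
    if any(w in lowered for w in ("human", "agent", "representative")):
        return True, "customer requested a human"
    if any(w in lowered for w in frustration):
        return True, "frustration detected"
    if any(w in lowered for w in financial):
        return True, "financial dispute"
    if confidence < 0.6:
        return True, "low model confidence"
    if prior_attempts >= 2:
        return True, "repeated unresolved issue"
    return False, ""
```

Returning the reason alongside the decision matters in production: it routes the ticket to the right human queue and makes the escalation policy auditable.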

What Comes Next

In the next section, Section 28.7: Robotics, Embodied AI & Scientific Discovery, we turn to the frontier of LLM applications in the physical world.

Bibliography

Education

Khan Academy. (2024). "Khanmigo: AI for Education." https://www.khanacademy.org/khan-labs

Documentation for Khan Academy's AI tutor that uses GPT-4 for Socratic tutoring across math, science, and humanities. Illustrates how to design AI tutoring that guides students rather than providing answers directly. Essential reference for anyone building educational AI products.
Education

Duolingo. (2024). "Duolingo Max: AI-Powered Language Learning." https://blog.duolingo.com/duolingo-max/

Describes Duolingo's integration of GPT-4 for conversational practice and grammar explanations in language learning. Shows how to embed LLMs in consumer apps with appropriate guardrails and personalization. Recommended for product designers exploring AI-augmented learning experiences.
Education Research

Kasneci, E., Seßler, K., Küchemann, S., et al. (2023). "ChatGPT for Good? On Opportunities and Challenges of Large Language Models for Education." Learning and Individual Differences, 103, 102274.

Provides a balanced academic analysis of LLM opportunities and risks in education, covering personalization, assessment, accessibility, and academic integrity concerns. Draws on educational psychology research to evaluate AI tutoring approaches. Important for educators and policymakers shaping AI education policy.
Legal AI

Choi, J.H., Hickman, K.E., Monahan, A., & Schwarcz, D. (2023). "ChatGPT Goes to Law School." SSRN 4335905.

Evaluates ChatGPT's performance on law school exams, finding it passed all courses but performed at the bottom of the class. Provides nuanced analysis of where LLMs excel (issue spotting, organization) and struggle (nuanced legal reasoning) in legal contexts. Essential for understanding AI capabilities and limits in legal education.
Productivity Research

Noy, S. & Zhang, W. (2023). "Experimental Evidence on the Productivity Effects of Generative Artificial Intelligence." Science, 381(6654), 187-192.

A rigorous randomized experiment showing that ChatGPT access increased professional writing productivity by 40% and reduced quality inequality between workers. Published in Science, this is the gold standard for measuring LLM productivity impact. Critical for business leaders quantifying AI ROI.