LLMs in Government & Public Sector

Constituent services, regulatory drafting, FOIA processing, benefits eligibility, fraud detection. Procurement, accountability, and the unique constraints of building AI for the public.

"Government LLMs serve the citizens who pay for them, not the vendor who built them."

Compass, Public-Interest AI Agent

Looking Back

Chapter 71 covered cybersecurity. This chapter covers government: public-sector deployments, defense, civic services, accessibility, FOIA and records management, and the procurement, transparency, and oversight requirements that public-sector AI now carries.

Big Picture

Government AI sits in a peculiar position: enormous potential value (every form, every benefits determination, every regulation touches millions of people) paired with uniquely tight constraints (administrative-law due process, public-records obligations, procurement timelines longer than a model's shelf life). Successful public-sector LLM deployments in 2025-2026 share a common shape: narrow scope, conservative model choice, aggressive human-in-the-loop review, and explicit accountability for who decided what when something goes wrong. Pilots that ignored any of those four tended to end up in the news.

The U.S. federal anchor is OpenAI's ChatGPT Gov (January 2025), a tenant designed for U.S. federal employees, alongside the GSA's GovGPT pilot for procurement Q&A. The UK anchor is the GDS GOV.UK Chat pilot (2024), which tested a grounded assistant against guidance pages. A concrete deployment-without-headlines example is the IRS Direct File pilot (2024), where LLM-assisted tax-code lookups stayed strictly in a support role and never adjudicated taxpayer decisions. That boundary between support and adjudication is what keeps administrative-law due-process objections at bay.

Section 72.1 walks through the use cases that ship. Section 72.2 covers the failure modes. Section 72.3 covers OMB M-24-10, FedRAMP, and the broader regulatory framework. Section 72.4 walks through the public-sector grounded-assistant architecture. Section 72.5 closes with vendors and postmortems.

Chapter Overview

Government LLM deployment carries unique constraints: due-process obligations, FedRAMP and Section 508 compliance, public-records exposure, accountability under FOIA, and the mismatch between procurement timelines and model release cycles. This chapter walks through the use cases that actually work (constituent service triage, FOIA, regulatory drafting, benefits pre-screening, fraud detection, knowledge search) and the failure modes specific to government (NYC MyCity, due-process problems, accessibility failures). It then covers the regulatory framework (OMB M-24-10, FedRAMP, Section 508, EU AI Act, state inventory laws, NIST AI RMF) and the public-sector grounded-assistant architecture (strict-scope retrieval, citations always, refusal by default, audit log, accessibility-first). It closes with the vendor landscape and the postmortems from NYC MyCity, Michigan MiDAS, Dutch SyRI, and Australian Robodebt.

Government AI is the industry where transparency and accountability are not optional. This chapter teaches what works, what fails, and what the procurement, civil-rights, and accessibility rules actually require.
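
To make the grounded-assistant pattern named in the overview concrete, here is a minimal sketch of three of its elements: strict-scope retrieval, citations on every answer, and refusal by default, with each interaction written to an audit log. Everything in it is an assumption made for illustration: the `Passage` and `Answer` types, the keyword-overlap scoring, the `min_overlap` threshold, the stopword list, and the sample guidance text are hypothetical, and the assistant simply quotes the best-matching passage rather than calling a model. Accessibility is not modeled here.

```python
import json
import re
from dataclasses import asdict, dataclass
from datetime import datetime, timezone

# Hypothetical refusal message; a real deployment would route to a human channel.
REFUSAL = "I can't answer that from the approved guidance. Please contact the agency."

# Small illustrative stopword list so filler words do not count as evidence.
STOPWORDS = {"a", "an", "the", "is", "are", "for", "on", "of", "to", "in",
             "what", "when", "how", "do", "i", "my", "can"}


@dataclass(frozen=True)
class Passage:
    source_id: str  # hypothetical citation key, e.g. a guidance page anchor
    text: str


@dataclass
class Answer:
    text: str
    citations: list[str]
    refused: bool


def _tokens(s: str) -> set[str]:
    """Lowercased word tokens with stopwords removed."""
    return set(re.findall(r"[a-z0-9]+", s.lower())) - STOPWORDS


def retrieve(question: str, corpus: list[Passage], min_overlap: int = 2) -> list[Passage]:
    """Strict-scope retrieval: only approved passages, only above a threshold."""
    q = _tokens(question)
    scored = [(len(q & _tokens(p.text)), p) for p in corpus]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [p for score, p in scored if score >= min_overlap]


def answer(question: str, corpus: list[Passage], audit_log: list[str],
           min_overlap: int = 2) -> Answer:
    """Answer only from a retrieved passage; otherwise refuse. Always log."""
    hits = retrieve(question, corpus, min_overlap)
    if hits:
        top = hits[0]
        result = Answer(text=top.text, citations=[top.source_id], refused=False)
    else:
        result = Answer(text=REFUSAL, citations=[], refused=True)
    # Audit record: who asked what, what was returned, and on what basis.
    audit_log.append(json.dumps({
        "ts": datetime.now(timezone.utc).isoformat(),
        "question": question,
        **asdict(result),
    }))
    return result


# Hypothetical approved corpus for the usage example.
CORPUS = [
    Passage("permit-guide#renewal",
            "Permit renewal applications are due 30 days before the permit expires."),
    Passage("permit-guide#fees",
            "The renewal fee for a standard permit is listed on the fee schedule."),
]
```

Calling `answer("When is my permit renewal due?", CORPUS, log)` with an empty list `log` returns the renewal passage with its citation, while an out-of-scope question such as one about employment law matches nothing and gets the refusal. The design choice worth noticing is that refusal is the `else` branch, not an exception: the assistant has to earn an answer by finding grounded text, and both outcomes leave an audit record.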

What Comes Next

The strict-grounded-retrieval pattern developed for government has a parallel in the next chapter: OT (operational technology) isolation in manufacturing. Chapter 73 covers an industry where the human-in-the-loop boundary runs between IT and OT zones rather than between support and adjudication in administrative-law decisions.

Further Reading

Public-Sector AI Governance

National Institute of Standards and Technology (2024). "Artificial Intelligence Risk Management Framework: Generative AI Profile (NIST AI 600-1)." NIST. The U.S. federal reference framework that government LLM deployments are typically mapped to, and the baseline for the strict-grounded-retrieval pattern.
European Parliament & Council of the European Union (2024). "Regulation (EU) 2024/1689 (Artificial Intelligence Act)." Official Journal of the European Union. EUR-Lex. The EU AI Act, which classifies many public-sector uses as high-risk and is the regulatory force behind the audit and explainability requirements in this chapter.

Algorithmic Decisions & Postmortems

Eubanks, V. (2018). Automating Inequality: How High-Tech Tools Profile, Police, and Punish the Poor. St. Martin's Press. Macmillan. The reference field study (Indiana eligibility, LA homeless triage, Allegheny CYF) on what fails when administrative decisions are automated, and a canonical reading-list item for any government-AI program manager.
Henman, P. (2020). "Improving Public Services Using Artificial Intelligence: Possibilities, Pitfalls, Governance." Asia Pacific Journal of Public Administration. Taylor & Francis. An academic synthesis of public-sector AI lessons from Robodebt, SyRI, and MiDAS, which the postmortem callouts in this chapter draw on.