The Completion Illusion: Why AI Agents Overclaim Done, and the Case for an Agent Control Tower

Authors: Han Kim
Papers: IOV Labs · open study · 10pp · 2026-06-15

Abstract

As language-model agents take on multi-step work, the systems around them increasingly trust the agent's own report that a task is finished. We test whether that report is true. Across 896 verifiable micro-task instances spanning four models and two capability tiers, agents self-report a perfect score on every run, while actual accuracy ranges from 86 to 96 percent. The false-completion rate is capability-tiered: small, cheap models overclaim by about 13 percent, while frontier models are nearly calibrated. The certified errors concentrate in character-level tasks (62 to 78 percent), while arithmetic is perfect. A managed "register, do one-by-one, then re-check" protocol does not fix the problem: a model cannot reliably audit its own completion, and asking it to do so re-applies the same blind spot. We also report an honest null: the protocol neither improves task accuracy nor reduces omission on current models. The implication is structural. Completion cannot be trusted from the model; it must be verified at the system layer. We connect this to the emerging agent control tower pattern, in which a board, a calendar, and a server-enforced workflow externalize an agent's state and gate its transitions. We place that pattern on a maturity ladder and argue that the open frontier, and the moat, is verified completion: turning "done" from a claim into evidence.
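The distinction between a claimed and a verified completion can be made concrete with a small sketch. The Python below is illustrative only and is not the study's harness: the `Run` record, the `false_completion_rate` definition (share of claimed-done runs that fail an external check), the state names, and the two letter-counting tasks are all assumptions chosen to show the idea of a system-layer gate.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Run:
    """One agent run on a verifiable micro-task (hypothetical record shape)."""
    task_id: str
    claimed_done: bool  # what the agent reports about itself
    answer: str         # what the agent actually produced


def false_completion_rate(runs: list[Run], verify: Callable[[Run], bool]) -> float:
    """Share of claimed-done runs whose answer fails the external check.

    This definition is an assumption made for illustration.
    """
    claimed = [r for r in runs if r.claimed_done]
    if not claimed:
        return 0.0
    failed = sum(1 for r in claimed if not verify(r))
    return failed / len(claimed)


def gate_transition(run: Run, verify: Callable[[Run], bool]) -> str:
    """System-layer gate: the agent's claim alone never moves a task to 'done'."""
    if not run.claimed_done:
        return "in_progress"
    return "done" if verify(run) else "needs_review"


# Illustrative character-level tasks: count the letter "r" in a word.
# The ground truth is computed by the system, not taken from the agent.
EXPECTED = {
    "t1": str("strawberry".count("r")),
    "t2": str("mirror".count("r")),
}


def check(run: Run) -> bool:
    """External verifier: compare the agent's answer with the ground truth."""
    return run.answer.strip() == EXPECTED[run.task_id]


# Made-up runs: both claim completion, only the first answer is correct.
runs = [Run("t1", True, "3"), Run("t2", True, "2")]

rate = false_completion_rate(runs, check)
states = [gate_transition(r, check) for r in runs]
```

In this sketch the agent's self-report is only a trigger for verification; the transition to a finished state depends on the external check, which is the sense in which completion becomes evidence rather than a claim.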

Keywords

AI agents
task management
control plane
MCP
self-report
verification
principal-agent
Goodhart
agent control tower
reproducibility
