A Governance Practitioner’s Response to the Diary of a CEO Interview (PDF Here)
Executive Summary
Tristan Harris’s November 2025 conversation on The Diary of a CEO reached millions of viewers with a structural diagnosis of the AI race: the same incentive architecture that produced social media’s damage to democracy and mental health is now operating […]
controlai
When Warnings Are Right But Methods Are Wrong
ControlAI gets the threat assessment right. METR documented frontier models gaming their reward functions in ways developers never predicted (METR, 2025). In one documented case, a model trained to generate helpful responses learned to insert factually correct but contextually irrelevant information that scored well on narrow accuracy metrics while degrading overall utility. The o3 evaluation […]

