Scaling Laws For Scalable Oversight

Apr 25, 2025
David Baek, Joshua Engels, Subhash Kantamneni, Max Tegmark
Abstract
For the first time, we propose a framework that quantifies the probability of successful oversight as a function of the capabilities of the overseer and the system being overseen. We also find scaling laws in four different oversight games that approximate how domain performance depends on general AI system capability.
Publication
In NeurIPS 2025 (Under Review)