Scaling Laws For Scalable Oversight

Apr 25, 2025

David Baek

Joshua Engels

Subhash Kantamneni

Max Tegmark




Abstract

For the first time, we propose a framework that quantifies the probability of successful oversight as a function of the capabilities of the overseer and the system being overseen. We also find scaling laws in four different oversight games that approximate how domain performance depends on general AI system capability.
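As a purely illustrative sketch of what "probability of successful oversight as a function of the capabilities of the overseer and the system being overseen" could look like, the snippet below uses an Elo-style logistic curve over a capability gap. The logistic form, the function name `oversight_success_probability`, the rating values, and the `scale` parameter are all assumptions made for illustration. The abstract does not specify the functional form, and this is not presented as the authors' method.

```python
def oversight_success_probability(
    overseer_rating: float,
    system_rating: float,
    scale: float = 400.0,
) -> float:
    """Hypothetical Elo-style estimate of oversight success.

    Assumption: success probability is a logistic function of the rating
    gap between the overseer and the overseen system. Ratings and `scale`
    are illustrative and are not taken from the paper.
    """
    gap = overseer_rating - system_rating
    return 1.0 / (1.0 + 10.0 ** (-gap / scale))


if __name__ == "__main__":
    # Sweep the capability gap while holding the overseen system fixed.
    for gap in (-400, -200, 0, 200, 400):
        p = oversight_success_probability(1000 + gap, 1000)
        print(f"capability gap {gap:+d}: P(success) = {p:.3f}")
```

Under this assumed form, an overseer that matches the overseen system succeeds half the time, and the probability falls smoothly as the overseen system pulls ahead.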

Type

Conference paper

Publication

In NeurIPS 2025 (Spotlight)
