“OpenAI Preparedness Framework 2.0” by Zvi

Right before releasing o3, OpenAI updated its Preparedness Framework to 2.0.

I previously wrote an analysis of the Preparedness Framework 1.0. I still stand by essentially everything I wrote in that analysis, which I reread before reading the 2.0 framework. If you want to dive deep, I recommend starting there, as this post will focus on changes from 1.0 to 2.0.

As always, I thank OpenAI for the document, and for laying out their approach and plans.

I have several fundamental disagreements with the thinking behind this document.

In particular:

The Preparedness Framework only applies to specific named and measurable things that might go wrong. It requires identification of a particular threat model that is all of: plausible, measurable, severe, net new and (instantaneous or irremediable).

The Preparedness Framework thinks ‘ordinary’ mitigation defense-in-depth strategies will be sufficient to handle High-level threats and likely even Critical-level [...]

---

Outline:

(02:05) Persuaded to Not Worry About It

(08:55) The Medium Place

(10:40) Thresholds and Adjustments

(16:08) Release the Kraken Anyway, We Took Precautions

(20:16) Misaligned!

(23:47) The Safeguarding Process

(26:43) But Mom, Everyone Is Doing It

(29:36) Mission Critical

(30:37) Research Areas

(32:26) Long-Range Autonomy