"Sequent: scale and automation for higher confidence in alignment" by Geoffrey Irving, Alex HT, Jesse Hoogland, Daniel Murfet, Jacob Pfau, Marco Cozzi, Stan van Wingerden
Artificial superintelligence (ASI) may be developed in the next few years. It is unclear whether alignment is on track to be ready on the same timeframe. At a minimum, the empirical programs at AI labs are unlikely to deliver a priori confidence, before training ASI, that things will go well. We are starting a large nonprofit research organization, Sequent, that aims to clear a higher bar:
We are aiming at higher confidence via a portfolio of theoretical and empirical bets. Any of them could fail, but each one that succeeds would give us more a priori confidence in aligned outcomes.
We are investing heavily in automation to accelerate progress on these bets.
We believe that theory unlocks higher automation. Taking a more principled approach offers better filters for deciding which directions of automated research are promising (a proof is worth a thousand experiments, and even a pseudo-proof is worth hundreds).
Who: researchers from the UK AISI's Alignment Team and Timaeus, with more to come. We’re aiming at 40-80 FTE two years from now. The Alignment Team ran the £30m Alignment Project, and Timaeus has pioneered applying singular learning theory (SLT) to alignment. [...]
---
Outline:
(00:21) Alignment is not on track
(02:40) Aiming at higher confidence
(05:30) Why a new big organization
(07:35) Different lines of research will interact
(11:35) Amortizing security and funding
(12:47) Automated alignment is possible, if not necessarily in time
(17:39) Federated structure to preserve research diversity
(18:38) Field building and broader alignment scale-up
(21:07) Independence is important
(22:40) Join us!
The original text contained 1 footnote which was omitted from this narration.
The podcast and its cover image on this page belong to LessWrong. The podcast's content is created by LessWrong and not by, or together with, Poddtoppen.