[1] We will likely have near-superhuman mathematics AI by Q1 2027.
[2] Qualitatively, AI mathematics capabilities are developing significantly faster than automated AI R&D capabilities.
[3] Thus, we will likely have a period during which our ability to rigorously and usefully verify and understand model behavior and model outputs improves faster than capabilities themselves.
[4] Our ability to take advantage of this period is bottlenecked by the quality of our specification-generation infrastructure, our elicitation tooling (for proofs, specs, etc.), and our institutional capacity for scaling useful outputs with capital.
[5] My understanding is that basically no one is working on building infra that can usefully turn >100 million dollars of compute credits into safety-relevant mathematical output.
[5.1] The number of theory-driven ASI alignment efforts is also comparatively minuscule. ARC is a much better bet now than it was in 2023.
[5.2] My understanding is also that no one is working on developing AI-powered conceptual tooling infrastructure for tackling problems in, for instance, [metaphilosophy](https://www.alignmentforum.org/posts/EByDsY9S3EDhhfFzC/some-thoughts-on-metaphilosophy). This is a much harder problem.
[6] In worlds where alignment is easy, prosaic methods may [...]
The original text contained 3 footnotes which were omitted from this narration.
The podcast and its cover image on this page belong to LessWrong. The content of the podcast is created by LessWrong and not by, or together with, Poddtoppen.