Stefano Ermon is a cofounder of Inception Labs and an associate professor at Stanford. Inception is developing a new type of AI model called diffusion LLMs.
Stefano's favorite book: If on a Winter's Night a Traveler (Author: Italo Calvino)
(00:01) Introduction
(00:38) What are autoregressive LLMs and how do they work
(02:28) How diffusion LLMs rethink generation
(04:02) The ceiling of autoregressive LLMs: cost, latency, reliability
(06:19) Why diffusion LLMs are commercially viable now
(09:12) Parallel refinement: how diffusion models generate text
(12:05) Understanding diffusion steps and efficiency
(13:49) Hardest engineering challenges at Inception
(15:23) From research to production: the power of data
(16:24) Where diffusion LLMs still lag behind
(18:18) Evaluations and benchmarks for diffusion LLMs
(20:20) Developer experience and OpenAI-compatible API
(21:47) Economics and GPU efficiency
(23:38) Hardware and runtime stack
(24:58) Competition and the evolving diffusion LLM landscape
(27:01) Where diffusion will win first: coding and agentic systems
(30:13) How diffusion changes infra, serving, and hardware design
(33:04) What’s next at Inception: reasoning and multimodality
(35:20) Rapid Fire Round
The podcast and its cover art on this page belong to Prateek Joshi. The podcast's content is created by Prateek Joshi, not by or in collaboration with Poddtoppen.