Benny Chen is the cofounder of Fireworks AI, an AI infrastructure platform. The company has raised $327M in funding from Benchmark, Sequoia, Lightspeed, Index, and others.
Benny's favorite book: Principles (Author: Ray Dalio)
(00:01) Intro and why AI infrastructure is having a moment
(00:06) Training vs inference: what’s working and where the real bottlenecks are
(01:25) Why inference is the hard problem in production
(03:30) What breaks at scale when AI systems hit real users
(05:29) GPUs, hardware constraints, and why power is now a first-class concern
(06:02) What you’re actually paying for in inference
(07:21) Reliability, compliance, and enterprise expectations
(09:49) Training and inference capacity: when they blur together
(11:06) How to make inference fast in practice
(13:06) System design choices behind modern inference platforms
(15:28) Inference economics and cost tradeoffs
(18:02) When fine-tuning actually makes sense
(21:58) What “best model” really means for real companies
(24:25) Production LLM architectures that actually work
(27:46) Building an AI infra company customers can trust
(29:27) Shipping fast without breaking reliability
(31:14) Go-to-market lessons for infra startups
(34:17) Where inference platforms are heading next
(36:32) Rapid fire round
The podcast and the accompanying cover art on this page belong to Prateek Joshi. The podcast's content is created by Prateek Joshi and not by, or together with, Poddtoppen.