Summary
In this episode of the Data Engineering Podcast, Lukas Schulte, co-founder and CEO of SDF, explores the development and capabilities of this fast and expressive SQL transformation tool. From its origins as a solution for addressing data privacy, governance, and quality concerns in modern data management to its unique features like static analysis and type correctness, Lukas dives into what sets SDF apart from other tools like dbt and SQLMesh. Tune in for insights on building a business around a developer tool, the importance of community and user experience in the data engineering ecosystem, and plans for future development, including supporting Python models and enhancing execution capabilities.
Announcements

  • Hello and welcome to the Data Engineering Podcast, the show about modern data management
  • Imagine catching data issues before they snowball into bigger problems. That’s what Datafold’s new Monitors do. With automatic monitoring for cross-database data diffs, schema changes, key metrics, and custom data tests, you can catch discrepancies and anomalies in real time, right at the source. Whether it’s maintaining data integrity or preventing costly mistakes, Datafold Monitors give you the visibility and control you need to keep your entire data stack running smoothly. Want to stop issues before they hit production? Learn more at dataengineeringpodcast.com/datafold today!
  • Your host is Tobias Macey and today I'm interviewing Lukas Schulte about SDF, a fast and expressive SQL transformation tool that understands your schema

Interview

  • Introduction
  • How did you get involved in the area of data management?
  • Can you describe what SDF is and the story behind it?
    • What's the story behind the name?
  • What problem are you solving with SDF?
    • dbt has been the dominant player for SQL-based transformations for several years, with other notable competition in the form of SQLMesh. Can you give an overview of the Venn diagram for features and functionality across SDF, dbt, and SQLMesh?
  • Can you describe the design and implementation of SDF?
    • How have the scope and goals of the project changed since you first started working on it?
  • What does the development experience look like for a team working with SDF?
    • How does that differ between the open and paid versions of the product?
  • What are the features and functionality that SDF offers to address intra- and inter-team collaboration?
  • One of the challenges for any second-mover technology with an established competitor is the adoption/migration path for teams who have already invested in the incumbent (dbt in this case). How are you addressing that barrier for SDF?
    • Beyond migrating the incumbent product's core functionality, there is the tooling and communal knowledge that has grown up around that product. How are you thinking about that aspect of the current landscape?
  • What is your governing principle for what capabilities are in the open core and which go in the paid product?
  • What are the most interesting, innovative, or unexpected ways that you have seen SDF used?
  • What are the most interesting, unexpected, or challenging lessons that you have learned while working on SDF?
  • When is SDF the wrong choice?
  • What do you have planned for the future of SDF?

Contact Info

Parting Question

  • From your perspective, what is the biggest gap in the tooling or technology for data management today?

Links

The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA

The podcast and the associated cover image on this page belong to Tobias Macey. The content of the podcast is created by Tobias Macey and not by, or together with, Poddtoppen.