We explore how a third-party logistics platform built its entire data orchestration layer on Airflow, and what that makes possible for developer teams and merchant-facing products alike.


Filip Kunčar, Platform Director at ShipMonk Product Development, discusses migrating from a closed-source tool to Airflow, orchestrating dbt with both Cosmos and the BashOperator, and using Airflow to power customer-facing data delivery.


Key Takeaways:


00:00 Introduction.

01:07 ShipMonk is a third-party logistics company guaranteeing two-day delivery across the US. The data platform team's mission is to lower cognitive load for developers working with data. 

05:13 ShipMonk migrated to Airflow in 2022, moving away from a closed-source UI-based tool, driven by the need for a code-first approach, open-source extensibility and broad cloud provider support.

10:02 The team uses Cosmos for developer-facing visibility and lineage, and the BashOperator for internal pipelines where runtime performance matters.

12:20 Switching from Cosmos to the BashOperator for a frequently running pipeline reduced runtime from over 15 minutes to three minutes. 

13:14 Because the full dbt chain runs inside Airflow, a configurable downstream DAG can deliver processed data directly to each merchant's preferred destination, with secrets management and SLA tracking already handled. 

15:03 Per-team alerting is hooked to each DAG by owner and severity, so teams can react to SLA breaches immediately. 

18:09 ShipMonk combines Airflow and AI in three ways: authoring DAGs faster with skills, orchestrating AI workloads in Lambda and containers, and using Astronomer's skills repo to simplify Airflow version upgrades.
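
For readers who want a concrete picture of the per-team alerting mentioned at 15:03, here is a minimal sketch in plain Python of routing an SLA breach by DAG owner and severity. It is an illustration only, not ShipMonk's implementation: the routing table, the channel names and the SlaBreach, route_alert and format_alert helpers are all hypothetical, and a real setup would presumably hook this kind of lookup into Airflow's own callback mechanisms.

```python
from dataclasses import dataclass

# Hypothetical routing table: (DAG owner, severity) -> notification channel.
# Team and channel names are invented for illustration.
ROUTES = {
    ("platform", "critical"): "pager:platform-oncall",
    ("platform", "warning"): "chat:#platform-alerts",
    ("merchant-data", "critical"): "pager:merchant-data-oncall",
}

# Fallback for any owner/severity pair without an explicit route.
DEFAULT_CHANNEL = "chat:#data-alerts"


@dataclass(frozen=True)
class SlaBreach:
    """Minimal description of an SLA breach on one DAG (hypothetical shape)."""
    dag_id: str
    owner: str
    severity: str


def route_alert(breach: SlaBreach) -> str:
    """Return the channel that should receive this breach."""
    return ROUTES.get((breach.owner, breach.severity), DEFAULT_CHANNEL)


def format_alert(breach: SlaBreach) -> str:
    """Build a one-line message that names the DAG, its owner and the channel."""
    channel = route_alert(breach)
    return (
        f"[{breach.severity.upper()}] SLA breach on {breach.dag_id} "
        f"(owner: {breach.owner}) -> {channel}"
    )
```

Keeping the routing in a single lookup keyed on owner and severity means a new team or escalation level is a one-line change, and any pair without a route still lands in a shared default channel rather than being dropped.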


Resources Mentioned:


Filip Kunčar

https://www.linkedin.com/in/filipkuncar/


ShipMonk Product Development

https://www.linkedin.com/company/shipmonk-product-development/


ShipMonk | Website

http://www.shipmonk.com


Astronomer Cosmos

http://www.astronomer.io/cosmos


Astronomer AI Skills Repo

http://www.github.com/astronomer/airflow-llm-providers-demo


Datadog

http://www.datadoghq.com




Thanks for listening to “The Data Flowcast: Mastering Apache Airflow® for Data Engineering and AI.” If you enjoyed this episode, please leave a 5-star review to help get the word out about the show. And be sure to subscribe so you never miss any of the insightful conversations.



#AI #Automation #Airflow #MachineLearning

The podcast and the accompanying cover art on this page belong to Astronomer. The podcast's content is created by Astronomer and not by, or together with, Poddtoppen.