Who this track is for

The Data Engineer track is for engineers who want to build the infrastructure that powers data teams. You’ll learn the modern data stack end to end — from ingestion and transformation to streaming and governance. Best fit if you:
  • Come from a software engineering or analytics background
  • Want to build scalable pipelines, warehouses, and real-time systems
  • Are interested in the infrastructure that makes ML and analytics possible

Curriculum

Level 1 — Modern Data Stack Foundations (free)

  • Data Warehousing: Dimensional modelling, star schema, Snowflake/BigQuery
  • dbt Fundamentals: Models, tests, documentation, lineage
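Dimensional modelling boils down to joining a wide fact table to small dimension tables. A minimal, runnable sketch of a star-schema query, using SQLite in place of Snowflake/BigQuery (the `fact_orders` and `dim_customer` tables are hypothetical examples, not part of the course material):

```python
import sqlite3

# Hypothetical star schema: one fact table, one dimension table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_customer (customer_id INTEGER PRIMARY KEY, region TEXT);
    CREATE TABLE fact_orders  (order_id INTEGER PRIMARY KEY,
                               customer_id INTEGER REFERENCES dim_customer,
                               amount REAL);
    INSERT INTO dim_customer VALUES (1, 'EMEA'), (2, 'APAC');
    INSERT INTO fact_orders  VALUES (10, 1, 120.0), (11, 1, 80.0), (12, 2, 50.0);
""")

# A typical analytical query: join the fact to the dimension, then aggregate.
rows = conn.execute("""
    SELECT d.region, SUM(f.amount) AS revenue
    FROM fact_orders f
    JOIN dim_customer d USING (customer_id)
    GROUP BY d.region
    ORDER BY d.region
""").fetchall()
print(rows)  # [('APAC', 50.0), ('EMEA', 200.0)]
```

The same shape scales up: in a warehouse, `fact_orders` would have billions of rows and the dimensions would carry the descriptive attributes you group and filter by.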

Level 2 — Pipeline Engineering

  • Orchestration: Airflow DAGs, task dependencies, backfill, SLAs
  • Ingestion Patterns: CDC, batch vs streaming, Airbyte, Fivetran patterns
  • Advanced dbt: Incremental models, snapshots, macros, packages
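At its core, orchestration means running tasks in dependency order. A minimal sketch of that idea using Python's standard-library `graphlib` (the task names are illustrative, not an Airflow API; Airflow adds scheduling, retries, backfill, and SLAs on top of exactly this kind of graph):

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each task maps to the set of tasks it depends on,
# mirroring how an Airflow DAG wires tasks together.
deps = {
    "load_raw": {"extract"},
    "dbt_run": {"load_raw"},
    "quality_check": {"load_raw"},
    "publish": {"dbt_run", "quality_check"},
}

# static_order() yields a valid execution order: every task appears
# after all of its upstream dependencies.
order = list(TopologicalSorter(deps).static_order())
print(order)
```

Note that `dbt_run` and `quality_check` both depend only on `load_raw`, so an orchestrator is free to run them in parallel.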

Level 3 — Distributed Computing

  • Spark Fundamentals: RDDs, DataFrames, partitioning, optimisation
  • Spark in Production: Cluster management, cost optimisation, debugging
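Partitioning is the idea underneath Spark's distributed RDDs and DataFrames: rows with the same key are routed to the same partition, so per-key operations can run without moving data again. A toy sketch of hash partitioning in plain Python (illustrative only, not Spark's API):

```python
def hash_partition(records, num_partitions, key=lambda r: r[0]):
    """Route each record to a partition by hashing its key."""
    partitions = [[] for _ in range(num_partitions)]
    for record in records:
        # Same key -> same partition, which is what lets grouping and
        # joining by key proceed partition-by-partition.
        idx = hash(key(record)) % num_partitions
        partitions[idx].append(record)
    return partitions

orders = [("alice", 120.0), ("bob", 80.0), ("alice", 50.0)]
parts = hash_partition(orders, num_partitions=4)
```

Both `alice` records always land in the same partition, whichever one the hash picks. Skewed keys (one customer with most of the rows) overload a single partition, which is where Spark optimisation work usually starts.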

Level 4 — Streaming & Lakehouses

  • Kafka & Streaming: Producers, consumers, Kafka Streams, exactly-once
  • Lakehouse Architecture: Delta Lake, Iceberg, data contracts, governance
  • Data Quality: Great Expectations, anomaly detection, alerting
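Data quality checks are essentially predicates evaluated over rows. A hand-rolled sketch of the kind of expectations a tool like Great Expectations automates (function names and the row data are illustrative, not the library's API):

```python
# Two hypothetical checks in the style of declarative data-quality suites.
def expect_not_null(rows, column):
    failures = [r for r in rows if r.get(column) is None]
    return {"check": f"{column} not null", "passed": not failures}

def expect_between(rows, column, low, high):
    failures = [r for r in rows
                if r.get(column) is not None and not (low <= r[column] <= high)]
    return {"check": f"{column} in [{low}, {high}]", "passed": not failures}

rows = [
    {"order_id": 1, "amount": 120.0},
    {"order_id": 2, "amount": None},   # missing value -> null check fails
    {"order_id": 3, "amount": -5.0},   # out of range  -> range check fails
]

results = [
    expect_not_null(rows, "amount"),
    expect_between(rows, "amount", 0, 10000),
]
failed = [r["check"] for r in results if not r["passed"]]
```

In a real pipeline, a failed check would trigger an alert or block the downstream load rather than just collect names in a list.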

Level 5 — Capstone

Design and build a complete data platform: ingestion, transformation (dbt), orchestration (Airflow), and a downstream analytics or ML use case. Reviewed by your cohort on architecture and data quality.