Data Warehousing
ETL Orchestration Pipeline
Airflow-based ETL pipeline with multi-source extraction, dbt transformations on a DuckDB warehouse, SCD Type 2 snapshots, data quality checks, and lineage tracking.
Overview
Production ETL pipeline with orchestration, transformation, and data quality monitoring.
Architecture
- Apache Airflow for DAG orchestration
- Multi-source extraction (CSV, REST API, SQLite)
- dbt transformations on DuckDB
- SCD Type 2 (slowly changing dimension) snapshots
- Data quality checks and lineage tracking
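As a rough illustration of how the components above could be ordered, the sketch below models the pipeline as a dependency graph and derives one valid run order. It uses only Python's standard-library `graphlib` rather than the Airflow API, and the task names (`extract_csv`, `transform`, `snapshot`, and so on) are hypothetical, not taken from the project's actual DAG.

```python
from graphlib import TopologicalSorter

# Hypothetical task names; each task maps to the set of tasks it depends on.
PIPELINE = {
    "extract_csv": set(),
    "extract_api": set(),
    "extract_sqlite": set(),
    "transform": {"extract_csv", "extract_api", "extract_sqlite"},
    "snapshot": {"transform"},
    "quality_checks": {"snapshot"},
}


def execution_order(graph):
    """Return one valid run order in which every task follows its dependencies."""
    return list(TopologicalSorter(graph).static_order())


order = execution_order(PIPELINE)
print(order)
```

The three extraction tasks have no dependencies on each other, so an orchestrator is free to run them in parallel; only the transform step has to wait for all of them.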
Key Features
- Automated pipeline scheduling
- Data quality validation gates
- Lineage tracking and documentation
- SCD Type 2 historical snapshots
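To make the validation gate and SCD Type 2 features concrete, here is a minimal sketch in plain Python. It is not the project's implementation, which relies on dbt and DuckDB; the function names and the `valid_from` / `valid_to` columns are assumptions chosen for illustration. The gate rejects a batch with missing or duplicate keys, and the SCD Type 2 step closes the previous version of a changed row before appending the new one.

```python
from datetime import date


def check_quality(rows, key):
    """Validation gate: raise if any key is missing or duplicated."""
    keys = [r.get(key) for r in rows]
    if any(k is None for k in keys):
        raise ValueError(f"null {key} in incoming batch")
    if len(keys) != len(set(keys)):
        raise ValueError(f"duplicate {key} in incoming batch")


def apply_scd2(history, incoming, key, as_of):
    """Close changed current rows and append new versions (SCD Type 2)."""
    check_quality(incoming, key)
    # Current versions are the rows whose validity interval is still open.
    current = {r[key]: r for r in history if r["valid_to"] is None}
    for row in incoming:
        old = current.get(row[key])
        if old is not None:
            attrs = {k: v for k, v in old.items()
                     if k not in ("valid_from", "valid_to")}
            if attrs == row:
                continue  # unchanged: keep the current version open
            old["valid_to"] = as_of  # changed: close the previous version
        history.append({**row, "valid_from": as_of, "valid_to": None})
    return history


# Hypothetical sample data: one customer who moves between two loads.
history = []
apply_scd2(history, [{"customer_id": 1, "city": "Oslo"}],
           "customer_id", date(2024, 1, 1))
apply_scd2(history, [{"customer_id": 1, "city": "Bergen"}],
           "customer_id", date(2024, 2, 1))
```

After the second load, `history` holds both versions of the customer: the Oslo row closed on the second load date and the Bergen row still open. Reloading an unchanged row adds nothing, which keeps repeated pipeline runs idempotent.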