Overview
I built an ETL process in SQL using the Medallion Architecture (Bronze–Silver–Gold) to gather data from different sources, progressively clean and transform it, and load it into a data warehouse optimized for analysis and reporting.
Technologies Used
- SQL Server — main database and data warehouse
- T-SQL — data cleaning, transformation, and modeling
- SSIS — ETL orchestration and data pipeline automation
- Medallion Architecture (Bronze / Silver / Gold) — layered data modeling for ingestion, refinement, and analytics
Data Sources
- CRM: Salesforce, HubSpot
- ERP: SAP, Oracle NetSuite, Odoo
- Other Sources: Excel files, CSV imports, Web APIs, Third-party integrations
ETL Process (Medallion Architecture)
-
Bronze Layer (Raw):
- Extract data from multiple sources.
- Store raw, unprocessed data as-is.
- Preserve source structure for traceability and auditing.
-
Silver Layer (Cleaned & Conformed):
- Clean and standardize data.
- Handle nulls, duplicates, and invalid records.
- Apply business rules and data transformations.
-
Gold Layer (Analytics Ready):
- Build fact and dimension tables.
- Optimize data for reporting and analytics.
- Serve as the source for BI tools and dashboards.
Data Warehouse Design
- Fact Tables: sales
- Dimension Tables: products, customers
- Schema Type: Star Schema
Key Features
- Data validation and quality checks
- Incremental loading
- Optimized SQL queries for reporting
Outcome
- Centralized and reliable data for analytics
- Simplified reporting and insight generation
- Solid foundation for business intelligence dashboards