Start Data Engineering - Actionable Data Engineering Tutorials

2026-04-04 3 Data Storage Techniques Every Data Engineer Should Know
2026-03-28 4 Data Engineering Concepts To Land A High-Paying Data Engineering Job
2026-03-07 How to Implement Data Quality Checks in Python Without Third-Party Tools
2026-02-15 Free Airflow 3.0 Tutorial
2026-02-08 Use Given/When/Then Specs to Make AI Generate Production-Ready Pipelines, Not Spaghetti Code
2026-01-18 Python Notebooks in Production: How marimo Solves Jupyter’s Biggest Problems for Software Engineers
2026-01-17 Demonstrate Python Expertise by Building Libraries: From Architecture to Published Package
2026-01-10 How to Write Integration Tests for Python Data Pipelines
2026-01-03 How to Create Python Data Pipelines by Defining Architecture and Generating Code with LLMs
2025-08-13 How to Use Spark SQL Merge Into - Step-by-Step Tutorial
2025-08-12 Six Data Modeling Techniques For Building Production-Ready Tables Fast
2025-08-11 Free 10-Minute Polars Tutorial for Data Engineers
2025-08-10 Free Python Standard Library How-to Cheatsheet for Data Engineers
2025-08-09 How to Get Really Good at Advanced SQL for Data Engineering
2025-08-05 How to quickly set up a local Spark development environment?
2025-06-10 Using Joins and Group Bys the right way for data warehousing
2025-06-07 CTEs(Common Table Expression) or Temporary Tables for Spark SQL
2025-06-03 Advanced SQL is knowing how to model the data & get there effectively
2025-05-05 Data Engineering Interview Preparation Series #3: SQL
2025-04-14 How to Extract Data from APIs for Data Pipelines using Python
2025-04-05 How to create an SCD2 Table using MERGE INTO with Spark & Iceberg
2025-03-18 How to quickly deliver data to business users? #1. Adv Data types & Schema evolution
2025-03-01 How to Manage Upstream Schema Changes in Data Driven Fast Moving Company
2025-02-16 Visual Studio Code (VSCode) extensions for data engineers
2025-02-10 Should Data Pipelines in Python be Function based or Object-Oriented (OOP)?
2025-02-03 How to turn a 1000-line messy SQL into a modular, & easy-to-maintain data pipeline?
2025-01-28 How to ensure consistent metrics in your warehouse
2025-01-20 Data Engineering Interview Preparation Series #2: System Design
2024-12-18 How to reference a seed from a different dbt project?
2024-11-22 What do Snowflake, Databricks, Redshift, BigQuery actually do?
2024-10-17 25 SQL tips to level up your data engineering skills
2024-10-14 How to use nested data types effectively in SQL
2024-09-23 How to decide on a data project for your portfolio
2024-09-18 How to build a data project with step-by-step instructions
2024-09-05 What are the Key Parts of Data Engineering?
2024-08-13 Data Engineering Interview Preparation Series #1: Data Structures and Algorithms
2024-07-26 How to implement data quality checks with greatexpectations
2024-07-16 What are the types of data quality checks?
2024-07-01 SQL or Python for Data Transformations?
2024-06-24 Why use Apache Airflow (or any orchestrator)?
2024-06-14 Data Engineering Projects
2024-06-12 Data Engineering Project for Beginners - Batch edition
2024-06-11 Build Data Engineering Projects, with Free Template
2024-05-30 Python Essentials for Data Engineers
2024-05-29 dbt(Data Build Tool) Tutorial
2024-05-28 Building Cost Efficient Data Pipelines with Python & DuckDB
2024-05-21 Enable stakeholder data access with Text-to-SQL RAGs
2024-05-09 How to reduce your Snowflake cost
2024-04-22 How to test PySpark code with pytest
2024-04-22 Docker Fundamentals for Data Engineers
2024-02-22 Data Engineering Best Practices - #2. Metadata & Logging
2023-12-13 Uplevel your dbt workflow with these tools and techniques
2023-11-14 What is an Open Table Format? & Why to use one?
2023-10-25 6 Steps to Avoid Messy Data in Your Warehouse
2023-07-20 Data Engineering Best Practices - #1. Data flow & Code
2023-06-30 What is a self-serve data platform & how to build one
2023-06-13 How to become a valuable data engineer
2023-05-15 Data Engineering Project: Stream Edition
2023-02-15 Change Data Capture, with Debezium
2023-01-12 Data Pipeline Design Patterns - #2. Coding patterns in Python
2022-12-11 Data Pipeline Design Patterns - #1. Data flow patterns
2022-08-11 How to gather requirements for your data project
2022-06-24 5 Steps to land a high paying data engineering job
2022-05-18 Setting up a local development environment for python data projects using Docker
2022-04-12 What is the difference between a data lake and a data warehouse?
2022-03-18 End-to-end data engineering project - batch edition
2022-02-22 Automating data testing with CI pipelines, using Github Actions
2021-12-12 How to choose the right tools for your data pipeline
2021-11-11 Setting up end-to-end tests for cloud data pipelines
2021-10-22 How to improve at SQL as a data engineer
2021-10-12 6 Responsibilities of a Data Engineer
2021-10-12 6 Key Concepts, to Master Window Functions
2021-10-12 Whats the difference between ETL & ELT?
2021-10-12 What are Common Table Expressions(CTEs) and when to use them?
2021-10-12 How to add tests to your data pipelines
2021-10-11 10 Skills to Ace Your Data Engineering Interviews
2021-10-05 What is a staging area?
2021-10-03 What is a Data Warehouse?
2021-09-16 How to Scale Your Data Pipelines
2021-08-29 Understand & Deliver on Your Data Engineering Task
2021-08-17 4 Key Patterns to Load Data Into A Data Warehouse
2021-07-21 How to Validate Datatypes in Python
2021-06-25 Designing a Data Project to Impress Hiring Managers
2021-05-13 How to make data pipelines idempotent
2021-04-26 Writing memory efficient data pipelines in Python
2021-04-08 How to gather requirements to re-engineer a legacy data pipeline
2021-03-27 How to trigger a spark job from AWS Lambda
2021-02-28 How to set up a dbt data-ops workflow, using dbt cloud and Snowflake
2021-02-13 Apache Superset Tutorial
2021-02-07 How to Join a fact and a type 2 dimension (SCD2) table
2021-01-30 How to update millions of records in MySQL?
2021-01-16 How to unit test sql transforms in dbt
2021-01-06 How to Backfill a SQL query using Apache Airflow
2021-01-01 How to do Change Data Capture (CDC), using Singer
2020-11-08 How to Pull Data from an API, Using AWS Lambda
2020-10-12 How to submit Spark jobs to EMR cluster from Airflow
2020-07-26 Ensuring Data Quality, With Great Expectations
2020-07-11 Designing a “low-effort” ELT system, using stitch and dbt
2020-06-19 3 Key techniques, to optimize your Apache Spark code
2020-06-11 What, why, when to use Apache Kafka, with an example
2020-06-02 A proven approach to land a Data Engineering job
2020-05-02 What Does It Mean for a Column to Be Indexed
2020-04-25 Advantages of Using dbt(Data Build Tool)
2020-04-18 Apache Airflow Review: the good, the bad
2020-04-11 Review: Building a Real Time Data Warehouse
2020-04-05 3 Key Points to Help You Partition Late Arriving Events
2020-03-29 Scheduling a SQL script, using Apache Airflow, with an example
2020-03-20 10 Key skills, to help you become a data engineer