datAnir - Overview
Popular repositories
-
Spark-Programming-In-Python Public
Forked from LearningJournal/Spark-Programming-In-Python
Apache Spark 3 - Spark Programming in Python for Beginners
Python
-
deequ Public
Forked from awslabs/deequ
Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets.
Scala
-
system-design-primer Public
Forked from donnemartin/system-design-primer
Learn how to design large-scale systems. Prep for the system design interview. Includes Anki flashcards.
Python
-
Skytrax-Data-Warehouse Public
Forked from iam-mhaseeb/Skytrax-Data-Warehouse
A full data warehouse infrastructure with ETL pipelines running inside docker on Apache Airflow for data orchestration, AWS Redshift for cloud data warehouse and Metabase to serve the needs of data…
Python
-
coding-challenge-pyspark Public
Forked from rubenwap/coding-challenge-pyspark
Small test to learn how to use pyspark
Python