Sumit3301 - Overview

Sumit Gaur

Data Engineer | Apache Spark | Databricks | Cloud Data Platforms

Designing scalable data pipelines and distributed data systems


Professional Summary

Data Engineer with hands-on experience building scalable ETL/ELT pipelines, optimizing distributed data processing with Apache Spark, and deploying production-grade workflows on Databricks and cloud platforms.

Strong foundation in data modeling, performance tuning, and automation.


Core Competencies

Data Engineering

  • Apache Spark (PySpark)
  • Databricks
  • Delta Lake
  • ETL / ELT pipeline development
  • Data modeling
  • SQL & performance optimization

Cloud & Infrastructure

  • Google Cloud Platform (GCP)
  • Docker
  • CI/CD pipelines
  • Linux
  • Git

Programming

  • Python
  • SQL
  • Bash scripting

Machine Learning (Applied)

  • Scikit-learn
  • TensorFlow
  • PyTorch

Certifications

  • Databricks Certified Data Engineer Associate
  • Databricks Certified Data Engineer Professional

Publications

Technical articles:
https://medium.com/@sumit.gaur1999


Contact

Email: sumit.gaur1999@gmail.com
LinkedIn: https://linkedin.com/in/sumit-gaur-301a50186/
Twitter: https://twitter.com/sumithere3301

