Data science is an interdisciplinary field that uses scientific methods, processes, algorithms, and systems to extract knowledge from structured and unstructured data. Data scientists analyze and prepare data, and their findings inform high-level decisions in many organizations.

Here are 65,783 public repositories matching this topic...

Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more

  • Updated Apr 1, 2026
  • Python

Build and share delightful machine learning apps, all in Python. 🌟 Star to support our work!

  • Updated Apr 1, 2026
  • Python

Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.

  • Updated Apr 1, 2026
  • Python

Machine Learning From Scratch. Bare bones NumPy implementations of machine learning models and algorithms with a focus on accessibility. Aims to cover everything from linear regression to deep learning.

  • Updated Oct 15, 2023
  • Python

Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.

  • Updated Mar 20, 2024
  • Python

applied-ml

📚 Papers & tech blogs by companies sharing their work on data science & machine learning in production.

  • Updated Jul 18, 2024