New Java BigData Libraries 2026
last commit 7 hours ago apache/beam 8K +11
added 5 months ago
Apache Beam is a unified programming model for batch and streaming data processing.
last commit 11 months ago alluxio/alluxio 7K +4
added 9 months ago
Alluxio Open Source (formerly known as Tachyon) is a distributed caching platform for large-scale data.
last commit 11 hours ago jamesmudd/jhdf 169 +1
added 9 months ago
A pure Java HDF5 library.
last commit 1 day ago apache/hadoop 15K +5
added 1 year ago
Apache Hadoop is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models.
last commit 1 day ago apache/spark 43K +58
added 1 year ago
Apache Spark is a unified analytics engine for large-scale data processing.
Java AI / ML Libraries BigData Libraries Data Platforms DataFrame Libraries
last commit 1 day ago apache/flink 25K +33
added 1 year ago
A stream processing framework with powerful stream- and batch-processing capabilities.
Java Data Platforms BigData Libraries Batch Processing Libraries