cwensel - Overview

Pinned

  1. Clusterless is a tool for scheduling decentralized, scalable, and secure data pipelines for continuously arriving data, across clouds.

    Java 15

  2. A data engineering CLI for reading and writing data to and from multiple locations in multiple formats.

    Java 9

  3. Cascading is a feature-rich API for defining and executing complex, fault-tolerant data processing flows locally or on a cluster.

    Java 353 219

  4. A CLI for diffing datasets

    Java 7

  5. A declarative API for batch processing schema-less nested data types like JSON

    Java 3 1

  6. Small, simple parsers for data cleansing or command-line argument parsing

    Java 1