GitHub - dseeni/Coroutine_Multi-Tasking_Data_Parser: Cooperative Multitasking Using Generator Based Coroutines in Python...

  • Coroutine-Based Data Processing Pipeline:
    • Concurrency at every stage of the data processing pipeline...

    • Process any number of files, of any length, against any number of filters

    • Multitask across all input files, broadcasting every row concurrently

    • Evaluate each row against any number of filters per input file

    • Dynamic data type inference and concurrent data parsing

    • Robust error handling of potentially unclean input data

    • Multiple date formats supported via a user-defined library of format keys

    • Generate user-defined named tuple classes for each row processed

    • Multi-file output supports custom output directory and file names
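
The features above can be illustrated with a minimal sketch of a generator-based coroutine pipeline. This is not the repository's actual code: every name here (`coroutine`, `infer`, `parse`, `broadcast`, `keep_if`, `collect`, `run`, `DATE_FORMATS`) is hypothetical and chosen only for illustration. It assumes rows arrive as tuples of raw strings, and it shows type inference, named tuple generation, skipping of malformed rows, and broadcasting each row to several filters.

```python
from collections import namedtuple
from datetime import datetime
from functools import wraps

# Hypothetical format library; a real one would be supplied by the user.
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y")


def coroutine(fn):
    """Prime a generator-based coroutine so it can accept send() at once."""
    @wraps(fn)
    def primed(*args, **kwargs):
        gen = fn(*args, **kwargs)
        next(gen)  # advance to the first yield
        return gen
    return primed


def infer(value):
    """Best-effort cast of a raw string: int, float, date, else str."""
    text = value.strip()
    for cast in (int, float):
        try:
            return cast(text)
        except ValueError:
            pass
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date()
        except ValueError:
            pass
    return text


@coroutine
def collect(results):
    """Sink: append every received row to the given list."""
    while True:
        results.append((yield))


@coroutine
def keep_if(predicate, target):
    """Filter stage: forward only rows that satisfy the predicate."""
    while True:
        row = yield
        if predicate(row):
            target.send(row)


@coroutine
def broadcast(targets):
    """Fan-out stage: send each row to every downstream target."""
    while True:
        row = yield
        for target in targets:
            target.send(row)


@coroutine
def parse(header, target):
    """Parser stage: turn raw string fields into a typed named tuple."""
    Row = namedtuple("Row", header)
    while True:
        raw = yield
        try:
            row = Row(*(infer(field) for field in raw))
        except TypeError:
            continue  # wrong number of fields: skip the malformed row
        target.send(row)


def run(header, raw_rows, predicates):
    """Wire the stages together and return one result list per filter."""
    outputs = [[] for _ in predicates]
    filters = [keep_if(p, collect(out)) for p, out in zip(predicates, outputs)]
    pipeline = parse(header, broadcast(filters))
    for raw in raw_rows:
        pipeline.send(raw)
    return outputs
```

Each stage runs only when a row is sent to it and then suspends at its next `yield`, which is what makes the multitasking cooperative. Handling several input files would amount to building one such pipeline per file and alternating `send()` calls between them; writing results to files would replace the `collect` sink.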