🌪️ Vortex

📚 Documentation | 📊 Performance Benchmarks

Overview

Vortex is a next-generation columnar file format and toolkit designed for high-performance data analytics. It provides:

⚡️ Blazing Fast Performance
- 100-200x faster random access reads than Apache Parquet
- 2-10x faster scans with similar compression ratios and write throughput
- Efficient support for wide tables with zero-copy/zero-parse metadata

🔧 Extensible Architecture
- Modeled after Apache DataFusion's extensible approach
- Pluggable encoding system
- Zero-copy compatibility with Apache Arrow
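
To make the idea of a pluggable encoding system concrete, here is a minimal Python sketch. It is not Vortex's API: the `ENCODINGS` registry, `register`, `roundtrip`, and the two toy encodings (run-length and dictionary) are all hypothetical names chosen for illustration. The only point is that new encodings can be added by registering an encode/decode pair, without changing the code that calls them.

```python
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical registry mapping an encoding name to (encode, decode).
# These names are illustrative only and do not come from Vortex.
ENCODINGS: Dict[str, Tuple[Callable[[List[Any]], Any], Callable[[Any], List[Any]]]] = {}


def register(name: str, encode: Callable, decode: Callable) -> None:
    """Add an encoding to the registry under the given name."""
    ENCODINGS[name] = (encode, decode)


def rle_encode(values: List[Any]) -> List[List[Any]]:
    """Run-length encode: collapse consecutive repeats into [value, count]."""
    runs: List[List[Any]] = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1
        else:
            runs.append([v, 1])
    return runs


def rle_decode(runs: List[List[Any]]) -> List[Any]:
    """Expand [value, count] pairs back into the original sequence."""
    return [v for v, n in runs for _ in range(n)]


def dict_encode(values: List[Any]) -> Tuple[List[Any], List[int]]:
    """Dictionary encode: store sorted distinct values plus integer codes."""
    symbols = sorted(set(values))
    index = {v: i for i, v in enumerate(symbols)}
    return symbols, [index[v] for v in values]


def dict_decode(encoded: Tuple[List[Any], List[int]]) -> List[Any]:
    """Look each code up in the symbol table."""
    symbols, codes = encoded
    return [symbols[c] for c in codes]


# Plugging in an encoding is just a registration call.
register("rle", rle_encode, rle_decode)
register("dict", dict_encode, dict_decode)


def roundtrip(name: str, values: List[Any]) -> List[Any]:
    """Encode and decode with the named encoding, looked up at runtime."""
    encode, decode = ENCODINGS[name]
    return decode(encode(values))
```

Calling `roundtrip("rle", data)` or `roundtrip("dict", data)` goes through the same lookup path, so a third encoding would only need its own `register` call.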

Project Information

License

Licensed under the Apache License, Version 2.0

Governance

Vortex is an independent open-source project and is not controlled by any single company. The Vortex Project is a sub-project of the Linux Foundation Projects. Its governance model is documented in CONTRIBUTING.md and is subject to the terms of the Technical Charter.

Contributing

See CONTRIBUTING.md for guidelines.

Acknowledgments 🏆

This project builds upon groundbreaking work from the academic and open-source communities:

Key Research Papers

- BtrBlocks: Efficient columnar compression
- FastLanes: High-performance integer compression
- FSST: Fast random access string compression
- ALP: Adaptive lossless floating-point compression
- Procella: YouTube's unified data system
- Cloud Object Storage Analytics: High-performance analytics
- ClickHouse: Fast analytics for everyone

Open Source Inspiration

Trademarks

Thanks to all contributors who have shared their knowledge and code with the community! 🚀