BenchMark For TsFile multi-language
This repo compares TsFile and Parquet using the same config and metrics (prepare/write/read time, throughput, file size, memory).
Project structure
benchmark_core/
├── conf.json # Shared config (tablet_num, tag1_num, tag2_num, timestamp_per_tag, field_type_vector)
├── tsfile/ # TsFile benchmarks
│ ├── java/
│ ├── python/
│ └── cpp/
└── parquet/ # Parquet benchmarks
├── java/
├── python/
└── cpp/
Output
Results under /result (or result/ when run locally):
| Format | Language | Summary JSON | Memory CSV |
|---|---|---|---|
| TsFile | Java | results_java.json |
memory_usage_java.csv |
| TsFile | Python | results_python.json |
memory_usage_python.csv |
| TsFile | C++ | (console) | memory_usage_cpp.csv |
| Parquet | Java | results_parquet_java.json |
memory_usage_parquet_java.csv |
| Parquet | Python | results_parquet_python.json |
memory_usage_parquet_python.csv |
| Parquet | C++ | results_parquet_cpp.json |
memory_usage_parquet_cpp.csv |
Each JSON contains: tsfile_size (KB), prepare_time, writing_time, writing_speed, reading_time, reading_speed.
Profile output files (for Python and C++): Flamegraph-related profiling data will be generated to assist with performance analysis.
Requirements
- Python:
tqdm,psutil,pyarrow(for Parquet Python) - Java: Maven (for TsFile and Parquet Java)
- C++: CMake, TsFile C++ (for TsFile C++), Arrow C++ / Parquet C++ (for Parquet C++)
Benchmark Execution Workflow
-
Clone TsFile Source Code
Clone the latest TsFile repository from GitHub to your local machine.
-
Start Docker with Volume Mounts
Launch a Docker container and mount the following directories into the container:
- The cloned TsFile source code
- The TsFile-benchmark repository itself
- A result directory, which should be mounted to /result inside the container to store output files
-
Build and Run Benchmarks in Docker The Docker container will automatically:
- Compile and install the TsFile project
- Run performance benchmarks
- Collect profiling data using perf. Flamegraph results (for C++ and Python) will be generated and stored under the /result directory.