[metrics] Materialize all metrics tags into top level columns by mattmkim · Pull Request #6237 · quickwit-oss/quickwit
alexanderbianchi added a commit to alexanderbianchi/quickwit that referenced this pull request
… crate) Adds a DataFusion-based query execution layer on top of the wide-schema parquet metrics pipeline from PR quickwit-oss#6237. - New crate `quickwit-datafusion` with pluggable `QuickwitDataSource` trait, `DataFusionSessionBuilder`, `QuickwitSchemaProvider`, and distributed worker session setup via datafusion-distributed WorkerService - `MetricsDataSource` implements `QuickwitDataSource` for OSS parquet splits: metastore-backed split discovery, object-store caching, filter pushdown with CAST unwrapping fix, Substrait `ReadRel` consumption - `DataFusionService` gRPC (ExecuteSql + ExecuteSubstrait streaming) wired into quickwit-serve alongside the existing searcher and OTLP services - Distributed execution: DistributedPhysicalOptimizerRule produces PartitionIsolatorExec tasks (not shuffles) across multiple searcher nodes - Integration tests covering: pruning, aggregation, time range, GROUP BY, distributed tasks, NULL column fill for missing parquet fields, Substrait named-table queries, rollup from file, partial schema projection - dev/ local cluster setup with start-cluster, ingest-metrics, query-metrics Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>