Add MySQL metastore backend by saliagadotcom · Pull Request #6199 · quickwit-oss/quickwit

Description

Adds MySQL as a new production-grade metastore backend for Quickwit, enabling deployments that prefer MySQL over PostgreSQL for storing index metadata, splits, shards, delete tasks, and index templates.

The existing metastore has two backends: file-backed (dev) and PostgreSQL (production). This PR introduces a third: MySQL, following the same architecture and feature surface as the PostgreSQL backend.

MySQL metastore modules

The MySQL metastore lives in quickwit-metastore/src/metastore/mysql/ and mirrors the PostgreSQL metastore module-for-module:

| Module | Purpose |
| --- | --- |
| metastore.rs | Full MetastoreService trait implementation (~2,158 lines) covering all 30+ metastore operations: index CRUD, split staging/publishing/deletion, shard open/acquire/prune, delete tasks, and index templates. |
| model.rs | Row types (MysqlIndex, MysqlSplit, MysqlShard, MysqlDeleteTask, MysqlIndexTemplate) for SQLx row deserialization. |
| factory.rs | MysqlMetastoreFactory: URI-based factory that creates a MysqlMetastore from a mysql:// connection string. |
| pool.rs | TrackedPool<MySql>: connection pool wrapper with metrics tracking for active/idle connections. |
| migrator.rs | Schema migration runner with configurable locking (skip via QW_MYSQL_SKIP_MIGRATION_LOCKING). |
| error.rs | SQLx error → MetastoreError conversion, mapping MySQL error codes to domain errors. |
| tags.rs | Tag filter → SQL WHERE clause generation using sea_query. |
| utils.rs | Query building helpers: split filters, maturity timestamps, column selection, ordering. |
| split_stream.rs | Streaming split results in chunks (STREAM_SPLITS_CHUNK_SIZE). |
| metrics.rs | MySQL-specific metastore operation metrics. |
| auth.rs | New (not in Postgres): AWS IAM authentication token generation for RDS MySQL. |
| queries/ | Raw SQL files for shard and index template operations using MySQL syntax (ON DUPLICATE KEY UPDATE). |
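
To illustrate the kind of mapping error.rs performs, here is a minimal, dependency-free sketch. The `DomainError` enum and `classify_mysql_error` function are hypothetical names made up for this example; the PR itself converts `sqlx::Error` into `MetastoreError`, and the exact set of codes it handles is not listed in this description. The numeric values are standard MySQL server error numbers.

```rust
/// Hypothetical domain errors for illustration only.
/// The PR's real target type is `MetastoreError`.
#[derive(Debug, PartialEq, Eq)]
pub enum DomainError {
    AlreadyExists,
    NotFound,
    Retryable,
    Internal(u16),
}

/// Duplicate entry for a unique or primary key.
pub const ER_DUP_ENTRY: u16 = 1062;
/// Foreign key constraint failed: the referenced parent row does not exist.
pub const ER_NO_REFERENCED_ROW_2: u16 = 1452;
/// Deadlock found when trying to get a lock; the transaction can be retried.
pub const ER_LOCK_DEADLOCK: u16 = 1213;

/// Maps a MySQL server error number to a domain-level error.
/// Which domain error each code should map to is an assumption of this sketch.
pub fn classify_mysql_error(code: u16) -> DomainError {
    match code {
        ER_DUP_ENTRY => DomainError::AlreadyExists,
        ER_NO_REFERENCED_ROW_2 => DomainError::NotFound,
        ER_LOCK_DEADLOCK => DomainError::Retryable,
        other => DomainError::Internal(other),
    }
}
```

Unknown codes fall through to `Internal` with the original number preserved, so nothing is silently swallowed.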

Protocol & type layer changes (quickwit-common, quickwit-proto)

| Change | Why |
| --- | --- |
| Added Protocol::MySQL variant with mysql:// URI parsing | Enables mysql://user:pass@host:3306/db metastore URIs with credential redaction in logs. |
| is_database() now covers Protocol::MySQL | Prevents invalid operations like join() and parent() on database URIs. |
| Added sqlx::Type<MySql> + sqlx::Encode<MySql> for IndexUid, DocMappingUid, ShardId | SQLx requires per-database-driver trait impls; these cannot be shared with the Postgres impls (see rationale below). |
| Broadened #[cfg(feature = "postgres")] → #[cfg(any(feature = "postgres", feature = "mysql"))] | Shared code such as sea_query::Value conversions and TryFrom<String> impls is needed by both backends. |
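
The credential redaction mentioned above can be sketched with the standard library alone. This is not the code from quickwit-common: `redact_password` is a hypothetical helper, and the `***` placeholder is an assumption about the output format. It also assumes that any `/` or `@` inside the password is percent-encoded, as URI syntax requires.

```rust
/// Replaces the password in a `scheme://user:password@host/path` URI with `***`.
/// URIs without a scheme, user info, or password are returned unchanged.
pub fn redact_password(uri: &str) -> String {
    // Split off the scheme; anything without "://" is left alone.
    let Some((scheme, rest)) = uri.split_once("://") else {
        return uri.to_string();
    };
    // The authority component ends at the first '/', if there is one.
    let authority_end = rest.find('/').unwrap_or(rest.len());
    let (authority, path) = rest.split_at(authority_end);
    // User info is everything before the last '@' in the authority.
    let Some((userinfo, host)) = authority.rsplit_once('@') else {
        return uri.to_string();
    };
    // The password is whatever follows the first ':' in the user info.
    match userinfo.split_once(':') {
        Some((user, _password)) => format!("{scheme}://{user}:***@{host}{path}"),
        None => uri.to_string(),
    }
}
```

Applied to the URI shape from the table, `redact_password("mysql://user:pass@host:3306/db")` yields `mysql://user:***@host:3306/db`, which is safe to write to logs.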

Configuration (quickwit-config)

| Change | Why |
| --- | --- |
| MetastoreBackend::MySQL enum variant | Enables metastore_backend: mysql in config files. |
| MetastoreConfig::MySQL(MysqlMetastoreConfig) | Holds connection pool settings (min_connections, max_connections, acquire_timeout, idle_timeout, max_lifetime). |
| MysqlAuthMode enum (Password, AwsIam) | Supports both password-based and IAM token-based authentication for AWS RDS. |
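
A rough sketch of how these settings might fit together is shown below. Only the field and variant names come from the table above; the field types, the `auth_mode` field, and the `validate` method are assumptions added for illustration and may not match the PR's actual definitions in quickwit-config.

```rust
use std::time::Duration;

/// Authentication modes named in the PR description.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MysqlAuthMode {
    Password,
    AwsIam,
}

/// Sketch of the pool settings; all field types are assumptions.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct MysqlMetastoreConfig {
    pub min_connections: usize,
    pub max_connections: usize,
    pub acquire_timeout: Duration,
    pub idle_timeout: Option<Duration>,
    pub max_lifetime: Option<Duration>,
    /// Hypothetical field: where the auth mode actually lives is not stated.
    pub auth_mode: MysqlAuthMode,
}

impl MysqlMetastoreConfig {
    /// Hypothetical sanity check: the pool needs at least one connection,
    /// and the minimum must not exceed the maximum.
    pub fn validate(&self) -> Result<(), String> {
        if self.max_connections == 0 {
            return Err("max_connections must be greater than 0".to_string());
        }
        if self.min_connections > self.max_connections {
            return Err("min_connections must not exceed max_connections".to_string());
        }
        Ok(())
    }
}
```

Checking the pool bounds up front means a misconfiguration surfaces at startup rather than as a connection acquire timeout later.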

Schema (migrations/mysql/)

1_initial-schema.up.sql creates six tables adapted for MySQL's DDL dialect:

  • indexes — index metadata with index_metadata_json column (TEXT)
  • splits — split lifecycle management with BIGINT timestamps, split_metadata_json (TEXT), and JSON tag columns
  • shards — shard state tracking with publish_position_inclusive (VARCHAR(255))
  • delete_tasks — delete query queue with delete_query_json (TEXT)
  • index_templates — index template storage with index_template_json (TEXT) and priority-based matching
  • kv — key-value metadata storage

Resolver & feature gating

  • MysqlMetastoreFactory registered in MetastoreResolver behind #[cfg(feature = "mysql")]
  • UnsupportedMetastore fallback when compiled without the feature (matches Postgres pattern)
  • mysql feature added to quickwit-metastore, quickwit-proto, and quickwit-cli Cargo.toml
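
The feature-gating pattern described above can be sketched as two definitions of the same function, selected at compile time. `resolve_mysql` and `ResolveError` are hypothetical names for this example; the PR registers a MysqlMetastoreFactory in MetastoreResolver and falls back to UnsupportedMetastore, whose exact signatures are not shown here.

```rust
/// Hypothetical error type standing in for the "unsupported backend" fallback.
#[derive(Debug, PartialEq, Eq)]
pub enum ResolveError {
    UnsupportedBackend(&'static str),
}

/// Compiled only when the crate is built with `--features mysql`.
/// A real factory would open a connection pool; this sketch returns a label.
#[cfg(feature = "mysql")]
pub fn resolve_mysql(uri: &str) -> Result<String, ResolveError> {
    Ok(format!("mysql metastore for {uri}"))
}

/// Fallback compiled when the `mysql` feature is disabled: the call still
/// type-checks, but reports that the backend is unavailable in this build.
#[cfg(not(feature = "mysql"))]
pub fn resolve_mysql(_uri: &str) -> Result<String, ResolveError> {
    Err(ResolveError::UnsupportedBackend("mysql"))
}
```

Because both variants share one signature, callers need no `cfg` attributes of their own, and a binary built without the feature fails with a clear error at resolution time instead of failing to compile.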

Docker & testing infrastructure

  • MySQL 8.0 service added to docker-compose.yml with health checks
  • .env.example updated with QW_MYSQL_* connection variables

How was this tested?

  • MySQL 8.0 Docker service with health checks for local testing
  • Unit tests added for MySQL-specific functionality
  • Migration up/down scripts verified against fresh MySQL instances
  • URI parsing tests for mysql:// protocol including credential redaction
  • Feature-gated compilation verified with and without the mysql feature
  • Currently running in our K8s env connected to RDS

Note: I couldn't find any contribution rules about LLM usage, but be aware that this PR involved substantial LLM usage.