sqlar: add new backend for SQLite Archive files by acme · Pull Request #9265 · rclone/rclone

added 7 commits

March 17, 2026 19:21
Add a new backend that lets rclone read and write SQLite Archive (sqlar)
files. An SQLite Archive stores compressed files inside a SQLite
database and is compatible with the "sqlite3 -A" command-line tool.

https://www.sqlite.org/sqlar.html

The backend supports full read/write operations including copy, move,
delete, and purge. It compresses with zlib to stay compatible with
sqlite3's own sqlar_compress/decompress functions. Reads stream blobs
directly to keep memory usage reasonable, while a 256 MiB concurrency
cap bounds memory during writes. The backend preserves Unix file mode
and mtime as metadata and provides a "vacuum" command to reclaim space
after deletions.

The test suite checks compatibility against archives the sqlite3 CLI
creates.

Uses the ncruces/go-sqlite3 driver, which runs SQLite compiled to
WebAssembly, so the backend builds on all platforms rclone supports
without requiring CGo.

The sqlite compatibility tests require sqlar support from the sqlite3
binary. Older sqlite versions can be present in PATH but do not support
the archive features these tests use.

Check the sqlite version before running and skip the tests unless it is
3.23.0 or newer. This avoids false test failures on systems with an
older sqlite installed.

The sqlar compatibility tests require a sqlite3 binary that supports
archive mode (-A). In CI, some sqlite3 binaries are new enough but
still reject -Ac, -Ax, and -At, causing false failures.

Skip these tests unless sqlite3 is at least 3.23.0 and sqlite3 -help
shows archive mode support. This avoids spurious failures when PATH
contains an incompatible sqlite3 binary.

Previously each NewFs call opened a separate *sql.DB connection pool.
The fstests integration framework creates multiple Fs instances for the
same archive file (for subdirectory listing, purge, isFile tests, etc.)
without closing the extras. Since ncruces/go-sqlite3 instantiates a
full WASM module per connection, multiple pools with MaxOpenConns(4)
could spawn dozens of WASM instances, exhausting memory on Windows CI.

Introduce a reference-counted sharedDB cache keyed by file path so that
all Fs instances for the same archive reuse one *sql.DB. The underlying
database is only closed when the last reference calls Shutdown.

The ncruces/go-sqlite3 driver runs SQLite compiled to WebAssembly via
wazero. Each database/sql pooled connection spawns a separate WASM
instance whose linear memory is a Go []byte that can grow up to 4 GB.
With MaxOpenConns(4), peak memory could reach 16 GB, causing "fatal
error: out of memory" on Windows CI runners with ~7 GB RAM.

Two changes to reduce memory:

Remove the page_size(512) pragma. The tiny page size made SQLite use
8x more pages than the default 4096-byte size, growing each WASM
instance's heap much faster.

Reduce MaxOpenConns from 4 to 2 and set MaxIdleConns to 1. This halves
the number of concurrent WASM instances and releases idle ones promptly.
Two connections still support one concurrent blob read alongside one
query or write, which is sufficient given the Fs.mu write serialization.

The sqlar backend uses ncruces/go-sqlite3, which runs SQLite compiled to
WebAssembly via wazero. By default wazero JIT-compiles the WASM module
to native code, which has higher peak memory usage. Combined with
go test ./... running many large test binaries in parallel, this was
enough to cause "fatal error: out of memory" on Windows GitHub Actions
runners (7 GB RAM), with VirtualAlloc failing at ~11 GB.

Add a TestMain that configures wazero to use the interpreter engine
instead of the JIT compiler for tests. The interpreter has lower peak
memory and is fast enough for the small files used in the integration
suite (1.6s vs 1.2s). Production use is unaffected and continues to
use the JIT compiler.

Also revert MaxOpenConns from 2 back to 4. Reducing the connection
pool did not help because the OOM was from compilation overhead, not
from the WASM instance memory (which is capped at 256 MB each).
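The TestMain arrangement described above might look like the sketch below. This is an assumption about the shape of the fix, not the PR's actual code: the identifiers sqlite3.RuntimeConfig and wazero.NewRuntimeConfigInterpreter reflect those libraries' public APIs as I understand them and should be checked against the real diff.

```go
// Test-only configuration sketch (hypothetical, not rclone's exact code).
package sqlar

import (
	"os"
	"testing"

	"github.com/ncruces/go-sqlite3"
	"github.com/tetratelabs/wazero"
)

func TestMain(m *testing.M) {
	// Use wazero's interpreter engine instead of its compiling (JIT)
	// engine while tests run: lower peak memory at a small speed cost
	// (1.6s vs 1.2s for this suite). Production builds never execute
	// this file and keep the default compiling runtime.
	sqlite3.RuntimeConfig = wazero.NewRuntimeConfigInterpreter()
	os.Exit(m.Run())
}
```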