An experimental embedded key-value database written in Zig, built around page-based storage, sorted leaf-page indexing, WAL record logging, and a small transaction API.
Table of Contents
- Project Status
- Features
- Architecture
- Download
- Quick Start
- API Usage
- CLI Tool
- Building and Testing
- Project Structure
- Configuration
- Performance Considerations
- License
Project Status
KVDB is currently best viewed as an early-stage embedded storage engine.
What works today:
- persistent page-based storage with CRUD operations
- multi-page B-tree insert/search/iteration support
- WAL record writing with CRC32 checksums
- explicit transaction APIs plus compaction and CLI tooling
- basic automated tests and CI
Important current limitations:
- multi-page deletes work only while each touched non-root leaf keeps at least one entry; borrow/merge/root-shrink are not implemented yet
- automatic startup recovery replays committed WAL records during
Database.open - rollback is implemented by reloading the database file, not by targeted undo
- compression is not implemented
Features
Implemented Today
- Page-Based Storage: 4KB fixed-size pages optimized for simple file I/O
- Sorted Leaf-Page Indexing: Keys are kept sorted inside the current root leaf page and searched with binary search
- Basic Transaction API: One active transaction per database with explicit
commit()/abort()methods - WAL Record Logging: Insert/delete/commit/abort records include CRC32 checksums
- In-Memory Page Cache: Reduces repeated disk reads for loaded pages
- Database Compaction: Rebuilds the database into a fresh file, validates logical contents, then atomically replaces the original
- Zero External Dependencies: Pure Zig implementation with no external libraries required
Data Limits
- Arbitrary Binary Data: Store any byte sequence as keys and values
- Maximum Key Size: 1KB per key
- Maximum Value Size: 2KB per value (half page size)
- 64-bit Page Identifiers: On-disk structures use 64-bit page IDs
Current Constraints
- Multi-Page Inserts: Root and internal-node split propagation now allow the tree to grow beyond one page
- Multi-Page Deletes: Deletes work while each touched non-root leaf keeps at least one entry; deletes that would require borrow/merge/root-shrink currently return
NodeEmpty - Leaf Update Repacking: Repeated overwrites now repack the target leaf so updates do not keep leaking dead payload bytes
- Recovery Wiring:
Database.openreplays committed WAL records and clears the WAL after successful recovery - Simplified Rollback:
abort()reloads the database instead of doing targeted undo
Architecture
Storage Architecture
+---------------------------------------------------+
| Database File |
+---------------------------------------------------+
| Page 0 (Metadata) | Page 1+ (B-tree Pages) |
| - Magic number | - Node headers |
| - Version | - Key/value data |
| - Root page ID | - Child page references |
| - Reserved fields | |
+---------------------------------------------------+
The metadata format now tracks both the freelist head and the highest page ID grown from the file, so freed pages can be reused before the database extends on disk.
B-Tree Structure
The on-disk page format is now wired for multi-page growth during insert, while delete remains intentionally staged until rebalancing lands.
- Leaf Nodes: Store key-value pairs directly
- Internal Nodes: Multi-page insert/search/iteration now route through separator keys and child pointers
- Current Root Capacity: The original root still starts as a leaf page, then promotes in place to an internal root on overflow
- Sorted Keys: Keys are maintained in sorted order for binary search inside each node
- Space Management: Key/value data grows upward from the end of the page
- Delete Limitation: Deletes that would empty a non-root leaf are rejected with
NodeEmptyuntil borrow/merge/root-shrink support is implemented
Each B-tree node layout:
[Node Header (4 bytes)]
[KeyInfo array (64 * 8 bytes)]
[Key/Value data (grows upward from end)]
Write-Ahead Log (WAL)
The WAL currently records write operations and stores enough information for validation and future recovery work:
[WAL Record Header (11 bytes)]
- checksum: u32 (CRC32 of record data)
- record_type: u8 (insert/delete/commit/abort)
- key_len: u16 (length of key)
- value_len: u32 (length of value)
[Key data (variable)]
[Value data (variable, optional)]
Record Types:
insert: Log a key-value insertiondelete: Log a key deletioncommit: Mark transaction as committedabort: Mark transaction as rolled back
Automatic startup recovery now replays committed WAL records during Database.open; uncommitted and aborted WAL entries are discarded. A truncated tail header is ignored as an incomplete final record, while checksum mismatches and partial record payloads fail recovery.
Page Cache
The pager maintains an in-memory cache of recently accessed pages:
- Pages are loaded from disk on first access
- Modified pages marked as "dirty" and flushed on commit
- Cache reduces disk I/O for frequently accessed data
Download
Pre-compiled binaries are available on the Releases page.
Pre-built Binaries
| Platform | Architecture | Download |
|---|---|---|
| Linux | x86_64 | kvdb-latest-x86_64-linux.tar.gz |
| Linux | ARM64 | kvdb-latest-aarch64-linux.tar.gz |
| macOS | x86_64 | kvdb-latest-x86_64-macos.tar.gz |
| macOS | Apple Silicon | kvdb-latest-aarch64-macos.tar.gz |
| Windows | x86_64 | kvdb-latest-x86_64-windows.zip |
Quick Install
Linux/macOS:
# Download latest release curl -LO https://github.com/lispking/kvdb/releases/latest/download/kvdb-latest-x86_64-linux.tar.gz # Extract tar xzvf kvdb-latest-x86_64-linux.tar.gz cd kvdb-latest-x86_64-linux # Run ./kvdb-cli --help
Windows (PowerShell):
# Download latest release Invoke-WebRequest -Uri "https://github.com/lispking/kvdb/releases/latest/download/kvdb-latest-x86_64-windows.zip" -OutFile "kvdb.zip" # Extract Expand-Archive -Path "kvdb.zip" -DestinationPath "." cd kvdb-latest-x86_64-windows # Run .\kvdb-cli.exe --help
Quick Start
Prerequisites
If building from source:
- Zig 0.15.2 or later
- Unix-like system (macOS, Linux) or Windows
Installation
# Clone the repository git clone https://github.com/lispking/kvdb.git cd kvdb # Install git hooks (optional, for development) zig build install-hooks # Build the library and CLI tool zig build
Basic Usage
const std = @import("std"); const kvdb = @import("kvdb"); pub fn main() !void { var gpa = std.heap.DebugAllocator(.{}).init; defer _ = gpa.deinit(); const allocator = gpa.allocator(); // Open database (creates if not exists) var db = try kvdb.open(allocator, "mydb.db"); defer db.close(); // Store data try db.put("name", "Alice"); try db.put("age", "30"); // Retrieve data if (try db.get("name")) |value| { defer allocator.free(value); std.debug.print("name = {s}\n", .{value}); } // Delete data try db.delete("age"); }
API Usage
Opening and Closing
// Open with default options var db = try kvdb.open(allocator, "path/to/db"); defer db.close(); // Open with custom options var db = try Database.open(allocator, "path/to/db", .{ .enable_wal = true, .fsync_policy = .always, });
Basic Operations
// Insert or update a key-value pair try db.put("key", "value"); // Retrieve a value (returns null if not found) if (try db.get("key")) |value| { defer allocator.free(value); // Caller must free // use value... } // Check if key exists const exists = try db.contains("key"); // Delete a key try db.delete("key");
Current Write Behavior:
put()anddelete()do not requirebeginTransaction()- writes update the current database handle immediately in memory
- durability happens later when dirty pages are flushed, such as during
commit()orclose() - reads through the same handle observe the updated in-memory state immediately
Transactions
// Begin transaction const txn = try db.beginTransaction(); try { // Perform operations try db.put("key1", "value1"); try db.put("key2", "value2"); // Commit changes try txn.commit(); } catch |_| { // Abort on error try txn.abort(); }
Current Transaction Semantics:
- Only one active transaction at a time per database handle
beginTransaction()does not create snapshots or isolation; it only establishes an explicit commit/abort boundary for the current handleput()anddelete()still modify the current handle's in-memory pages immediatelycommit()appends a commit marker, flushes the handle's dirty pages, and clears the WALabort()appends an abort marker, clears the WAL, and reloads the database file, discarding the handle's unflushed in-memory changes- startup recovery treats WAL records as logical batches:
insert/deleterecords stay pending untilcommit,abortdiscards the pending batch, and trailing incomplete batches are ignored - automatic startup recovery replays committed WAL records during
Database.open - a truncated final WAL header is treated as an incomplete tail, while checksum mismatches and partial record payloads fail recovery
- multi-page deletes succeed only when no touched non-root leaf becomes empty; deletes that would require borrow/merge/root-shrink currently fail with
NodeEmpty
Recommendation: For predictable behavior in the current engine, group related writes inside an explicit transaction and avoid mixing earlier unflushed writes with a later abort().
Iteration
var iter = try db.iterator(); var count: usize = 0; while (iter.next()) |entry| { std.debug.print("{s} = {s}\n", .{ entry.key, entry.value }); count += 1; }
Note: Current implementation iterates in sorted order across all reachable leaf pages.
Statistics
const stats = db.stats(); std.debug.print("Pages: {d}\n", .{stats.page_count}); std.debug.print("Page size: {d} bytes\n", .{stats.page_size}); std.debug.print("Total size: {d} bytes\n", .{stats.db_size});
CLI Tool
The kvdb-cli tool provides command-line access to the database.
Usage
kvdb-cli <database-file> <command> [args...]
Commands
| Command | Description | Example |
|---|---|---|
get <key> |
Get value by key | kvdb-cli my.db get name |
put <key> <value> |
Set key-value pair | kvdb-cli my.db put name Bob |
delete <key> |
Delete a key | kvdb-cli my.db delete name |
list |
List all entries | kvdb-cli my.db list |
stats |
Show database statistics | kvdb-cli my.db stats |
inspect |
Show metadata and structural tree summary | kvdb-cli my.db inspect |
export <file> |
Stream all logical entries into a portable binary dump | kvdb-cli my.db export backup.kvdbx |
import <file> |
Load entries from a portable binary dump in one transaction | kvdb-cli my.db import backup.kvdbx |
compact |
Rewrite live data into a compacted database file | kvdb-cli my.db compact |
verify |
Validate metadata, tree ordering, freelist, and WAL records | kvdb-cli my.db verify |
Examples
# Create database and insert data kvdb-cli my.db put language Zig kvdb-cli my.db put version "0.15.0" # Query data kvdb-cli my.db get language # Output: Zig # List all entries kvdb-cli my.db list # Output: # language = Zig # version = 0.15.0 # # Total: 2 entries # Check statistics kvdb-cli my.db stats # Output: # Database Statistics: # Pages: 2 # Page Size: 4096 bytes # Database Size: 8192 bytes (0.01 MB) # Verify on-disk structure kvdb-cli my.db verify # Output: # Verification OK # Tree pages checked: 1 # Entries checked: 2 # WAL records checked: 0 # Inspect metadata and tree shape kvdb-cli my.db inspect # Output: # Database # Pages: 2 # Page Size: 4096 bytes # Database Size: 8192 bytes (0.01 MB) # Metadata # Root Page ID: 1 # Freelist Head: 18446744073709551615 # Freelist Pages: 0 # Last Page ID: 1 # B-Tree # Height: 1 # Nodes: 1 # Leaf Nodes: 1 # Internal Nodes: 0 # Entries: 2 # Export logical contents to a binary-safe dump file kvdb-cli my.db export backup.kvdbx # Output: # Exported 2 entries # Import the dump into a fresh database kvdb-cli restored.db import backup.kvdbx # Output: # Imported 2 entries
Building and Testing
Build Commands
# Build library and CLI tool zig build # Build in release mode zig build -Doptimize=ReleaseFast # Build specific target zig build -Dtarget=x86_64-linux-gnu
Testing
# Run all tests zig build test # Run with verbose output zig test src/kvdb.zig
Benchmark Suite
# Run the benchmark suite zig build bench # Build the benchmark binary without running it zig build ./zig-out/bin/benchmark
The current benchmark binary reports these deterministic workloads under both fsync_policy = .always and fsync_policy = .batch:
- sequential inserts
- random inserts
- point lookups
- scans
- updates
- deletes
- compaction
Each run uses fixed operation counts and a fixed PRNG seed so results are comparable across changes. Comparing the paired policy rows shows the durability/latency tradeoff without changing workload shape.
FFI Example
A minimal C example lives at examples/c_api_example.c and uses the exported C API directly.
A minimal Python ctypes example lives at examples/python_ctypes_example.py.
# Build the static library zig build # Compile the C example against the installed archive zig cc examples/c_api_example.c zig-out/lib/libkvdb.a -o c_api_example # Run it ./c_api_example # Run the Python ctypes example python3 examples/python_ctypes_example.py
The example calls kvdb_open, kvdb_put, kvdb_get, kvdb_free, kvdb_close, and kvdb_status_code.
When kvdb_get returns a non-null pointer, the caller owns that buffer and must
release it with kvdb_free.
Mutating C API calls return stable numeric status codes instead of a generic -1:
KVDB_STATUS_OK= 0KVDB_STATUS_INVALID_ARGUMENT= 1KVDB_STATUS_NOT_FOUND= 2KVDB_STATUS_TRANSACTION_CONFLICT= 3KVDB_STATUS_STORAGE_ERROR= 4KVDB_STATUS_WAL_ERROR= 5KVDB_STATUS_INTERNAL_ERROR= 255
Use kvdb_status_code(...) from C to compare against the exported enum values without hard-coding numbers in callers.
Storage Test Matrix
| Area | Core invariants | Current coverage | Missing high-risk coverage |
|---|---|---|---|
constants.zig |
Metadata layout stays stable and fits within one page | constants: metadata layout |
Format-compatibility checks beyond struct size |
pager.zig |
Page allocation, freelist reuse, and persisted page count survive reopen | pager: basic operations |
Dirty-page edge cases, invalid page handling, deeper reclaim integration |
btree.zig |
Root leaf inserts, lookups, deletes, and sorted order remain correct | btree: basic operations |
Duplicate update/delete flows, NodeFull, key/value size boundaries, multi-page search |
wal.zig |
WAL records append and decode correctly with record-type boundaries | wal: basic operations, wal: checksum mismatch is corruption, wal: truncated value payload is corruption |
Invalid record type handling, broader replay-path coverage |
kvdb.zig |
End-to-end CRUD, reopen persistence, compaction validation, and explicit commit path stay correct | kvdb: basic operations, kvdb: transaction commit, kvdb: replay committed wal on open, kvdb: ignore uncommitted wal on open, kvdb: ignore aborted wal on open, kvdb: replay delete on open, kvdb: replay mixed wal batches on open, kvdb: replay is idempotent across reopen, kvdb: ignore truncated wal tail on open, kvdb: checksum corruption fails recovery on open, kvdb: compact preserves live key-value pairs |
Abort reload behavior, broader compaction correctness, WAL-disabled mode |
Storage Test Checklist
Use this checklist when changing storage code:
- verify restart persistence for any path that changes on-disk state
- cover duplicate insert/update/delete flows when touching write logic
- cover key/value boundary sizes and invalid arguments when touching validation
- cover
NodeFullbehavior when touching B-tree insertion or page layout - cover checksum failure and truncated-tail handling when touching WAL parsing or recovery
- compare live data before/after compaction or reopen when touching file replacement, freelist reuse, or reload paths
- name tests with a subsystem prefix such as
test "kvdb: ..."ortest "wal: ..."
Development Setup
# Install pre-commit hooks zig build install-hooks # Format code zig fmt src/ examples/ # Check formatting (read-only) zig fmt --check src/ examples/
Pre-commit Hook
The pre-commit hook automatically:
- Checks code formatting
- Builds the project
- Runs all tests
If any check fails, the commit is blocked. To bypass (not recommended):
git commit --no-verify -m "message"Creating Releases
To create a new release with pre-built binaries:
# Tag a new version git tag -a v1.0.0 -m "Release version 1.0.0" # Push the tag (triggers release workflow) git push origin v1.0.0
The Release workflow will automatically:
- Build binaries for all supported platforms (Linux x86_64/ARM64, macOS x86_64/Apple Silicon, Windows x86_64)
- Create a GitHub Release with downloadable archives
- Generate release notes with installation instructions
Release Artifacts:
| File | Description |
|---|---|
kvdb-vX.Y.Z-x86_64-linux.tar.gz |
Linux x86_64 binary |
kvdb-vX.Y.Z-aarch64-linux.tar.gz |
Linux ARM64 binary |
kvdb-vX.Y.Z-x86_64-macos.tar.gz |
macOS Intel binary |
kvdb-vX.Y.Z-aarch64-macos.tar.gz |
macOS Apple Silicon binary |
kvdb-vX.Y.Z-x86_64-windows.zip |
Windows x86_64 binary |
Project Structure
kvdb/
├── build.zig # Build configuration
├── build.zig.zon # Package manifest
├── README.md # This file
├── .gitignore # Git ignore patterns
├── scripts/
│ └── pre-commit # Git pre-commit hook
├── src/
│ ├── kvdb.zig # Main database API (public interface)
│ ├── constants.zig # Error codes, constants, data structures
│ ├── pager.zig # Page management and I/O
│ ├── btree.zig # B-tree index implementation
│ ├── wal.zig # Write-ahead logging
│ └── cli.zig # Command-line interface
└── examples/
└── basic.zig # Usage example
Module Descriptions
| Module | Description | Key Types |
|---|---|---|
kvdb.zig |
Public API and database management | Database, Transaction, Options |
constants.zig |
Constants and error definitions | Error, PageId, MetaData, WalRecordType |
pager.zig |
Page-level storage operations | Pager, Page, CacheEntry |
btree.zig |
B-tree index operations | BTree, BTreeNode, Iterator |
wal.zig |
Write-ahead logging for durability | Wal, Record, Iterator |
cli.zig |
Command-line tool | CLI argument parsing and commands |
Configuration
Database Options
pub const Options = struct { /// Enable Write-Ahead Logging record writing and startup recovery. enable_wal: bool = true, /// Choose whether commit/checkpoint paths force data to stable storage. fsync_policy: FsyncPolicy = .always, };
enable_wal controls whether KVDB writes WAL records and performs startup replay at all.
fsync_policy controls how aggressively KVDB asks the OS to persist durability boundaries:
.always: callfile.sync()at WAL commit/checkpoint and page-flush boundaries for the strongest current durability behavior.batch: keep the same write ordering but skip explicit sync calls, which can improve throughput while allowing recent writes to remain in OS buffers after a crash or power loss
Note: WAL changes replay semantics only when enabled; fsync_policy changes durability cost, not logical recovery rules.
Constants
| Constant | Value | Description |
|---|---|---|
PAGE_SIZE |
4096 | Size of each database page |
MAX_KEY_SIZE |
1024 | Maximum key size in bytes |
MAX_VALUE_SIZE |
2048 | Maximum value size in bytes |
MAX_KEYS |
64 | Maximum entries per B-tree node |
DB_VERSION |
1 | Database format version |
MAGIC |
0x4B5644425F5A4947 | File format identifier ("KVDB_ZIG") |
Performance Considerations
Optimization Tips
- Batch Operations: Group multiple puts in a single transaction
- Reuse Connections: Keep database open for multiple operations
- Page Size: The current engine uses fixed 4KB pages internally
- Disable WAL: Only disable WAL if you intentionally want to skip WAL record logging
- Tune fsync policy: Use
.alwaysfor the strongest current durability behavior and.batchwhen benchmark throughput matters more than surviving the most recent buffered writes
Known Limitations
- Inserts and deletes currently operate on a single root leaf page
- Iteration is limited to the current root page
- Automatic WAL replay runs during
Database.open - Rollback is implemented via full reload, not targeted undo
- No compression support yet
- No encryption support
- Single-writer model (no concurrent transactions)
Error Handling
The database uses a comprehensive error set:
pub const Error = error{ // Storage errors DiskFull, CorruptedData, InvalidPageId, PageNotFound, PageOverflow, // Transaction errors TransactionAlreadyActive, NoActiveTransaction, TransactionConflict, // B-tree errors KeyNotFound, KeyAlreadyExists, NodeFull, NodeEmpty, // WAL errors WalCorrupted, WalReplayFailed, // I/O errors IoError, InvalidArgument, DatabaseClosed, };
All operations return Error!T and should be handled with try or catch.
License
MIT License - See LICENSE file for details
Contributing
Contributions are welcome! Please ensure:
- Code follows Zig style guidelines (
zig fmt) - All tests pass (
zig build test) - New features include tests
- Documentation is updated
Acknowledgments
- Inspired by SQLite's architecture and B-tree implementation
- Built with Zig 0.15.2
- Thanks to the Zig community for feedback and support