# Benchmarks

Benchmarked on an Apple M1, single-threaded, with an auto-scaled buffer pool.

## Core Operations

| Operation | Latency | Throughput | Target | Status |
|---|---|---|---|---|
| Node lookup | 0.13 us | 7.9M ops/sec | < 1 us | PASS |
| Node creation | 0.65 us | 1.5M ops/sec | | |
| Edge traversal | 9 us | 111K ops/sec | | |
| Full-text search (100 docs) | 19 us | 53K ops/sec | | |
| 10-NN vector search (1M vectors) | 0.83 ms | 1.2K ops/sec | < 10 ms @ 1M | PASS |
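The throughput column is effectively the reciprocal of mean latency; for instance, 9 us per edge traversal works out to roughly 111K ops/sec. A one-line sanity check (illustrative Python, not part of the benchmark harness):

```python
latency_s = 9e-6                      # 9 us per edge traversal
throughput = 1.0 / latency_s          # sustained single-threaded rate
print(f"{throughput:,.0f} ops/sec")   # 111,111 ops/sec
```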

## Vector Search (HNSW) at Scale

128-dimensional cosine vectors; `M=16`, `ef_construction=200`, `ef_search=64`, `k=10`.

| Scale | Mean Latency | P99 Latency | Recall@10 | Memory |
|---|---|---|---|---|
| 1,000 | 65 us | 70 us | 100% | 1 MB |
| 10,000 | 174 us | 695 us | 99% | 10 MB |
| 100,000 | 438 us | 1.2 ms | 99% | 101 MB |
| 1,000,000 | 832 us | 1.8 ms | 100% | 1,040 MB |

Search latency scales sub-linearly (O(log N)) while maintaining 99-100% recall@10. The index uses heuristic neighbor selection (Algorithm 4 in the HNSW paper) for diverse graph connectivity, connection page packing for an ~4.5x memory reduction, and pre-normalized dot products for fast cosine distance.
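The cosine fast path works because distance computation can be split: if every stored vector is L2-normalized once at insert time (and the query once per search), cosine distance at query time reduces to one dot product. A minimal NumPy sketch (variable names are mine, not the codebase's):

```python
import numpy as np

def normalize(v):
    """L2-normalize a vector; done once at insert time in the index."""
    return v / np.linalg.norm(v)

rng = np.random.default_rng(0)
stored = normalize(rng.standard_normal(128))  # normalized once, at insert
query = normalize(rng.standard_normal(128))   # normalized once, per query

# Full cosine distance vs. the pre-normalized fast path:
full = 1.0 - np.dot(stored, query) / (np.linalg.norm(stored) * np.linalg.norm(query))
fast = 1.0 - np.dot(stored, query)            # norms are already 1

assert abs(full - fast) < 1e-9
```

This trades two norm computations and a division per distance call for a single normalization at write time, which pays off since each search evaluates many distances.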

### ef_search Sensitivity (1M vectors)

| ef_search | Mean Latency | Recall@10 |
|---|---|---|
| 16 | 506 us | 57% |
| 32 | 1.9 ms | 79% |
| 64 | 990 us | 100% |
| 128 | 3.2 ms | 100% |
| 256 | 11.6 ms | 100% |
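Recall@10 in these tables is the fraction of the true 10 nearest neighbors (found by exhaustive search) that the approximate HNSW query returns. A small sketch of the metric, with made-up IDs:

```python
def recall_at_k(approx_ids, exact_ids, k=10):
    """Fraction of the true k nearest neighbors present in the ANN result."""
    return len(set(approx_ids[:k]) & set(exact_ids[:k])) / k

exact = [3, 7, 1, 9, 4, 0, 8, 2, 6, 5]    # ground-truth 10-NN (brute force)
approx = [3, 7, 1, 9, 4, 0, 8, 2, 6, 11]  # ANN result missing one true neighbor

print(recall_at_k(approx, exact))  # 0.9 -> reported as 90%
```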

## Optimization History

### Baseline (pre-optimization)

| Scale | Insert Rate | Search Mean | Recall@10 |
|---|---|---|---|
| 1K | ~91/sec | 1.7 ms | 100% |
| 10K | ~42/sec | 3.8 ms | 99% |
| 100K | ~23/sec | 4.5 ms | 99% |
| 1M | ~14/sec | 6.4 ms | 100% |

### Post-optimization (Phase 2)

Optimizations applied: last_page tracking, pre-sized search structures, stack-buffer connection I/O, cached vectors in heuristic pruning, and pre-normalization with dot-product cosine distance.

| Scale | Insert Rate | Search Mean | Recall@10 |
|---|---|---|---|
| 1K | ~954/sec | 65 us | 100% |
| 10K | ~726/sec | 174 us | 99% |
| 100K | ~526/sec | 438 us | 99% |
| 1M | ~248/sec | 832 us | 100% |
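One of the listed optimizations, pre-sized search structures, amounts to allocating per-query scratch buffers once for the whole index and reusing them across queries instead of growing them inside the search hot loop. A speculative sketch (not the actual implementation) using a generation counter so the visited set "clears" in O(1):

```python
class SearchScratch:
    """Per-query scratch space, allocated once for the whole index."""

    def __init__(self, node_count: int):
        self.visited = [0] * node_count  # generation tag per node, never reallocated
        self.generation = 0

    def begin_query(self) -> None:
        self.generation += 1             # O(1) "clear" of the visited set

    def first_visit(self, node_id: int) -> bool:
        """Return True the first time node_id is seen in the current query."""
        if self.visited[node_id] == self.generation:
            return False
        self.visited[node_id] = self.generation
        return True

scratch = SearchScratch(node_count=1_000_000)
scratch.begin_query()
assert scratch.first_visit(42)       # first touch this query
assert not scratch.first_visit(42)   # already visited
scratch.begin_query()                # next query: visited set is "empty" again
assert scratch.first_visit(42)
```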

### Improvement Summary

| Scale | Insert Speedup | Search Speedup |
|---|---|---|
| 1K | 10.5x | 26x |
| 10K | 17.5x | 22x |
| 100K | 22.8x | 10x |
| 1M | 17.7x | 7.7x |

## Reproducing

```sh
zig build benchmark                        # Core operation benchmarks
zig build vector-benchmark -- --quick      # Vector benchmarks (1K/10K/100K, ~7 min)
zig build vector-benchmark                 # Full vector benchmarks including 1M (~70 min)
zig build graph-benchmark -- --quick       # Graph traversal benchmarks
```