@jsonMartin jsonMartin commented Jan 30, 2026

Summary

This PR implements RFC-004: Flat Index for Binary Vectors, adding native binary vector support to EdgeVec with a specialized BinaryFlatIndex optimized for semantic caching and insert-heavy workloads.

Key Features

  • Native binary storage: StorageType::Binary(u32) - 32x memory reduction vs f32
  • BinaryFlatIndex: O(1) insert, O(n) search with SIMD-accelerated Hamming distance
  • WASM integration: Full JavaScript/TypeScript API with IndexType.binary(dimensions)
  • Automatic quantization: f32 vectors auto-converted to binary via sign-bit quantization
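As a rough illustration of the last two bullets, the sketch below packs an f32 vector into bits by sign and compares two packed vectors with Hamming distance. The function names, the LSB-first bit order, and the choice to map zero and negative values to 0 are assumptions made for this example; the PR description does not specify them, and the scalar loop stands in for the SIMD path.

```rust
/// Packs an f32 vector into bits: bit i is 1 when component i is positive.
/// Assumptions (not stated in the PR): LSB-first packing within each byte,
/// and values <= 0.0 (including NaN) map to 0.
pub fn quantize_sign_bits(vector: &[f32]) -> Vec<u8> {
    let mut packed = vec![0u8; (vector.len() + 7) / 8];
    for (i, &value) in vector.iter().enumerate() {
        if value > 0.0 {
            packed[i / 8] |= 1 << (i % 8);
        }
    }
    packed
}

/// Hamming distance between two packed binary vectors: the number of
/// bit positions in which they differ. Scalar version; a SIMD
/// implementation would compute the same value over wider lanes.
pub fn hamming_distance(a: &[u8], b: &[u8]) -> u32 {
    assert_eq!(a.len(), b.len(), "vectors must have the same packed length");
    a.iter().zip(b).map(|(x, y)| (x ^ y).count_ones()).sum()
}
```

Under these assumptions, a 9-dimensional input occupies 2 bytes, and the unused high bits of the last byte stay zero, so they never contribute to the distance.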

Performance Characteristics

| Operation | Complexity | Time (10K vectors) |
|-----------|------------|--------------------|
| Insert    | O(1)       | ~1 μs              |
| Search    | O(n)       | ~1 ms (SIMD)       |
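To make the insert and search complexities concrete, here is a minimal sketch of a flat binary index. `FlatBinarySketch` is a hypothetical type written for this example and is not the PR's `BinaryFlatIndex`. It assumes vectors are stored back to back in one byte buffer, so an insert is an append (amortized O(1)) and a search is a linear scan over every stored vector.

```rust
/// Hypothetical flat index over packed binary vectors (illustration only).
pub struct FlatBinarySketch {
    bytes_per_vector: usize,
    data: Vec<u8>, // all vectors stored contiguously, one after another
}

impl FlatBinarySketch {
    pub fn new(dimensions: usize) -> Self {
        assert!(dimensions > 0, "dimensions must be non-zero");
        Self { bytes_per_vector: (dimensions + 7) / 8, data: Vec::new() }
    }

    /// Number of stored vectors.
    pub fn len(&self) -> usize {
        self.data.len() / self.bytes_per_vector
    }

    /// Appends a packed vector and returns its id. No graph or tree is
    /// updated, so the cost is a single buffer append.
    pub fn insert(&mut self, packed: &[u8]) -> usize {
        assert_eq!(packed.len(), self.bytes_per_vector, "wrong packed length");
        let id = self.len();
        self.data.extend_from_slice(packed);
        id
    }

    /// Exhaustive search: computes the Hamming distance to every stored
    /// vector and returns the k closest as (id, distance) pairs.
    /// The full sort keeps the sketch short; a bounded heap would avoid
    /// sorting all n candidates.
    pub fn search(&self, query: &[u8], k: usize) -> Vec<(usize, u32)> {
        assert_eq!(query.len(), self.bytes_per_vector, "wrong packed length");
        let mut hits: Vec<(usize, u32)> = self
            .data
            .chunks_exact(self.bytes_per_vector)
            .enumerate()
            .map(|(id, stored)| {
                let distance = stored
                    .iter()
                    .zip(query)
                    .map(|(a, b)| (a ^ b).count_ones())
                    .sum::<u32>();
                (id, distance)
            })
            .collect();
        hits.sort_unstable_by_key(|&(id, distance)| (distance, id));
        hits.truncate(k);
        hits
    }
}
```

Because every stored vector is compared against the query, recall is exact, which matches the semantic caching use case described in this PR.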

Use Cases

  • Semantic caching (insert-heavy, exact recall required)
  • Datasets < 100K vectors
  • When insert latency is critical (~1μs vs ~2ms for HNSW)

Files Changed

  • src/flat/mod.rs - New BinaryFlatIndex implementation
  • src/storage/mod.rs - Added StorageType::Binary(u32) variant
  • src/wasm/mod.rs - WASM bindings for binary index creation and search
  • src/error.rs - Added BinaryFlatIndexError to unified error hierarchy
  • docs/rfcs/RFC_FLAT_INDEX.md - Design document

Relationship to FlatIndex (Week 40)

This PR complements the f32 FlatIndex added in Week 40 (upstream). The two serve different purposes:

| Feature  | BinaryFlatIndex (this PR) | FlatIndex (Week 40)  |
|----------|---------------------------|----------------------|
| Storage  | Native binary (u8)        | f32 with optional BQ |
| Use case | Semantic caching          | General flat search  |
| Memory   | 32x reduction             | 4x with BQ           |

Both coexist via the IndexType enum.
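The 32x memory figure follows from storing 1 bit per dimension instead of a 32-bit float. The helper names below are hypothetical and exist only to show the arithmetic; the 768-dimension figure used in the note after the code is an illustrative choice, not a number from this PR.

```rust
/// Bytes needed to store one vector as f32 components.
pub fn f32_bytes(dimensions: usize) -> usize {
    dimensions * std::mem::size_of::<f32>()
}

/// Bytes needed to store one vector packed at 1 bit per dimension,
/// rounded up to a whole byte.
pub fn packed_binary_bytes(dimensions: usize) -> usize {
    (dimensions + 7) / 8
}
```

For a 768-dimensional vector this gives 3072 bytes as f32 and 96 bytes packed, a ratio of 32. The ratio is slightly lower when the dimension count is not a multiple of 8, because of the padding bits in the last byte.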

Test plan

  • All 1019 library tests pass
  • cargo fmt --check passes
  • cargo clippy -- -D warnings passes
  • Hostile review completed - all findings addressed
  • Integration tests for binary vector operations
  • WASM tests for JS interop

Add comprehensive binary vector support, including:
- BinaryFlatIndex with Hamming and Jaccard distance metrics
- Native packed binary storage (8 bits per byte)
- WASM SIMD-accelerated Hamming distance computation
- Binary quantization for f32 vectors
- Full persistence support (snapshot save/load)
- Soft delete and compaction support
- Result type API for better error handling
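The commit message lists Jaccard distance alongside Hamming. As a hedged sketch of what that metric looks like on packed bits, the hypothetical function below treats each vector as the set of its 1-bits and returns 1 minus the ratio of intersection size to union size. Returning 0.0 when both vectors are all zeros is a convention chosen for this example, not something the PR states.

```rust
/// Jaccard distance between two packed binary vectors, treating each
/// vector as the set of positions whose bit is 1.
/// Assumption: two all-zero vectors are considered identical (distance 0.0).
pub fn jaccard_distance(a: &[u8], b: &[u8]) -> f32 {
    assert_eq!(a.len(), b.len(), "vectors must have the same packed length");
    let mut intersection_bits = 0u32;
    let mut union_bits = 0u32;
    for (x, y) in a.iter().zip(b) {
        intersection_bits += (x & y).count_ones();
        union_bits += (x | y).count_ones();
    }
    if union_bits == 0 {
        return 0.0;
    }
    1.0 - intersection_bits as f32 / union_bits as f32
}
```

Unlike Hamming distance, this metric ignores positions where both vectors are 0, which can matter for sparse binary vectors.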
@jsonMartin jsonMartin force-pushed the feat/binary-vector-support branch from 9d7cc7b to af09fda Compare January 30, 2026 04:45