Performance

Sub-millisecond vector searches, 100x GPU acceleration, SIMD-optimized operations, and intelligent caching for production-scale AI workloads.

Performance Benchmarks

Test Environment: AWS r6i.2xlarge (8 vCPU, 64GB RAM), 10M vectors, 768 dimensions

Operation	Throughput	Latency (p95)	Notes
Vector Insert	50K/sec	2ms	Bulk COPY
HNSW Search (k=10)	10K QPS	5ms	ef_search=40
Embedding Generation	1K/sec	10ms	Batch size 32
Hybrid Search	5K QPS	8ms	Vector+FTS
Reranking	2K/sec	15ms	Cross-encoder
GPU K-Means	55K vectors/sec	18ms	10 clusters
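
As a rough sanity check on the test environment, the raw size of the benchmark dataset can be worked out directly. This is a minimal sketch that assumes each dimension is stored as a 4-byte float (an assumption; the storage format is not stated here) and ignores index and per-row overhead:

```sql
-- Raw storage for 10M vectors x 768 dimensions,
-- ASSUMING 4 bytes per dimension (not confirmed by the benchmark notes).
SELECT 10000000::bigint * 768 * 4                 AS raw_bytes,
       round(10000000::bigint * 768 * 4 / 1e9, 2) AS raw_gb;
-- raw_bytes = 30720000000, raw_gb = 30.72
```

Under that assumption the raw vectors occupy about 30.72 GB, roughly half of the 64GB of RAM on the test instance, before any index structures are counted.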

Optimization Techniques

1. SIMD Acceleration

Distance calculations are automatically optimized with SIMD (Single Instruction, Multiple Data) instructions: AVX2 or AVX-512 on x86, and NEON on ARM.

AVX2 Speedup: 4-8x
AVX-512 Speedup: 8-16x
Detection: Auto
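
For reference, the work that SIMD speeds up is a per-dimension loop over the two vectors. The sketch below uses only core PostgreSQL (no extension functions) and made-up sample arrays to compute a Euclidean distance one dimension at a time. It is the scalar baseline, not the extension's actual implementation:

```sql
-- Scalar Euclidean (L2) distance between two small example vectors.
-- unnest(a, b) in FROM walks both arrays in lockstep, one dimension per row;
-- a SIMD implementation would process several of these dimensions per instruction.
SELECT sqrt(sum((x - y) ^ 2)) AS l2_distance
FROM unnest(ARRAY[1, 2, 3]::float8[],
            ARRAY[4, 6, 3]::float8[]) AS dims(x, y);
-- l2_distance = 5  (sqrt(9 + 16 + 0))
```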

2. Intelligent Caching

Embedding Cache: 95%+ hit rate, 50x faster than generation
Model Cache: Models loaded in shared memory, 99.8% hit rate
ANN Buffer: Hot centroids and entry points cached
Index Page Cache: 92%+ hit rate for frequently accessed vectors

3. Query Planning

The cost-based query planner chooses an execution path according to the shape of the query:

• Small result sets → Sequential scan
• Medium result sets → IVF index
• Large result sets → HNSW index
• GPU available + large batch → GPU acceleration
• Hybrid query → Parallel vector + FTS execution
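
To see which path the planner picked for a given query, PostgreSQL's EXPLAIN can be used. The sketch below is an analogy only: it uses a hypothetical temp table (plan_demo) with an ordinary B-tree index as a stand-in, because the exact vector operators and index types are not shown in this section. The same EXPLAIN workflow should apply to vector queries:

```sql
-- Stand-in table with an ordinary B-tree primary key (not a vector index).
CREATE TEMP TABLE plan_demo (id int PRIMARY KEY, payload text);
INSERT INTO plan_demo
SELECT g, md5(g::text) FROM generate_series(1, 100000) AS g;
ANALYZE plan_demo;

-- Highly selective predicate: the planner usually prefers the index.
EXPLAIN SELECT * FROM plan_demo WHERE id = 42;

-- Predicate matching every row: a sequential scan is usually cheaper.
EXPLAIN SELECT * FROM plan_demo WHERE id > 0;
```

The first line of each plan names the chosen scan node, which makes it easy to confirm whether an index is actually being used.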

Best Practices

1. Index Selection

Dataset Size	Recommended Index	Parameters
< 100K vectors	HNSW	m=16, ef=200
100K - 10M vectors	HNSW or IVF	m=32, ef=400 or nlist=sqrt(n)
> 10M vectors	IVF + PQ	nlist=4000, PQ compression
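
The nlist=sqrt(n) rule of thumb in the table can be evaluated for a few dataset sizes. This is a small worked example in core PostgreSQL; rounding up with ceil is a choice made here for illustration, not something the table specifies:

```sql
-- nlist = sqrt(n) for a few dataset sizes (n = number of vectors).
SELECT n, ceil(sqrt(n))::int AS nlist
FROM (VALUES (100000), (1000000), (10000000)) AS sizes(n)
ORDER BY n;
-- n = 100000   -> nlist = 317
-- n = 1000000  -> nlist = 1000
-- n = 10000000 -> nlist = 3163
```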

2. Use Batch Operations

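-- Good: Batch embedding generation (5x faster)
-- Assumes embed_text_batch returns one embedding per input, in input order.
UPDATE docs SET embedding = batch.emb
FROM (
  SELECT unnest(ids) AS id, unnest(embed_text_batch(contents)) AS emb
  FROM (
    SELECT array_agg(id ORDER BY id) AS ids,
           array_agg(content ORDER BY id) AS contents
    FROM docs GROUP BY id % 100
  ) grouped
) batch WHERE docs.id = batch.id;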

-- Bad: Individual calls
UPDATE docs SET embedding = embed_text(content);  -- Slow!

3. Monitor Cache Hit Rates

SELECT * FROM neurondb_cache_stats();

-- Target hit rates:
--   Embeddings: > 50%
--   Models: > 95%
--   Index pages: > 90%
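
As a cross-check on the index page figure, PostgreSQL's own statistics views report buffer hits and disk reads per index. The sketch below uses the standard pg_statio_user_indexes view. Whether its numbers correspond exactly to the "Index pages" rate reported by neurondb_cache_stats() is an assumption:

```sql
-- Buffer cache hit percentage per index, from PostgreSQL's built-in statistics.
-- nullif avoids division by zero for indexes that have never been read.
SELECT indexrelname,
       idx_blks_hit,
       idx_blks_read,
       round(100.0 * idx_blks_hit
             / nullif(idx_blks_hit + idx_blks_read, 0), 2) AS hit_pct
FROM pg_statio_user_indexes
ORDER BY idx_blks_hit + idx_blks_read DESC
LIMIT 10;
```

Indexes with a hit_pct well below the 90% target are candidates for a closer look, for example by checking whether shared_buffers is large enough for the working set.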