Embedding Engine
Overview
The Embedding Engine provides multi-modal embedding generation capabilities, transforming text, images, and mixed data into dense vector representations using state-of-the-art transformer models. With support for OpenAI, Cohere, HuggingFace, and custom models, the Embedding Engine enables semantic search, similarity matching, and AI-powered applications directly in PostgreSQL.
Key Features
- Multi-Modal Support: Text, images, audio, and mixed data embeddings
- Multiple Providers: OpenAI, Cohere, HuggingFace, and custom models
- Automatic Caching: Intelligent caching to reduce API calls and latency
- Batch Processing: Efficient batch generation for high-throughput scenarios
- GPU Acceleration: Automatic GPU offload for transformer models
- Model Management: Version control, A/B testing, and model switching
What Are Embeddings?
Embeddings are dense vector representations that capture semantic meaning in a high-dimensional space. Unlike traditional keyword-based representations, embeddings encode contextual relationships, enabling machines to understand similarity and meaning across different data types.
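The idea can be shown in miniature: once data is mapped to vectors, "how similar are these?" reduces to a distance computation. A minimal sketch with hand-made 3-dimensional vectors (real models produce hundreds or thousands of dimensions):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity: 1.0 = same direction, 0.0 = orthogonal."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Hand-made toy vectors standing in for real model output.
queen = [0.9, 0.8, 0.1]
king  = [0.8, 0.9, 0.2]
apple = [0.1, 0.2, 0.9]

print(cosine_similarity(queen, king))   # high: related concepts
print(cosine_similarity(queen, apple))  # low: unrelated concepts
```

Related concepts end up pointing in similar directions, so their cosine similarity is high even though the surface strings share no keywords.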
Why Use Embeddings?
- Semantic Understanding: Find conceptually similar content, not just exact matches
- Language Independence: Similar concepts in different languages have similar embeddings
- Cross-Modal Search: Search images with text, or text with images
- Context Awareness: Understand meaning based on surrounding context
Text Embeddings
Generate embeddings from text using various transformer models optimized for different use cases, languages, and quality requirements.
Basic Usage
Generate text embedding
-- Generate embedding for a single text
SELECT embed_text('Machine learning with PostgreSQL', 'text-embedding-ada-002');
-- Use in similarity search
SELECT
id,
content,
embedding <=> embed_text('PostgreSQL vector search', 'text-embedding-ada-002') AS distance
FROM documents
ORDER BY distance
LIMIT 10;
Batch Generation
Generate embeddings for multiple texts efficiently with automatic batching and parallel processing.
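Seen from the client side, the core of batching is simply never calling the provider one text at a time. A sketch under that assumption, where `embed_batch` stands in for any provider API that accepts a list of texts:

```python
def chunked(items, size):
    """Yield fixed-size chunks so each provider call embeds many texts at once."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

def embed_all(texts, embed_batch, batch_size=100):
    """Embed every text using as few provider round-trips as possible."""
    vectors = []
    for batch in chunked(texts, batch_size):
        vectors.extend(embed_batch(batch))  # one API round-trip per batch
    return vectors

# Fake provider: maps each text to a trivial 2-d "embedding".
fake_provider = lambda batch: [[len(t), t.count(" ")] for t in batch]
print(embed_all(["a b", "c", "d e f"], fake_provider, batch_size=2))
# → [[3, 1], [1, 0], [5, 2]]
```

The results come back in input order, which is what lets the engine zip them back onto the original rows.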
Batch text embeddings
-- Generate embeddings for multiple texts
SELECT embed_text_batch(
ARRAY[
'First document text',
'Second document text',
'Third document text'
],
'text-embedding-ada-002'
) AS embeddings;
-- Bulk insert with embeddings
INSERT INTO documents (content, embedding)
SELECT
content,
embed_text(content, 'text-embedding-ada-002')
FROM source_documents
WHERE embedding IS NULL;
Supported Text Models
| Provider | Model | Dimensions | Best For |
|---|---|---|---|
| OpenAI | text-embedding-ada-002 | 1536 | General purpose, production-ready |
| OpenAI | text-embedding-3-small | 1536 | Cost-effective, high quality |
| OpenAI | text-embedding-3-large | 3072 | Maximum quality, larger vectors |
| Cohere | embed-english-v3.0 | 1024 | English text, high quality |
| Cohere | embed-multilingual-v3.0 | 1024 | 100+ languages |
| HuggingFace | all-MiniLM-L6-v2 | 384 | Fast, lightweight, self-hosted |
| HuggingFace | all-mpnet-base-v2 | 768 | High quality, self-hosted |
| HuggingFace | paraphrase-multilingual-MiniLM | 384 | 50+ languages, self-hosted |
Image Embeddings
Generate embeddings from images using CLIP (Contrastive Language-Image Pre-training) models, enabling cross-modal search between images and text.
Basic Usage
Generate image embedding
-- Generate embedding from image file
SELECT embed_image('/path/to/image.jpg', 'CLIP-ViT-B-32');
-- Generate from image URL
SELECT embed_image_url('https://example.com/image.jpg', 'CLIP-ViT-B-32');
-- Generate from base64 encoded image
SELECT embed_image_base64(base64_data, 'CLIP-ViT-B-32')
FROM image_data;
Image Search
Search images by visual similarity or using text descriptions.
Image similarity search
-- Find similar images
SELECT
id,
image_path,
image_embedding <=> query_embedding AS distance
FROM images,
(SELECT embed_image('/path/to/query.jpg', 'CLIP-ViT-B-32') AS query_embedding) q
ORDER BY distance
LIMIT 10;
-- Search images with text
SELECT
id,
image_path,
image_embedding <=> embed_text('a red sports car', 'CLIP-ViT-B-32') AS distance
FROM images
ORDER BY distance
LIMIT 10;
Supported Image Models
| Model | Dimensions | Best For |
|---|---|---|
| CLIP-ViT-B-32 | 512 | General purpose, fast |
| CLIP-ViT-L-14 | 768 | High quality, detailed images |
| CLIP-ViT-B-16 | 512 | Balanced quality and speed |
Multi-Modal Embeddings
Combine text and image embeddings in the same vector space, enabling cross-modal search and unified semantic understanding across different data types.
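Why this works can be seen in miniature: once every modality lands in one vector space, nearest-neighbor search does not care which modality produced a vector. The toy 3-d vectors below are illustrative stand-ins, not real CLIP output:

```python
import math

def distance(a, b):
    """Cosine distance, the quantity the <=> operator computes."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (na * nb)

# One collection, two modalities, one shared space.
content = [
    ("image", "sunset.jpg",       [0.9, 0.1, 0.3]),
    ("text",  "mountain sunsets", [0.8, 0.2, 0.4]),
    ("image", "spreadsheet.png",  [0.1, 0.9, 0.2]),
]
query = [0.85, 0.15, 0.35]  # stand-in for an embedded text query

# Rank everything, regardless of modality, by distance to the query.
for kind, name, vec in sorted(content, key=lambda c: distance(query, c[2])):
    print(kind, name)
```

Both sunset items rank ahead of the unrelated image, even though one is text and one is an image path.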
Cross-Modal Search
Cross-modal search
-- Search images with text
SELECT
id,
image_path,
image_embedding <=> embed_text('a sunset over mountains', 'CLIP-ViT-B-32') AS distance
FROM images
ORDER BY distance
LIMIT 10;
-- Search text with images
SELECT
id,
content,
text_embedding <=> embed_image('/path/to/query.jpg', 'CLIP-ViT-B-32') AS distance
FROM documents
ORDER BY distance
LIMIT 10;
Unified Embedding Space
Store text and image embeddings in the same table and search across both modalities simultaneously.
Unified search
-- Create unified content table
CREATE TABLE content (
id SERIAL PRIMARY KEY,
type TEXT, -- 'text' or 'image'
content TEXT, -- text content or image path
embedding vector(512) -- CLIP embeddings
);
-- Insert text and images
INSERT INTO content (type, content, embedding)
SELECT
'text',
content,
embed_text(content, 'CLIP-ViT-B-32')
FROM text_documents;
INSERT INTO content (type, content, embedding)
SELECT
'image',
image_path,
embed_image(image_path, 'CLIP-ViT-B-32')
FROM images;
-- Search across all content types
SELECT
type,
content,
embedding <=> embed_text('nature photography', 'CLIP-ViT-B-32') AS distance
FROM content
ORDER BY distance
LIMIT 20;
Supported Models
OpenAI Models
- text-embedding-ada-002: General purpose, 1536 dimensions, production-ready
- text-embedding-3-small: Cost-effective, 1536 dimensions
- text-embedding-3-large: Maximum quality, 3072 dimensions
Cohere Models
- embed-english-v3.0: High-quality English embeddings, 1024 dimensions
- embed-multilingual-v3.0: 100+ languages, 1024 dimensions
HuggingFace Models
- all-MiniLM-L6-v2: Fast, lightweight, 384 dimensions
- all-mpnet-base-v2: High quality, 768 dimensions
- paraphrase-multilingual-MiniLM: 50+ languages, 384 dimensions
CLIP Models
- CLIP-ViT-B-32: General purpose, 512 dimensions
- CLIP-ViT-L-14: High quality, 768 dimensions
Custom Models
Deploy custom transformer models in ONNX format for specialized use cases.
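For orientation, the inference path behind a custom text encoder typically looks like the sketch below. The `session` parameter is assumed to be an ONNX Runtime `InferenceSession`, and the input/output names in `embed_with_onnx` are hypothetical; only the pooling and normalization helpers are concrete. Check your model's actual signature before relying on any of it:

```python
import math

def mean_pool(token_vectors):
    """Average per-token vectors into one fixed-size embedding."""
    dim = len(token_vectors[0])
    n = len(token_vectors)
    return [sum(v[i] for v in token_vectors) / n for i in range(dim)]

def l2_normalize(vec):
    """Unit-length vectors make cosine distance a plain dot product."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

def embed_with_onnx(session, token_ids):
    """Run an ONNX text encoder, then pool and normalize its token states.

    Assumes an input named 'input_ids' and a single output of per-token
    hidden states -- hypothetical names, not a fixed convention.
    """
    (token_states,) = session.run(None, {"input_ids": [token_ids]})
    return l2_normalize(mean_pool(token_states[0]))

# The post-processing helpers on two 3-d token vectors:
print(l2_normalize(mean_pool([[1.0, 0.0, 2.0], [3.0, 0.0, 2.0]])))
# → roughly [0.707, 0.0, 0.707]
```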
Deploy custom model
-- Deploy custom ONNX model
SELECT deploy_embedding_model(
model_name => 'custom_text_encoder',
model_path => '/path/to/model.onnx',
input_type => 'text',
output_dim => 768
);
-- Use custom model
SELECT embed_text('sample text', 'custom_text_encoder');
Caching & Performance
Automatic Caching
The Embedding Engine automatically caches embeddings to reduce API calls, latency, and costs. Identical inputs return cached results instantly.
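The same idea can be prototyped client-side in a few lines. A sketch of a size- and TTL-bounded cache keyed on (text, model), mirroring the cache size and TTL settings the engine exposes:

```python
import time
from collections import OrderedDict

class EmbeddingCache:
    """LRU cache with a TTL, keyed on (text, model)."""
    def __init__(self, max_size=10_000, ttl=86_400):
        self.max_size, self.ttl = max_size, ttl
        self._store = OrderedDict()  # key -> (timestamp, vector)

    def get_or_compute(self, text, model, compute):
        key = (text, model)
        entry = self._store.get(key)
        if entry and time.time() - entry[0] < self.ttl:
            self._store.move_to_end(key)        # refresh LRU position
            return entry[1]
        vector = compute(text)                   # cache miss: call provider
        self._store[key] = (time.time(), vector)
        self._store.move_to_end(key)
        if len(self._store) > self.max_size:
            self._store.popitem(last=False)      # evict least recently used
        return vector

calls = []
cache = EmbeddingCache()
embed = lambda t: (calls.append(t) or [float(len(t))])
print(cache.get_or_compute("hello", "ada", embed))  # computed
print(cache.get_or_compute("hello", "ada", embed))  # served from cache
print(len(calls))  # → 1
```

Identical (text, model) pairs hit the provider exactly once until the entry ages out or is evicted.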
Caching configuration
-- Configure embedding cache
SET neurondb.embedding_cache_size = 10000; -- Cache 10K embeddings
SET neurondb.embedding_cache_ttl = 86400; -- 24 hour TTL
-- Check cache statistics
SELECT * FROM neurondb_embedding_cache_stats();
-- Clear cache
SELECT clear_embedding_cache();
Batch Processing
Process multiple embeddings in parallel for improved throughput and efficiency.
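Client-side, batching plus parallel dispatch can be sketched with a thread pool; the `embed_batch` callable stands in for any provider call, and the threads exist to overlap its network latency:

```python
from concurrent.futures import ThreadPoolExecutor

def embed_parallel(texts, embed_batch, batch_size=100, workers=4):
    """Split texts into batches and embed the batches concurrently.

    Results come back in input order because executor.map preserves
    the order of its inputs regardless of completion order.
    """
    batches = [texts[i:i + batch_size] for i in range(0, len(texts), batch_size)]
    vectors = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for result in pool.map(embed_batch, batches):
            vectors.extend(result)
    return vectors

# Stub provider: one "vector" per text.
stub = lambda batch: [[len(t)] for t in batch]
print(embed_parallel(["a", "bb", "ccc", "dddd"], stub, batch_size=2, workers=2))
# → [[1], [2], [3], [4]]
```

Threads (rather than processes) are the right fit here because the work is I/O-bound API calls, not CPU-bound computation.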
Batch processing
-- Batch processing with automatic parallelization
SELECT embed_text_batch(
texts,
'text-embedding-ada-002',
batch_size => 100,
parallel => true
)
FROM (
SELECT array_agg(content) AS texts
FROM documents
WHERE embedding IS NULL
) batch;
Performance Metrics
- Single Embedding (API): 50-200ms (depends on provider)
- Single Embedding (Cached): < 1ms
- Batch (100 items, API): 200-500ms
- Batch (100 items, Local): 10-50ms (GPU: 2-10ms)
Optimization Tips
- Enable caching for frequently accessed content
- Use batch processing for bulk operations
- Deploy local models (HuggingFace) for low-latency requirements
- Use GPU acceleration for local transformer models
- Pre-compute embeddings during data ingestion
Use Cases
Semantic Search
Find documents, products, or content based on meaning rather than exact keywords.
Recommendation Systems
Recommend similar items, users, or content based on embedding similarity.
Image Search
Search images by visual similarity or text descriptions using CLIP embeddings.
Content Moderation
Identify similar content, detect duplicates, and flag inappropriate material.
Multilingual Search
Search across multiple languages using multilingual embedding models.
RAG Applications
Generate embeddings for retrieval augmented generation (RAG) pipelines.
Related Documentation
- Embeddings Guide - Detailed embedding documentation
- Embedding Generation - Advanced embedding techniques
- Vector Engine - Index and search embeddings
- GPU Accelerator - Accelerate embedding generation
- RAG Pipelines - Build RAG applications with embeddings