LLM Reranking

Overview

GPT/Claude-powered scoring for reranking.

Rerank search results using Large Language Models (LLMs) for semantic relevance scoring. LLM reranking provides high-quality relevance assessment, but it typically adds latency and per-call API cost compared with cross-encoder reranking.

Basic LLM Reranking

LLM reranking

-- LLM reranking using ndb_llm_rerank function
WITH documents AS (
    SELECT ARRAY[
        'PostgreSQL is a powerful relational database',
        'Machine learning models can be trained in SQL',
        'Vector search enables semantic similarity',
        'RAG pipelines combine retrieval and generation',
        'NeuronDB extends PostgreSQL with ML capabilities'
    ] AS docs
)
SELECT 
    idx,
    score,
    docs[idx] AS document
FROM documents,
    LATERAL ndb_llm_rerank(
        'machine learning',              -- query text
        docs,                            -- candidate documents array
        'ms-marco-MiniLM-L-6-v2',        -- model name (optional)
        5                                -- top K results
    ) AS rerank_result
ORDER BY score DESC;

Function Signature:

ndb_llm_rerank(
    query TEXT,              -- Query text
    documents TEXT[],        -- Array of candidate document texts
    model_name TEXT,         -- Optional model name
    top_k INTEGER            -- Number of top results to return
) RETURNS TABLE (
    idx INTEGER,             -- Index in documents array
    score REAL               -- Relevance score (higher = more relevant)
)
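
In practice the candidates usually live in a table rather than in a literal array. The sketch below shows one way to map the returned idx back to a row key. The articles table and its columns are hypothetical, and the query assumes idx is 1-based, as the docs[idx] lookup in the basic example suggests.

```sql
-- Hypothetical table, used only for this illustration.
CREATE TEMP TABLE articles (
    id   INTEGER PRIMARY KEY,
    body TEXT NOT NULL
);

INSERT INTO articles (id, body) VALUES
    (101, 'PostgreSQL is a powerful relational database'),
    (102, 'Machine learning models can be trained in SQL'),
    (103, 'Vector search enables semantic similarity');

-- Build two arrays with the same ordering so that position N in "docs"
-- always corresponds to position N in "ids".
WITH candidates AS (
    SELECT
        array_agg(id   ORDER BY id) AS ids,
        array_agg(body ORDER BY id) AS docs
    FROM articles
)
SELECT
    c.ids[r.idx]  AS article_id,   -- assumes idx is 1-based
    r.score,
    c.docs[r.idx] AS body
FROM candidates AS c,
    LATERAL ndb_llm_rerank(
        'machine learning',          -- query text
        c.docs,                      -- candidate documents array
        'ms-marco-MiniLM-L-6-v2',    -- model name, as in the basic example
        2                            -- top K results
    ) AS r
ORDER BY r.score DESC;
```

Aggregating both arrays with the same ORDER BY keeps the index-to-key mapping stable between runs.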

Configuration

Configure LLM providers and API keys for reranking:

LLM configuration

-- Set Hugging Face API key
SET neurondb.llm_api_key = 'your-huggingface-api-key';

-- Set Hugging Face endpoint
SET neurondb.huggingface_endpoint = 'https://api-inference.huggingface.co';

-- For OpenAI (if supported)
SET neurondb.openai_api_key = 'your-openai-api-key';
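
A plain SET lasts for the rest of the session. If the key should apply to a single unit of work only, standard PostgreSQL SET LOCAL scopes it to the current transaction. The following is a minimal sketch that assumes NeuronDB reads these settings when ndb_llm_rerank is called, which this page does not state explicitly.

```sql
BEGIN;

-- SET LOCAL is standard PostgreSQL: the value reverts at COMMIT or ROLLBACK.
SET LOCAL neurondb.llm_api_key = 'your-huggingface-api-key';
SET LOCAL neurondb.huggingface_endpoint = 'https://api-inference.huggingface.co';

-- Rerank while the transaction-scoped settings are in effect.
SELECT idx, score
FROM ndb_llm_rerank(
    'machine learning',
    ARRAY[
        'PostgreSQL is a powerful relational database',
        'Machine learning models can be trained in SQL'
    ],
    'ms-marco-MiniLM-L-6-v2',
    2
)
ORDER BY score DESC;

COMMIT;
```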

Performance Considerations

Latency: LLM reranking is slower than cross-encoder reranking because each call goes out to a remote API.
Cost: Each reranking call may incur API charges from the provider.
Batch Processing: Consider batching multiple queries for efficiency.
Fallback: Use a cross-encoder as a fallback if the LLM provider is unavailable.
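
As an illustration of the batch processing point, several queries can be reranked in one SQL statement by joining a set of queries to ndb_llm_rerank with LATERAL. This is only a sketch: the queries CTE is hypothetical, and it saves client round trips but may still issue one provider request per query, since this page does not say whether NeuronDB combines them.

```sql
WITH queries (query_id, query_text) AS (
    -- Hypothetical query set for illustration.
    VALUES
        (1, 'machine learning'),
        (2, 'vector search')
),
documents AS (
    SELECT ARRAY[
        'PostgreSQL is a powerful relational database',
        'Machine learning models can be trained in SQL',
        'Vector search enables semantic similarity',
        'RAG pipelines combine retrieval and generation',
        'NeuronDB extends PostgreSQL with ML capabilities'
    ] AS docs
)
SELECT
    q.query_id,
    q.query_text,
    r.idx,
    r.score,
    d.docs[r.idx] AS document
FROM queries AS q
CROSS JOIN documents AS d
-- The function is evaluated once per row of "queries".
CROSS JOIN LATERAL ndb_llm_rerank(
    q.query_text,
    d.docs,
    'ms-marco-MiniLM-L-6-v2',
    2                                -- top K results per query
) AS r
ORDER BY q.query_id, r.score DESC;
```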

Learn More

For detailed documentation on LLM reranking, model configuration, cost optimization, and prompt engineering, visit: LLM Reranking Documentation
