Ensemble Reranking

Overview

Combine multiple reranking models or strategies to improve accuracy. NeuronDB supports weighted ensembles and Borda count for combining model scores; this page also covers MMR (Maximal Marginal Relevance) and RRF (Reciprocal Rank Fusion).

Weighted Ensemble

Combine multiple reranking models with custom weights. Each model's scores are weighted and summed to produce final scores.

Weighted ensemble reranking

-- Prepare scored results from multiple models
CREATE TABLE model1_scores (
    id INT,
    score REAL
);

CREATE TABLE model2_scores (
    id INT,
    score REAL
);

CREATE TABLE model3_scores (
    id INT,
    score REAL
);

-- Insert scores from different models
INSERT INTO model1_scores (id, score) VALUES
    (1, 0.95), (2, 0.90), (3, 0.85), (4, 0.60), (7, 0.70);

INSERT INTO model2_scores (id, score) VALUES
    (1, 0.80), (2, 0.70), (4, 0.95), (6, 0.75), (7, 0.85);

INSERT INTO model3_scores (id, score) VALUES
    (1, 0.88), (3, 0.82), (4, 0.90), (7, 0.92), (8, 0.65);

-- Weighted ensemble with equal weights
-- (assumes a documents table with id and content columns)
SELECT 
    d.id,
    d.content,
    e.final_score
FROM neurondb.rerank_ensemble_weighted(
    ARRAY['model1_scores', 'model2_scores', 'model3_scores']::text[],
    ARRAY[1.0, 1.0, 1.0]::real[],  -- Equal weights
    'id',                           -- ID column name
    'score'                         -- Score column name
) e
JOIN documents d ON d.id = e.id
ORDER BY e.final_score DESC;

-- Weighted ensemble with custom weights (prioritize model 1)
SELECT 
    d.id,
    d.content,
    e.final_score
FROM neurondb.rerank_ensemble_weighted(
    ARRAY['model1_scores', 'model2_scores', 'model3_scores']::text[],
    ARRAY[2.0, 1.0, 1.0]::real[],  -- Model 1 has 2x weight
    'id',
    'score'
) e
JOIN documents d ON d.id = e.id
ORDER BY e.final_score DESC;

Function Signature:

neurondb.rerank_ensemble_weighted(
    score_tables TEXT[],    -- Array of table names with scores
    weights REAL[],         -- Array of weights (one per table)
    id_column TEXT,         -- ID column name in score tables
    score_column TEXT       -- Score column name in score tables
) RETURNS TABLE (
    id INTEGER,             -- Document ID
    final_score REAL        -- Weighted combined score
)
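To make the weighted sum concrete, here is a minimal Python sketch that reproduces the arithmetic on the sample scores from the three score tables. It is an illustration only, not NeuronDB's implementation: the `weighted_ensemble` helper is hypothetical, and it assumes that an ID missing from a model contributes nothing to the total and that weights are applied without normalization.

```python
from collections import defaultdict

# Same sample scores as the model1_scores, model2_scores, model3_scores tables.
model_scores = [
    {1: 0.95, 2: 0.90, 3: 0.85, 4: 0.60, 7: 0.70},
    {1: 0.80, 2: 0.70, 4: 0.95, 6: 0.75, 7: 0.85},
    {1: 0.88, 3: 0.82, 4: 0.90, 7: 0.92, 8: 0.65},
]


def weighted_ensemble(score_maps, weights):
    """Return (id, final_score) pairs sorted by final_score, highest first."""
    if len(score_maps) != len(weights):
        raise ValueError("need exactly one weight per score map")
    totals = defaultdict(float)
    for scores, weight in zip(score_maps, weights):
        for doc_id, score in scores.items():
            # Assumption: an ID absent from a model simply adds nothing.
            totals[doc_id] += weight * score
    # Break ties on the ID so the output order is deterministic.
    return sorted(totals.items(), key=lambda item: (-item[1], item[0]))


equal = weighted_ensemble(model_scores, [1.0, 1.0, 1.0])
boosted = weighted_ensemble(model_scores, [2.0, 1.0, 1.0])
```

Under these assumptions both weightings rank document 1 first; doubling model 1's weight raises its total from about 2.63 to about 3.58.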

Borda Count Ensemble

Use Borda count voting to combine rankings. Each model votes for documents based on rank, and votes are summed.

Borda count ensemble

-- Borda count ensemble reranking
SELECT 
    d.id,
    d.content,
    e.borda_score
FROM neurondb.rerank_ensemble_borda(
    ARRAY['model1_scores', 'model2_scores', 'model3_scores']::text[],
    'id',                   -- ID column
    'score'                 -- Score column (used for ranking)
) e
JOIN documents d ON d.id = e.id
ORDER BY e.borda_score DESC;

Function Signature:

neurondb.rerank_ensemble_borda(
    score_tables TEXT[],
    id_column TEXT,
    score_column TEXT
) RETURNS TABLE (
    id INTEGER,
    borda_score REAL        -- Borda count score (higher = better)
)

How Borda Count Works:

  • Each model ranks documents by score (highest score = rank 1)
  • Documents receive points based on rank: rank 1 gets N points, rank 2 gets N-1 points, and so on, where N is the number of documents being ranked
  • Points from all models are summed to get final Borda score
  • Higher Borda score = better overall ranking across all models
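The steps above can be sketched in a few lines of Python. This is a hedged illustration rather than NeuronDB's actual code: the `borda_ensemble` helper is hypothetical, N is assumed to be the number of distinct IDs across all models, and a document that a model did not score is assumed to receive no points from that model.

```python
# Same sample scores as the model1_scores, model2_scores, model3_scores tables.
model_scores = [
    {1: 0.95, 2: 0.90, 3: 0.85, 4: 0.60, 7: 0.70},
    {1: 0.80, 2: 0.70, 4: 0.95, 6: 0.75, 7: 0.85},
    {1: 0.88, 3: 0.82, 4: 0.90, 7: 0.92, 8: 0.65},
]


def borda_ensemble(score_maps):
    """Return (id, borda_score) pairs sorted by borda_score, highest first."""
    candidates = set()
    for scores in score_maps:
        candidates.update(scores)
    n = len(candidates)  # Assumption: N = distinct IDs across all models.
    totals = {doc_id: 0 for doc_id in candidates}
    for scores in score_maps:
        # Highest score gets rank 1; ties are broken on the ID.
        ranked = sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))
        for rank, doc_id in enumerate(ranked, start=1):
            totals[doc_id] += n - rank + 1  # rank 1 -> N points, rank 2 -> N-1
    return sorted(totals.items(), key=lambda item: (-item[1], item[0]))


borda = borda_ensemble(model_scores)
```

With the sample data and these assumptions, documents 1 and 7 tie at 17 points and document 4 follows with 16. Because only ranks are counted, large score gaps within one model do not affect the outcome.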

MMR (Maximal Marginal Relevance) Reranking

Balance relevance and diversity using MMR. Higher lambda values prioritize relevance, lower values prioritize diversity.

MMR reranking

-- MMR reranking with scores
-- (the function returns only id and score, so join to documents for content)
SELECT
    d.id,
    d.content,
    m.score
FROM neurondb.mmr_rerank_with_scores(
    'documents',            -- table name
    'embedding',            -- vector column
    '[0.1, 0.2, 0.3]'::vector,  -- query vector
    5,                      -- top_k
    0.7                     -- lambda: closer to 1.0 favors relevance, closer to 0.0 favors diversity
) m
JOIN documents d ON d.id = m.id
ORDER BY m.score DESC;

-- MMR without scores (the function returns only IDs; join to documents for content)
SELECT d.id, d.content
FROM neurondb.mmr_rerank(
    'documents',
    'embedding',
    '[0.1, 0.2, 0.3]'::vector,
    5,
    0.7
) m
JOIN documents d ON d.id = m.id;

Function Signatures:

neurondb.mmr_rerank(
    table_name TEXT,
    vector_column TEXT,
    query_vector VECTOR,
    top_k INTEGER,
    lambda REAL             -- 0.0-1.0: 1.0 = pure relevance, 0.0 = pure diversity
) RETURNS TABLE (id INTEGER)

neurondb.mmr_rerank_with_scores(
    table_name TEXT,
    vector_column TEXT,
    query_vector VECTOR,
    top_k INTEGER,
    lambda REAL
) RETURNS TABLE (id INTEGER, score REAL)
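To show how lambda trades relevance against diversity, the following Python sketch implements the textbook greedy MMR selection. It is not NeuronDB's implementation: the `mmr_rerank` and `cosine` helpers and the toy vectors are hypothetical, and cosine similarity is an assumed choice of similarity measure.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mmr_rerank(docs, query, top_k, lam):
    """Greedy MMR: return (id, mmr_score) pairs in selection order.

    docs maps an ID to its embedding; lam is the lambda parameter in [0, 1].
    """
    remaining = dict(docs)
    selected = []
    while remaining and len(selected) < top_k:
        best_id, best_score = None, -math.inf
        for doc_id, vec in remaining.items():
            relevance = cosine(query, vec)
            # Redundancy: similarity to the closest already-selected document.
            redundancy = max(
                (cosine(vec, docs[chosen]) for chosen, _ in selected),
                default=0.0,
            )
            score = lam * relevance - (1.0 - lam) * redundancy
            if score > best_score:
                best_id, best_score = doc_id, score
        selected.append((best_id, best_score))
        del remaining[best_id]
    return selected


# Toy data: document 2 is a near-duplicate of document 1, document 3 is distinct.
toy_docs = {1: [1.0, 0.0], 2: [0.99, 0.1], 3: [0.6, 0.8]}
toy_query = [1.0, 0.0]

relevance_only = [doc_id for doc_id, _ in mmr_rerank(toy_docs, toy_query, 3, 1.0)]
diverse = [doc_id for doc_id, _ in mmr_rerank(toy_docs, toy_query, 3, 0.3)]
```

With lambda = 1.0 the near-duplicate document 2 is picked second because only relevance counts. With lambda = 0.3 the redundancy penalty pushes it behind the less relevant but more distinct document 3.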

RRF (Reciprocal Rank Fusion)

Combine multiple ranking lists using Reciprocal Rank Fusion. Because RRF uses only each document's rank, not its raw score, it is robust to outlier scores and works well when the ranking sources use different score scales.

Reciprocal Rank Fusion

-- Create ranking lists from different sources
CREATE TABLE semantic_ranking (
    id INT,
    rank INT
);

CREATE TABLE keyword_ranking (
    id INT,
    rank INT
);

-- Insert rankings
INSERT INTO semantic_ranking (id, rank) VALUES
    (1, 1), (2, 2), (3, 3), (7, 4), (4, 5);

INSERT INTO keyword_ranking (id, rank) VALUES
    (4, 1), (1, 2), (7, 3), (6, 4), (2, 5);

-- RRF fusion
SELECT 
    d.id,
    d.content,
    rrf.score
FROM neurondb.reciprocal_rank_fusion(
    ARRAY['semantic_ranking', 'keyword_ranking']::text[],
    'id',                   -- ID column name
    'rank',                 -- Rank column name
    60                      -- k parameter (typically 60)
) rrf
JOIN documents d ON d.id = rrf.id
ORDER BY rrf.score DESC;

Function Signature:

neurondb.reciprocal_rank_fusion(
    ranking_tables TEXT[],  -- Array of table names with rankings
    id_column TEXT,         -- ID column name
    rank_column TEXT,       -- Rank column name
    k INTEGER               -- RRF constant (typically 60)
) RETURNS TABLE (
    id INTEGER,
    score REAL              -- RRF score (higher = better)
)

RRF Formula: score = Σ(1 / (k + rank)) across all ranking lists. The k parameter (typically 60) dampens the gap between top ranks, so a first-place rank in a single list cannot dominate the fused score.
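As a worked example of the formula, this Python sketch fuses the sample semantic and keyword rankings with k = 60. The `reciprocal_rank_fusion` helper here is a hypothetical stand-in written for illustration, and it assumes that a document absent from a list contributes nothing for that list.

```python
from collections import defaultdict

# Same sample ranks as the semantic_ranking and keyword_ranking tables (id -> rank).
semantic_ranking = {1: 1, 2: 2, 3: 3, 7: 4, 4: 5}
keyword_ranking = {4: 1, 1: 2, 7: 3, 6: 4, 2: 5}


def reciprocal_rank_fusion(rankings, k=60):
    """Return (id, rrf_score) pairs sorted by rrf_score, highest first."""
    scores = defaultdict(float)
    for ranking in rankings:
        for doc_id, rank in ranking.items():
            # Assumption: an ID absent from a list adds nothing for that list.
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


fused = reciprocal_rank_fusion([semantic_ranking, keyword_ranking], k=60)
fused_ids = [doc_id for doc_id, _ in fused]
```

Document 1 comes out on top with 1/61 + 1/62 (about 0.0325), since it sits near the top of both lists, while documents 3 and 6 land last because each appears in only one list.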

Learn More

For detailed documentation on ensemble strategies, weight optimization, and combining rerankers, visit: Ensemble Reranking Documentation

Related Topics