Reranker
Reranking is an optional post-retrieval step that re-scores retrieved results with a dedicated relevance model (a cross-encoder or an LLM). It improves relevance at the cost of extra latency and, for hosted models, a per-query cost.
How It Works
Query → Retrieve (vector + graph) → Rerank → Final Results
Without a reranker, results are ranked by embedding similarity and graph scores. With a reranker, a cross-encoder or LLM evaluates each result against the query for more accurate scoring.
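The rerank stage of that pipeline can be sketched as follows. This is a minimal illustration, not FlowRAG's implementation: the `Candidate` type, the `RelevanceScorer` signature, and the `rerank` and `overlapScorer` functions are hypothetical names, and the term-overlap scorer is only a toy stand-in for a real cross-encoder or LLM.

```typescript
// Hypothetical types for illustration; these are not FlowRAG's actual interfaces.
interface Candidate {
  id: string;
  text: string;
  score: number; // first-stage score (embedding similarity or graph score)
}

// Scores one (query, text) pair; async because real models run off-thread or remotely.
type RelevanceScorer = (query: string, text: string) => Promise<number>;

// Re-score every candidate against the query, then sort by the new score.
async function rerank(
  query: string,
  candidates: Candidate[],
  scorer: RelevanceScorer,
  topK: number = candidates.length,
): Promise<Candidate[]> {
  const rescored = await Promise.all(
    candidates.map(async (c) => ({ ...c, score: await scorer(query, c.text) })),
  );
  return rescored.sort((a, b) => b.score - a.score).slice(0, topK);
}

// Toy stand-in for a cross-encoder: fraction of query terms present in the text.
const overlapScorer: RelevanceScorer = async (query, text) => {
  const terms = query.toLowerCase().split(/\s+/).filter(Boolean);
  const words = new Set(text.toLowerCase().split(/\W+/));
  const hits = terms.filter((t) => words.has(t)).length;
  return terms.length === 0 ? 0 : hits / terms.length;
};
```

The point of the sketch is that the first-stage score is replaced, not blended: a candidate that ranked low on embedding similarity can move to the top if the scorer judges it a better match for the query.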
Setup
Add a reranker to your FlowRAG config:
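import { LocalReranker } from '@flowrag/provider-local';
// createFlowRAG, createLocalStorage, and schema come from your existing FlowRAG setup.
const rag = createFlowRAG({
  schema,
  ...createLocalStorage('./data'),
  reranker: new LocalReranker(), // re-scores retrieved results before they are returned
});
The reranker is applied automatically to all query modes (hybrid, local, global, naive).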
Implementations
Local (ONNX)
Runs entirely offline using a cross-encoder model:
import { LocalReranker } from '@flowrag/provider-local';
const reranker = new LocalReranker();
// Uses Xenova/ms-marco-MiniLM-L-6-v2 via ONNX
Pros: No API key, no network, fast, free
Cons: Less accurate than larger models
Gemini
LLM-based relevance scoring:
import { GeminiReranker } from '@flowrag/provider-gemini';
const reranker = new GeminiReranker();
// Requires GEMINI_API_KEY
Pros: High accuracy, understands nuance
Cons: Requires API key, higher latency, cost per query
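To make "LLM-based relevance scoring" concrete, here is a rough sketch of how such a reranker might phrase its request and read the reply. It is an assumption-laden illustration only: `buildRelevancePrompt` and `parseScores` are hypothetical helpers, and the prompt wording, 0 to 10 scale, and JSON reply format are choices made for this example, not a description of what `GeminiReranker` actually does.

```typescript
// Hypothetical helpers; GeminiReranker's real prompt and parsing may differ.

// Build a prompt asking the model to rate each passage from 0 to 10.
function buildRelevancePrompt(query: string, passages: string[]): string {
  const numbered = passages.map((p, i) => `[${i}] ${p}`).join("\n");
  return [
    "Rate how relevant each passage is to the query on a scale from 0 to 10.",
    "Answer with a JSON array of numbers only, one per passage, in order.",
    `Query: ${query}`,
    "Passages:",
    numbered,
  ].join("\n");
}

// Parse the model's reply defensively: LLM output is not guaranteed to be valid.
function parseScores(reply: string, expected: number): number[] {
  const zeros = Array.from({ length: expected }, () => 0);
  const match = reply.match(/\[[^\]]*\]/); // first JSON-looking array in the reply
  if (!match) return zeros;
  let parsed: unknown;
  try {
    parsed = JSON.parse(match[0]);
  } catch {
    return zeros;
  }
  if (!Array.isArray(parsed) || parsed.length !== expected) return zeros;
  // Clamp to [0, 10] and normalise to [0, 1]; non-numeric entries become 0.
  return parsed.map((v) =>
    typeof v === "number" && Number.isFinite(v) ? Math.min(10, Math.max(0, v)) / 10 : 0,
  );
}
```

Each call of this kind is a full model round trip, which is where the higher latency and per-query cost of an LLM-based reranker come from.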
AWS Bedrock
Amazon Rerank API:
import { BedrockReranker } from '@flowrag/provider-bedrock';
const reranker = new BedrockReranker();
// Uses amazon.rerank-v1:0
Pros: Managed service, good accuracy, AWS integration
Cons: Requires AWS credentials, cost per query
When to Use a Reranker
- Use it when result quality matters more than latency
- Skip it for simple lookups or when speed is critical
- Start without one and add it if results aren't relevant enough
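One way to judge whether results are "relevant enough" is to label the relevant documents for a handful of queries and compare precision@k with and without a reranker. The sketch below assumes you can collect the ranked result IDs from each run yourself; `precisionAtK` is a hypothetical helper, and the document IDs are made up for illustration.

```typescript
// Fraction of the top-k results that appear in the set of known-relevant IDs.
function precisionAtK(rankedIds: string[], relevantIds: Set<string>, k: number): number {
  if (k <= 0) return 0;
  const hits = rankedIds.slice(0, k).filter((id) => relevantIds.has(id)).length;
  return hits / k;
}

// Made-up example: two relevant documents, ranked differently by each run.
const relevant = new Set(["doc-2", "doc-5"]);
const withoutReranker = ["doc-1", "doc-2", "doc-3", "doc-5"];
const withReranker = ["doc-2", "doc-5", "doc-1", "doc-3"];

console.log(precisionAtK(withoutReranker, relevant, 2)); // 0.5
console.log(precisionAtK(withReranker, relevant, 2)); // 1
```

If the gain on your own queries is small, the added latency of a reranker may not be worth it.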