Strategies
Lightweight Fast Mode
Mechanism: Pure Keyword Search (BM25).
Use Case: Real-time typing suggestions, low-latency autocomplete.
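As a rough illustration of what a pure BM25 lookup involves, here is a minimal, dependency-free sketch. The `BM25Index` class, the whitespace tokenizer, and the `k1`/`b` defaults are assumptions made for this example, not the project's actual implementation.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive lowercase whitespace tokenizer (an assumption for this sketch).
    return text.lower().split()


class BM25Index:
    """Hypothetical in-memory Okapi BM25 index over a small list of documents."""

    def __init__(self, docs, k1=1.5, b=0.75):
        self.k1, self.b = k1, b
        self.docs = [tokenize(d) for d in docs]
        self.tf = [Counter(d) for d in self.docs]
        n = len(self.docs)
        self.avgdl = sum(len(d) for d in self.docs) / max(n, 1)
        # Document frequency: number of documents containing each term.
        df = Counter(term for d in self.docs for term in set(d))
        # IDF variant with +1 inside the log, so weights never go negative.
        self.idf = {t: math.log(1 + (n - f + 0.5) / (f + 0.5)) for t, f in df.items()}

    def score(self, query, i):
        tf, dl = self.tf[i], len(self.docs[i])
        total = 0.0
        for term in tokenize(query):
            f = tf.get(term, 0)
            if f == 0:
                continue
            # Term-frequency saturation with document-length normalization.
            norm = f + self.k1 * (1 - self.b + self.b * dl / self.avgdl)
            total += self.idf[term] * f * (self.k1 + 1) / norm
        return total

    def search(self, query, top_k=3):
        # Return indices of matching documents, best score first.
        scored = [(self.score(query, i), i) for i in range(len(self.docs))]
        hits = [(s, i) for s, i in scored if s > 0]
        hits.sort(key=lambda pair: (-pair[0], pair[1]))
        return [i for _, i in hits[:top_k]]
```

Because scoring needs only term counts and a precomputed IDF table, no model call sits on the request path, which is what makes this mode suitable for per-keystroke suggestions.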
Agentic Multi-Round
Mechanism: LLM-driven query generation + Multi-step search.
Use Case: Complex reasoning, answering “Why” questions.
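The multi-round idea can be sketched as a loop in which a query generator sees what has been retrieved so far and proposes follow-up searches. In this sketch, `propose_queries` stands in for the LLM call and `search` for any retrieval backend; both names, the `max_rounds` limit, and the stopping rules are hypothetical.

```python
def multi_round_retrieve(question, propose_queries, search, max_rounds=3):
    """Run up to max_rounds of query generation and search.

    propose_queries(question, memories) -> list of query strings (LLM stand-in).
    search(query) -> list of retrieved items (any hashable values).
    """
    collected = []   # retrieved items, in discovery order
    seen = set()     # items already collected
    asked = set()    # queries already issued

    for _ in range(max_rounds):
        # Ask the generator for new queries, given everything found so far.
        queries = [q for q in propose_queries(question, collected) if q not in asked]
        if not queries:
            break  # nothing new to ask

        found_new = False
        for query in queries:
            asked.add(query)
            for hit in search(query):
                if hit not in seen:
                    seen.add(hit)
                    collected.append(hit)
                    found_new = True

        if not found_new:
            break  # this round added no new evidence

    return collected
```

Each round costs at least one generator call plus one or more searches, so this mode trades latency for the ability to chain evidence, which is why it fits "Why" questions better than autocomplete.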
The Retrieval Pipeline
- Query Analysis: The system analyzes the user’s input to determine intent.
- Query Expansion: Generates multiple search queries, in both keyword and vector form, to broaden coverage.
- Hybrid Search: Executes BM25 and Vector search in parallel.
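- RRF Fusion: Merges the BM25 and vector result lists using Reciprocal Rank Fusion, which scores each item by summing 1 / (k + rank) over every list it appears in (k is a smoothing constant; 60 is a common default).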
- Reranking: Rescores the fused list against the query and keeps only the top candidates.
- Reasoning: The LLM uses the retrieved memories to formulate the final response.
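The Hybrid Search and RRF Fusion steps above can be sketched as follows. This is a minimal example under stated assumptions: `bm25_search` and `vector_search` are hypothetical callables that each return a ranked list of document IDs, and `k=60` is the conventional RRF constant rather than a value taken from this project.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def rrf_fuse(rankings, k=60):
    """Fuse ranked lists of document IDs with Reciprocal Rank Fusion."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Each list contributes 1 / (k + rank) for the documents it contains.
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


def hybrid_search(query, bm25_search, vector_search, k=60, top_n=5):
    """Run both retrievers in parallel and fuse their rankings.

    bm25_search and vector_search are hypothetical callables: query -> list of IDs.
    """
    with ThreadPoolExecutor(max_workers=2) as pool:
        bm25_future = pool.submit(bm25_search, query)
        vector_future = pool.submit(vector_search, query)
        rankings = [bm25_future.result(), vector_future.result()]
    return rrf_fuse(rankings, k=k)[:top_n]
```

RRF works on ranks rather than raw scores, so the BM25 and vector scores never need to be normalized onto a common scale. The fused list would then be passed to the reranking step.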