Architecture

EverMemOS operates through a cognitive loop involving two main tracks: Memory Construction and Memory Perception.

The architecture is designed to turn raw conversation streams into structured, retrievable knowledge.

The Memory Construction layer ingests data and organizes it into meaningful units.

Ingestion

Raw messages enter the system through the API or data pipeline.

Boundary Detection

The system identifies shifts in topics or contexts to segment the conversation.
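As a rough illustration of boundary detection, the sketch below starts a new segment whenever a message shares too little vocabulary with the segment so far. This is not EverMemOS's actual detector, which the text does not specify: the function names, the Jaccard word-overlap measure, and the `threshold` value are all assumptions chosen for illustration. A production system would more plausibly use a model or embeddings.

```python
import re

def tokenize(text: str) -> set[str]:
    """Lowercased word set; a crude stand-in for a real topic representation."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def segment(messages: list[str], threshold: float = 0.2) -> list[list[str]]:
    """Split messages into segments at points of low vocabulary overlap.

    ASSUMPTION: Jaccard overlap and the 0.2 threshold are illustrative only.
    """
    segments: list[list[str]] = []
    vocab: set[str] = set()  # words seen in the current segment
    for msg in messages:
        words = tokenize(msg)
        union = words | vocab
        overlap = len(words & vocab) / len(union) if union else 0.0
        if segments and overlap >= threshold:
            # Same topic: extend the current segment.
            segments[-1].append(msg)
            vocab |= words
        else:
            # Topic shift (or first message): open a new segment.
            segments.append([msg])
            vocab = set(words)
    return segments

if __name__ == "__main__":
    chat = [
        "Let's plan the trip to Paris",
        "The Paris trip needs a hotel",
        "Did you fix the login bug",
        "The login bug is in the auth module",
    ]
    for seg in segment(chat):
        print(seg)
```

With the four sample messages, the travel messages and the bug messages end up in separate segments, because the third message shares only the word "the" with the first two.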

Extraction

Specialized prompts and models extract MemCells (Atomic Memory Units).

Consolidation

MemCells are integrated by theme and participants to form episodes and profiles.
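To make the relationship between MemCells and episodes concrete, here is a minimal sketch that groups cells sharing a theme and a participant set. The `MemCell` fields and the `consolidate` function are hypothetical. The text does not describe the real schema, and profile construction is left out entirely.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass(frozen=True)
class MemCell:
    """Hypothetical atomic memory unit; field names are illustrative only."""
    text: str
    theme: str
    participants: frozenset[str]

EpisodeKey = tuple[str, frozenset[str]]

def consolidate(cells: list[MemCell]) -> dict[EpisodeKey, list[MemCell]]:
    """Group cells with the same theme and participant set into one episode."""
    episodes: dict[EpisodeKey, list[MemCell]] = defaultdict(list)
    for cell in cells:
        episodes[(cell.theme, cell.participants)].append(cell)
    return dict(episodes)

if __name__ == "__main__":
    pair = frozenset({"alice", "bob"})
    cells = [
        MemCell("Alice wants to visit Paris in May", "travel", pair),
        MemCell("Bob prefers a hotel near the river", "travel", pair),
        MemCell("Bob fixed the login bug", "work", frozenset({"bob"})),
    ]
    for key, members in consolidate(cells).items():
        print(key[0], sorted(key[1]), len(members))
```

Keying on the exact participant set is the simplest possible choice. A real consolidation step would likely tolerate partial overlap in participants and fuzzier theme matching.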

Indexing

Data is stored with both keyword (lexical) and semantic (vector) indices for robust retrieval.
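The dual-index idea can be sketched as a single store that maintains an inverted index for keyword lookup alongside one vector per memory for similarity search. Everything here is an assumption for illustration: the `DualIndex` class is hypothetical, and a plain bag-of-words count vector stands in for the learned embedding a real semantic index would use.

```python
import math
import re
from collections import Counter, defaultdict

def tokens(text: str) -> list[str]:
    return re.findall(r"[a-z0-9]+", text.lower())

def embed(text: str) -> Counter:
    # ASSUMPTION: word counts are a placeholder for a real embedding model.
    return Counter(tokens(text))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0

class DualIndex:
    """Hypothetical store keeping a lexical and a vector index in sync."""

    def __init__(self) -> None:
        self.postings: dict[str, set[str]] = defaultdict(set)  # term -> ids
        self.vectors: dict[str, Counter] = {}                  # id -> vector

    def add(self, doc_id: str, text: str) -> None:
        for tok in tokens(text):
            self.postings[tok].add(doc_id)
        self.vectors[doc_id] = embed(text)

    def keyword_search(self, query: str) -> set[str]:
        """Ids of memories containing every query term."""
        hits = [self.postings.get(tok, set()) for tok in tokens(query)]
        return set.intersection(*hits) if hits else set()

    def vector_search(self, query: str, k: int = 3) -> list[str]:
        """Top-k ids by cosine similarity to the query vector."""
        q = embed(query)
        ranked = sorted(
            self.vectors, key=lambda d: cosine(q, self.vectors[d]), reverse=True
        )
        return ranked[:k]
```

Writing to both indices in one `add` call keeps them consistent, so that later retrieval can query either side, or both, for the same set of memories.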

The Memory Perception layer handles how the agent “remembers” and uses information.

Intelligent Retrieval Tools:
- Hybrid Retrieval (RRF): Combines keyword (BM25) and vector search using Reciprocal Rank Fusion.
- Reranking: A specialized model reorders candidate memories to ensure relevance.
Flexible Strategies:
- Lightweight Fast Mode: Optimized keyword search for low latency.
- Agentic Multi-Round: Generates clarifying questions for complex queries.
Reasoning Fusion:
- Recalled memories are fused with the current conversation context to reduce hallucinations.
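The Hybrid Retrieval item above names Reciprocal Rank Fusion, which merges several ranked lists by giving each document a score of 1 / (k + rank) per list and summing. Below is a minimal sketch of that formula. The function name `rrf` is hypothetical, and `k = 60` is a commonly used default rather than a value confirmed for EverMemOS.

```python
from collections import defaultdict

def rrf(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse ranked id lists with Reciprocal Rank Fusion.

    ASSUMPTION: k=60 is a conventional default, not an EverMemOS setting.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))

if __name__ == "__main__":
    bm25_hits = ["a", "b", "c"]    # keyword ranking
    vector_hits = ["b", "d", "a"]  # vector ranking
    print(rrf([bm25_hits, vector_hits]))  # ['b', 'a', 'd', 'c']
```

Because RRF uses only ranks, it needs no score normalization between BM25 and vector similarity, whose raw scores are on different scales. In this example `b` wins because it ranks highly in both lists, and the fused list would then be the candidate set handed to a reranker.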