# EverOS — Full Reference

> EverOS is the Memory Operating System for Agentic AI. It gives LLM agents persistent, structured memory that extracts knowledge from conversations and multimodal data, resolves contradictions, and retrieves context intelligently — so agents remember, learn, and evolve across sessions.

For a concise overview, see llms.txt in the same directory.

- Docs: https://docs.evermind.ai
- Dashboard & API keys: https://everos.evermind.ai
- GitHub (open source): https://github.com/EverMind-AI/EverOS
- Cloud Python SDK: pip install everos-cloud
- OSS Python package: pip install everos
- API base URL: https://api.evermind.ai
- Auth: Bearer token in Authorization header
- Contact: contact@evermind.ai | Discord: https://discord.gg/geHdX4F24B
- Research paper: https://arxiv.org/pdf/2601.02163

For the TL;DR, recommended agent loop, and "when to use" guidance, see llms.txt.

## Table of Contents

- Core Concepts
  - Memory Lifecycle
  - MemCell
  - MemScene
  - Memory Types
  - Reconstructive Recollection
- API Reference (v1)
  - Authentication
  - Memories: Add, Flush, Get, Search, Delete
  - Groups
  - Senders
  - Tasks
  - Storage (Multimodal Upload)
  - Settings
  - Filters DSL
  - Error Handling
- Retrieval Methods
  - Keyword, Vector, Hybrid, Agentic
  - Choosing a Method
  - Agentic Retrieval Deep Dive
- Multimodal Memory
- Foresight Memory
- Python SDK
  - Sync Client
  - Async Client
  - Group Memories
  - Agent Memories
- Cookbook Patterns
  - Personal Assistant
  - Team Collaboration
  - Customer Support
  - AI Tutor
  - Batch Processing
- Open Source Deployment
- EverOS vs Standard RAG
- Benchmarks & FAQ

---

## Core Concepts

### Memory Lifecycle

EverOS reimagines memory as a dynamic, living lifecycle inspired by biological engram formation:

1. **Episodic Trace Formation (Encoding)** — The system monitors dialogue streams and uses semantic boundary detection to segment interactions into coherent events. Instead of storing raw logs, it creates discrete MemCells — like remembering a "dinner party" as a distinct event rather than a second-by-second transcript.
2. **Semantic Consolidation (Storage)** — In the background, the system analyzes new MemCells, links them to existing knowledge, updates the User Profile, and clusters related memories into MemScenes. This transforms transient episodes into stable, long-term wisdom. It resolves contradictions (e.g., "My dog is 3" vs "My dog turned 4") and merges redundancies.
3. **Reconstructive Recollection (Retrieval)** — When the agent needs context, it doesn't just keyword-search. It identifies which MemScene is relevant, traverses connections to find specific MemCells, and reconstructs the exact context needed — filtering noise and prioritizing relevance.

### MemCell

The atomic unit of memory. A structured tuple: M = (Episode, Atomic Facts, Foresight, Metadata).

- **Episode (E)**: Narrative summary of what happened — captures flow, user intent, and causal logic.
- **Atomic Facts (A)**: Discrete, verifiable statements (e.g., "User likes spicy food", "Budget approval needed by Friday").
- **Foresight (F)**: Forward-looking inferences with validity intervals (e.g., "User will be in Paris" valid from tomorrow to next week).
- **Metadata (T)**: Timestamps, source, confidence, emotional valence.

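The tuple above can be pictured as a small data structure. The sketch below is illustrative only: the class names `MemCell` and `Foresight` and the method `is_active` are hypothetical and are not part of the EverOS SDK. It shows one way a Foresight validity interval could be checked against a point in time.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Foresight:
    """Forward-looking inference with a validity interval (hypothetical shape)."""
    prediction: str
    valid_after: datetime
    valid_until: datetime

    def is_active(self, at: datetime) -> bool:
        # Active when `at` falls inside the closed validity interval.
        return self.valid_after <= at <= self.valid_until


@dataclass
class MemCell:
    """M = (Episode, Atomic Facts, Foresight, Metadata)."""
    episode: str
    atomic_facts: list[str]
    foresight: Optional[Foresight] = None
    metadata: dict = field(default_factory=dict)


cell = MemCell(
    episode="The user discussed travel plans for next week.",
    atomic_facts=["User will be in Paris"],
    foresight=Foresight(
        prediction="User will be in Paris",
        valid_after=datetime(2025, 10, 10, tzinfo=timezone.utc),
        valid_until=datetime(2025, 10, 17, tzinfo=timezone.utc),
    ),
    metadata={"source": "chat"},
)
```

Keeping the interval on the Foresight entry itself means a retrieval layer can discard expired predictions without touching the episode or the atomic facts.
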
Example MemCell structure:

```json
{
  "memcell_id": "mc_123456789",
  "episode": "The user discussed plans for the Q3 marketing campaign, emphasizing social media channels.",
  "atomic_facts": [
    "User role is Marketing Manager",
    "Q3 campaign focus is Social Media",
    "Budget approval needed by Friday"
  ],
  "foresight": {
    "prediction": "User will submit budget proposal",
    "valid_after": "2025-10-10T09:00:00Z",
    "valid_until": "2025-10-13T17:00:00Z"
  },
  "metadata": {
    "created_at": "2025-10-09T14:30:00Z",
    "source": "slack_integration"
  }
}
```

MemCells are created automatically through: segmentation (boundary detection) -> extraction (LLM parses narrative + facts) -> inference (future implications) -> packaging (unique ID assigned).

### MemScene

A thematic cluster of related MemCells representing a specific context (e.g., "Python Project X", "Job Interview Prep", "Personal Hobbies"). MemScenes are created and maintained through Semantic Consolidation, which runs asynchronously.

The consolidation process:

1. **Clustering** — New MemCells are matched to existing MemScenes by semantic embedding, or a new cluster is created for novel topics.
2. **Synthesis** — Within a MemScene, redundancies are merged and contradictions resolved.
3. **User Persona Update** — Key traits are promoted to the global User Profile.

MemScenes solve the context window problem by providing pre-digested, structured views of topics. Instead of reading 500 pages of chat logs, the agent reads the synthesized MemScene.

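The clustering step can be pictured as nearest-centroid assignment over embeddings. The following is a toy sketch and not the EverOS implementation: the function `assign_to_scene`, the 0.8 similarity threshold, and the two-dimensional vectors are all assumptions chosen for illustration.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def assign_to_scene(embedding: list[float], scenes: list[dict], threshold: float = 0.8) -> int:
    """Return the index of the best-matching scene, creating one if none is close enough.

    Each scene is a dict: {"centroid": [...], "members": [[...], ...]}.
    """
    best_idx, best_sim = None, -1.0
    for idx, scene in enumerate(scenes):
        sim = cosine(embedding, scene["centroid"])
        if sim > best_sim:
            best_idx, best_sim = idx, sim

    if best_idx is None or best_sim < threshold:
        # Novel topic: start a new cluster seeded with this embedding.
        scenes.append({"centroid": list(embedding), "members": [list(embedding)]})
        return len(scenes) - 1

    # Existing topic: add the member and recompute the centroid as the mean.
    scene = scenes[best_idx]
    scene["members"].append(list(embedding))
    n = len(scene["members"])
    scene["centroid"] = [sum(m[i] for m in scene["members"]) / n for i in range(len(embedding))]
    return best_idx


scenes: list[dict] = []
first = assign_to_scene([1.0, 0.0], scenes)   # no scenes yet, so a new one is created
second = assign_to_scene([0.9, 0.1], scenes)  # close to the first centroid, joins it
third = assign_to_scene([0.0, 1.0], scenes)   # orthogonal to it, so a new scene
```

A real system would use high-dimensional embeddings from a model and would likely tune the threshold per workload; the structure of the decision (join the closest cluster or open a new one) is the point here.
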
### Memory Types

| Type | API value | Description | Supported scenes |
|------|-----------|-------------|-----------------|
| Episode | `episodic_memory` | Narrative summaries of conversations capturing flow and decisions | All |
| Profile | `profile` | Persistent user attributes, preferences, and traits | All |
| Foresight | `foresight` | Time-bounded prospective memories (reminders, deadlines) | Single-user only (not available in group_chat) |
| EventLog | `eventlog` | Atomic factual event records with timestamps | Single-user only (not available in group_chat) |
| Agent Case | `agent_case` | Task intent, step-by-step approach, and quality score from agent trajectories | All |
| Agent Skill | `agent_skill` | Generalized skills distilled from multiple agent cases | All |
| Agent Memory | `agent_memory` | Agent cases + skills combined — what the agent has learned about itself (search `memory_types` filter only) | All |

Mental model for choosing:

- "Who am I?" -> Profile
- "What happened last week?" -> Episode
- "What was the exact file name?" -> EventLog
- "What's next?" -> Foresight
- "What has the agent learned?" -> Agent Memory (search only)

### Reconstructive Recollection

Traditional search is passive (query -> matching documents). Reconstructive Recollection is active: the agent analyzes current intent and rebuilds the necessary context.

The process:

1. **Intent Analysis** — Determines what the user wants (task, topic, time constraints).
2. **Scene Activation** — Loads the relevant MemScene.
3. **Context Synthesis** — Pulls specific facts, decisions, and action items while ignoring irrelevant content (small talk, off-topic).

This is adaptive:

- **Factual queries** ("What is my API key?") -> precise Atomic Facts lookup
- **Creative tasks** ("Brainstorm ideas") -> broader episodic narratives
- **Reasoning tasks** ("Why did we switch databases?") -> causal chain across multiple MemCells

---

## API Reference (v1)

Base URL: `https://api.evermind.ai`

All endpoints use JSON. The v0 API is deprecated — use v1 only.

### Authentication

```
Authorization: Bearer <YOUR_API_KEY>
```

Obtain keys from https://everos.evermind.ai/api-keys. Never commit keys to version control.

### Add Personal Memories

`POST /api/v1/memories`

Adds messages for a single user. Messages are queued for async processing by default.

**Request body:**

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `user_id` | string | yes | Owner user ID |
| `session_id` | string | no | Session identifier |
| `messages` | array | yes | 1-500 message items |
| `async_mode` | boolean | no | Default true. False for synchronous processing |

**Message item:**

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `role` | string | yes | "user" or "assistant" |
| `timestamp` | integer | yes | Unix milliseconds |
| `content` | string or array | yes | Text string, or array of ContentItems for multimodal |

```python
from everos_cloud import EverOS
import time

client = EverOS()
memories = client.v1.memories

response = memories.add(
    user_id="user_001",
    session_id="session_001",
    messages=[
        {"role": "user", "timestamp": int(time.time() * 1000), "content": "I prefer morning meetings before 10am."},
    ],
)
print(f"status={response.data.status} task_id={response.data.task_id}")
```

Response (async mode, 202):

```json
{"data": {"task_id": "abc123", "status": "queued", "message_count": 1, "message": "Message accepted and queued for processing"}}
```

Response (sync mode, 200):

```json
{"data": {"task_id": "", "status": "accumulated", "message_count": 1, "message": "Messages accepted"}}
```

### Add Group Memories

`POST /api/v1/memories/group`

Adds messages for a multi-participant group. Each message must include `sender_id`.

**Request body:**

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `group_id` | string | yes | Group identifier |
| `group_meta` | object | no | Group metadata |
| `messages` | array | yes | 1-500 group message items |
| `async_mode` | boolean | no | Default true |

**Group message item** (extends message item):

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `sender_id` | string | yes | Sender identifier |
| `sender_name` | string | no | Display name |
| `message_id` | string | no | Unique message ID |

```python
group_mem = client.v1.memories.group

group_mem.add(
    group_id="team_standup",
    messages=[
        {"role": "user", "sender_id": "alice", "sender_name": "Alice", "timestamp": 1711900000000, "content": "Let's discuss the sprint."},
        {"role": "user", "sender_id": "bob", "sender_name": "Bob", "timestamp": 1711900060000, "content": "I finished the API refactor."},
    ],
)
```

### Add Agent Memories

`POST /api/v1/memories/agent`

Adds agent trajectory messages. Supports roles: "user", "assistant", "tool".

**Request body:**

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `user_id` | string | yes | Owner user ID |
| `session_id` | string | no | Session identifier |
| `messages` | array | yes | 1-500 agent message items |
| `async_mode` | boolean | no | Default true |

**Agent message item:**

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `role` | string | yes | "user", "assistant", or "tool" |
| `timestamp` | integer | yes | Unix milliseconds |
| `content` | string/array/null | conditional | Content (not required for assistant with tool_calls) |
| `tool_calls` | array | no | Tool calls made by assistant (OpenAI format) |
| `tool_call_id` | string | conditional | Required when role is "tool" |

```python
agent = client.v1.memories.agent

agent.add(
    user_id="user_001",
    session_id="coding_session_001",
    messages=[
        {"role": "user", "timestamp": 1711900000000, "content": "Find all Python files with TODO comments"},
        {"role": "assistant", "timestamp": 1711900001000, "tool_calls": [
            {"id": "call_1", "type": "function", "function": {"name": "grep", "arguments": "{\"pattern\": \"TODO\", \"glob\": \"**/*.py\"}"}}
        ]},
        {"role": "tool", "timestamp": 1711900002000, "tool_call_id": "call_1", "content": "Found 3 files with TODO comments..."},
        {"role": "assistant", "timestamp": 1711900003000, "content": "I found 3 Python files with TODO comments."},
    ],
)
```

### Flush Memories

Triggers boundary detection on accumulated messages. If a boundary is detected, memory extraction runs immediately.

| Endpoint | Required params |
|----------|----------------|
| `POST /api/v1/memories/flush` | `user_id`, optional `session_id` |
| `POST /api/v1/memories/group/flush` | `group_id` |
| `POST /api/v1/memories/agent/flush` | `user_id`, optional `session_id` |

```python
memories.flush(user_id="user_001", session_id="session_001")
```

Response:

```json
{"data": {"status": "extracted", "message": "Flush completed"}}
```

Status values: `"extracted"` (extraction triggered) or `"no_extraction"` (no boundary detected).

### Get Memories

`POST /api/v1/memories/get`

Retrieves structured memories with filters and pagination.

**Request body:**

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `memory_type` | string | yes | `episodic_memory`, `profile`, `agent_case`, or `agent_skill` |
| `filters` | object | yes | Must contain `user_id` or `group_id` (see Filters DSL) |
| `page` | integer | no | Page number, starts at 1 (default 1) |
| `page_size` | integer | no | Items per page, 1-100 (default 20) |
| `rank_by` | string | no | Sort field (default "timestamp") |
| `rank_order` | string | no | "asc" or "desc" (default "desc") |

```python
response = memories.get(
    filters={"user_id": "user_001"},
    memory_type="episodic_memory",
    page=1,
    page_size=10,
)
episodes = response.data.episodes  # list of EpisodeItem
total = response.data.total_count
```

**Response fields by memory_type:**

`episodic_memory` returns `episodes[]` with: `id`, `user_id`, `group_id`, `session_id`, `timestamp`, `participants`, `sender_ids`, `summary`, `subject`, `episode`, `type`, `parent_type`, `parent_id`.

`profile` returns `profiles[]` with: `id`, `user_id`, `group_id`, `profile_data` (contains `explicit_info` and `implicit_traits`), `scenario`, `memcell_count`.

`agent_case` returns `agent_cases[]` with: `id`, `user_id`, `session_id`, `task_intent`, `approach`, `quality_score` (0.0-1.0), `timestamp`.

`agent_skill` returns `agent_skills[]` with: `id`, `user_id`, `cluster_id`, `name`, `description`, `content`, `confidence` (0.0-1.0), `maturity_score` (0.0-1.0), `source_case_ids[]`.

All responses include `total_count` and `count`.

### Search Memories

`POST /api/v1/memories/search`

Searches memories using keyword, vector, hybrid, or agentic retrieval.

**Recommended defaults:** `method="hybrid"`, `top_k=5` for chat contexts or `top_k=10` for research/analysis. Always include `user_id` in filters unless querying group-level data.

**Request body:**

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `query` | string | yes | Search query text |
| `filters` | object | yes | Always include `user_id` or `group_id` |
| `method` | string | no | `keyword`, `vector`, `hybrid` (default, recommended), or `agentic` |
| `memory_types` | array | no | Default `["episodic_memory", "profile"]`. Options: `episodic_memory`, `profile`, `raw_message`, `agent_memory` (agent cases + skills; what the agent has learned about itself) |
| `top_k` | integer | no | Max results. Recommended: **5** for chat contexts, **10** for research/analysis. Pass `-1` to let the server apply an automatic distance cutoff (returns up to 100) |
| `radius` | float | no | Cosine similarity threshold 0.0-1.0 for vector methods |
| `include_original_data` | boolean | no | Return original data (default false) |

```python
response = memories.search(
    filters={"user_id": "user_001"},
    query="coffee preference",
    method="hybrid",
    memory_types=["episodic_memory", "profile"],
    top_k=10,
)
```

### Delete Memories

`POST /api/v1/memories/delete`

Two mutually exclusive modes:

**Single delete** — provide `memory_id` only:

```json
{"memory_id": "67c8a1b2f3e4d5c6a7b8c9d0"}
```

**Batch delete** — provide at least one of `user_id` or `group_id`:

```json
{"user_id": "user_001", "session_id": "session_001"}
```

Filter values use three-state logic: `"__all__"` (skip/ignore), `null` or `""` (match empty), or a string value (exact match). Default for all filters is `"__all__"`.

### Groups

| Endpoint | Description |
|----------|-------------|
| `POST /api/v1/groups` | Create group (upsert). Required: `group_id`. Optional: `name`, `description` |
| `GET /api/v1/groups/{group_id}` | Get group details |
| `PATCH /api/v1/groups/{group_id}` | Update group. At least one of `name` or `description` required |

```python
groups = client.v1.groups
groups.create(group_id="team_eng", name="Engineering", description="Engineering team channel")
```

### Senders

| Endpoint | Description |
|----------|-------------|
| `POST /api/v1/senders` | Create sender (upsert). Required: `sender_id`. Optional: `name`, `role`, `metadata` |
| `GET /api/v1/senders/{sender_id}` | Get sender details |
| `PATCH /api/v1/senders/{sender_id}` | Update sender display name |

```python
senders = client.v1.senders
senders.create(sender_id="user_alice", name="Alice")
```

### Tasks

`GET /api/v1/tasks/{task_id}`

Check async task status. Task status is stored in Redis with a 1-hour TTL.

Response:

```json
{"data": {"task_id": "abc123", "status": "success"}}
```

Status values: `"processing"`, `"success"`, `"failed"`.

### Storage (Multimodal Upload)

`POST /api/v1/object/sign`

Generate pre-signed S3 upload URLs. Upload signature valid for 15 minutes, download valid for 7 days.

```json
{
  "objectList": [
    {"fileId": "img_001", "fileName": "whiteboard.png", "fileType": "image"},
    {"fileId": "doc_001", "fileName": "report.pdf", "fileType": "file"},
    {"fileId": "vid_001", "fileName": "recording.mp4", "fileType": "video"}
  ]
}
```

Response returns `objectKey` and `objectSignedInfo` (url + fields) for each file.

Upload flow:

1. Call `/api/v1/object/sign` to get pre-signed URL and `objectKey`
2. POST the file to the S3 URL with the returned fields (returns 204)
3. Use `objectKey` as `uri` in message content array

### Settings

| Endpoint | Description |
|----------|-------------|
| `GET /api/v1/settings` | Get current memory space settings |
| `PUT /api/v1/settings` | Update settings (LLM providers, extraction behavior). Only provided fields are updated |

### Filters DSL

Used in get and search endpoints. **Always include `user_id`** unless you're querying group-level data (then use `group_id`). Omitting both will return a 422 error.

**Supported fields and operators:**

| Field | Operators | Notes |
|-------|-----------|-------|
| `user_id` | eq, in | Top-level, conditionally required |
| `group_id` | eq, in | Top-level, conditionally required |
| `session_id` | eq, in, gt, gte, lt, lte | Inside AND/OR combinators |
| `timestamp` | eq, gt, gte, lt, lte | Accepts epoch ms/s or ISO string |

**Operator syntax:** Plain value = eq. Object for other operators.

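Because filters are plain dictionaries, they can be assembled programmatically. The helper below is a sketch under stated assumptions: `build_filters` and its parameter names are hypothetical, not part of the SDK, and it covers only the fields and operators listed in the table above.

```python
from typing import Optional


def build_filters(
    user_id: Optional[str] = None,
    group_id: Optional[str] = None,
    session_ids: Optional[list[str]] = None,
    since_ms: Optional[int] = None,
    until_ms: Optional[int] = None,
) -> dict:
    """Build a Filters DSL dict; requires user_id or group_id, as the API does."""
    if not user_id and not group_id:
        raise ValueError("filters must contain user_id or group_id")

    filters: dict = {}
    if user_id:
        filters["user_id"] = user_id  # plain value means eq
    if group_id:
        filters["group_id"] = group_id

    conditions = []
    if session_ids:
        conditions.append({"session_id": {"in": list(session_ids)}})

    # Half-open time window: since_ms <= timestamp < until_ms.
    time_range = {}
    if since_ms is not None:
        time_range["gte"] = since_ms
    if until_ms is not None:
        time_range["lt"] = until_ms
    if time_range:
        conditions.append({"timestamp": time_range})

    if conditions:
        filters["AND"] = conditions
    return filters


example = build_filters(user_id="user_001", since_ms=1700000000000, until_ms=1710000000000)
```

Validating the `user_id`/`group_id` requirement on the client side surfaces the mistake before the request is sent, instead of as a 422 response.
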
Examples:

```json
// Simple user filter
{"user_id": "user_001"}

// User with time range
{"user_id": "user_001", "AND": [{"timestamp": {"gte": 1700000000000, "lt": 1710000000000}}]}

// Multiple sessions
{"user_id": "user_001", "AND": [{"session_id": {"in": ["s1", "s2"]}}]}

// OR combinator
{"user_id": "user_001", "OR": [{"session_id": "s1"}, {"session_id": "s2"}]}
```

### Error Handling

Errors return:

```json
{"code": "InvalidParameter", "message": "user_id: Field required", "request_id": "unknown", "timestamp": "2026-03-24T00:00:00+00:00", "path": "/api/v1/memories"}
```

| HTTP Status | Meaning |
|-------------|---------|
| 400 | Syntax error (malformed JSON, missing required fields, body too large) |
| 422 | Validation error (invalid values, business rule violations) |
| 500 | Internal server error |

Error codes: `InvalidParameter`, `NotFound`, `InternalError`.

---

## Retrieval Methods

### Keyword (BM25)

Fast lexical search. <100ms latency. Best for exact terms, known phrases, product names, IDs.

```python
memories.search(filters={"user_id": "u1"}, query="Project Phoenix", method="keyword", top_k=10)
```

### Vector (Semantic)

Embedding-based search. 200-500ms. Finds conceptually similar content even with different wording.

```python
memories.search(filters={"user_id": "u1"}, query="what are the user's hobbies", method="vector", top_k=10)
```

### Hybrid / RRF (Recommended Default)

Combines keyword + vector using Reciprocal Rank Fusion, then reranks. 200-600ms. Best balance of precision and recall.

```python
memories.search(filters={"user_id": "u1"}, query="morning routine and work preferences", method="hybrid", top_k=10)
```

### Agentic (LLM-Guided)

Uses an LLM to decompose complex queries into sub-queries, execute each via hybrid search, and aggregate results. 2-5s latency, 1-3 LLM calls, 3-5 search operations.

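Given that latency, a caller may want to degrade to `hybrid` when an agentic search fails or times out. The sketch below shows that pattern under stated assumptions: `search_with_fallback` and `fake_search` are hypothetical names, the search function is passed in as a callable (for example `client.v1.memories.search`), and the exception is caught broadly because the concrete error classes raised by the SDK are not specified in this reference.

```python
from typing import Any, Callable


def search_with_fallback(
    search: Callable[..., Any], *, query: str, filters: dict, top_k: int = 10
) -> tuple[Any, str]:
    """Try agentic retrieval first; fall back to hybrid on any error.

    Returns the search result together with the method that produced it.
    """
    try:
        return search(filters=filters, query=query, method="agentic", top_k=top_k), "agentic"
    except Exception:
        # Assumption: timeouts and API errors surface as exceptions.
        # Narrow this to the SDK's real error types in production code.
        return search(filters=filters, query=query, method="hybrid", top_k=top_k), "hybrid"


def fake_search(**kwargs: Any) -> dict:
    """Stand-in for the SDK call: the agentic method always times out."""
    if kwargs["method"] == "agentic":
        raise TimeoutError("agentic search timed out")
    return {"method": kwargs["method"], "query": kwargs["query"]}


result, used = search_with_fallback(
    fake_search, query="roadmap context", filters={"user_id": "u1"}
)
```

Returning the method alongside the result lets the caller log how often the fallback path is taken.
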
```python
memories.search(
    filters={"user_id": "u1"},
    query="What context would help me prepare for discussing the product roadmap with stakeholders?",
    method="agentic",
    memory_types=["episodic_memory", "profile"],
    top_k=15,
)
```

**Decision rule — use `agentic` ONLY if:**

- The query is complex or multi-step AND
- `hybrid` results are insufficient for the task

**Always fall back to `hybrid` on timeout or error.** Do not default to agentic — it costs 3-5x more and is 10x slower.

**Best practices for agentic:**

- Set longer timeouts (60s) — default HTTP timeouts will fail
- Implement hybrid fallback: try agentic, catch timeout, retry with hybrid
- Write detailed queries explaining what context you need (not just keywords)
- Filter `memory_types` to only what's needed to reduce search space
- Use `group_id` in filters to narrow scope when possible

### Choosing a Retrieval Method

| Scenario | Method |
|----------|--------|
| Real-time autocomplete | `keyword` |
| Chatbot memory retrieval | `hybrid` |
| Specific terms/IDs | `keyword` |
| "Tell me everything about X" | `agentic` |
| Semantic similarity | `vector` |
| Default / unsure | `hybrid` |

Decision flowchart:

```
Latency critical (<100ms)? -> keyword
Complex multi-part query?  -> agentic
Otherwise                  -> hybrid
```

---

## Multimodal Memory

Messages can include multimodal content by using an array of ContentItems instead of a plain string.

**Supported types:**

| Type | Formats | Max size |
|------|---------|----------|
| `image` | JPG, PNG, GIF, WebP | 10 MB |
| `doc` | DOC, TXT | 100 MB |
| `pdf` | PDF | 100 MB |
| `html` | HTML | 100 MB |
| `email` | Email | 100 MB |
| `audio` | MP3, WAV | 500 MB |
| `video` | MP4, WebM | 500 MB |

**ContentItem fields:**

| Field | Type | Description |
|-------|------|-------------|
| `type` | string | Required. One of: text, image, audio, doc, pdf, html, email |
| `text` | string | Content body (for type: text) |
| `uri` | string | objectKey from upload flow |
| `name` | string | File name |
| `ext` | string | File extension (png, mp3, pdf) |
| `source` | string | Content source (google_doc, notion, confluence, zoom) |
| `source_info` | object | Source-related traceability metadata |
| `extras` | object | Type-specific extra fields |

**Limits:** 10 non-text items per message, 300KB request body (files stored on S3).

**Upload via API (3 steps):**

```bash
# 1. Get pre-signed URL
curl -X POST https://api.evermind.ai/api/v1/object/sign \
  -H "Authorization: Bearer <YOUR_API_KEY>" \
  -H "Content-Type: application/json" \
  -d '{"objectList": [{"fileId": "img_001", "fileName": "whiteboard.png", "fileType": "image"}]}'
# Response includes objectKey and objectSignedInfo with url + fields

# 2. Upload file to S3 using returned fields (returns 204)

# 3. Use objectKey in message content
curl -X POST https://api.evermind.ai/api/v1/memories \
  -H "Authorization: Bearer <YOUR_API_KEY>" \
  -H "Content-Type: application/json" \
  -d '{
    "user_id": "user_001",
    "messages": [{
      "role": "user",
      "timestamp": 1711900000000,
      "content": [
        {"type": "text", "text": "Here is the meeting whiteboard"},
        {"type": "image", "uri": "<objectKey>", "name": "whiteboard.png", "ext": "png"}
      ]
    }]
  }'
```

**Upload via Python SDK (single step):**

The SDK auto-handles signing, uploading, and memory creation. Pass local paths or HTTP URLs directly:

```python
import time
from everos_cloud import EverOS

client = EverOS()
memories = client.v1.memories

now_ms = int(time.time() * 1000)
response = memories.add(
    user_id="user_001",
    messages=[{
        "role": "user",
        "timestamp": now_ms,
        "content": [
            {"type": "text", "text": "Meeting whiteboard photo"},
            {"type": "image", "uri": "./whiteboard.jpg", "name": "whiteboard.jpg", "ext": "jpg"},
        ],
    }],
)
```

Works with all add endpoints: `/api/v1/memories`, `/api/v1/memories/group`, `/api/v1/memories/agent`.

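Step 2 of the API upload flow (posting the file to S3 with the returned fields) is not spelled out above. The sketch below shows how such a pre-signed POST is usually encoded with the Python standard library. It assumes that `objectSignedInfo` is a dict with `url` and `fields` keys, as the Storage section describes; the function names and the sample field value are hypothetical.

```python
import urllib.request
import uuid


def encode_presigned_post(fields: dict[str, str], file_name: str, file_bytes: bytes) -> tuple[bytes, str]:
    """Encode a multipart/form-data body for an S3 pre-signed POST.

    S3 expects the signed form fields first and the file part, named "file", last.
    """
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            f'--{boundary}\r\nContent-Disposition: form-data; name="{name}"\r\n\r\n{value}\r\n'.encode()
        )
    parts.append(
        (
            f'--{boundary}\r\nContent-Disposition: form-data; name="file"; filename="{file_name}"\r\n'
            "Content-Type: application/octet-stream\r\n\r\n"
        ).encode()
        + file_bytes
        + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def upload_file(signed_info: dict, file_name: str, file_bytes: bytes) -> int:
    """POST the file to the pre-signed URL and return the HTTP status (204 expected)."""
    body, content_type = encode_presigned_post(signed_info["fields"], file_name, file_bytes)
    request = urllib.request.Request(
        signed_info["url"], data=body, headers={"Content-Type": content_type}, method="POST"
    )
    with urllib.request.urlopen(request) as response:
        return response.status


# Encoding only; no network call is made here.
body, content_type = encode_presigned_post(
    {"key": "uploads/whiteboard.png"}, "whiteboard.png", b"\x89PNG"
)
```

After a successful upload, the `objectKey` from the signing response is what goes into the `uri` field of the ContentItem.
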
---

## Foresight Memory

Time-bounded prospective memories. Only available in single-user conversations — not extracted from group_chat memory spaces.

When conversations mention future events, EverOS extracts foresight memories with `start_time` and `end_time` validity windows:

```
User: "Remind me to call John next Tuesday at 2pm"
-> Foresight: "Call John" (start: Tue 2pm, end: Tue 3pm)

User: "Submit report by Friday"
-> Foresight: "Submit report" (start: now, end: Fri 11:59pm)
```

**Searching foresight memories:**

Include `current_time` to filter by validity window:

```python
from datetime import datetime, timezone

response = memories.search(
    filters={"user_id": "user_001"},
    query="reminders appointments deadlines",
    method="hybrid",
    memory_types=["foresight"],
    top_k=10,
    # Use an explicit UTC timestamp; a naive local time with a "Z" suffix would be mislabeled.
    current_time=datetime.now(timezone.utc).isoformat().replace("+00:00", "Z"),
)
```

Foresight enables proactive behavior: contextual reminders, conflict detection, deadline tracking, and time-aware responses.

---

## Python SDK

### Installation

```bash
pip install everos-cloud
```

Set your API key via environment variable or constructor:

```bash
export EVEROS_API_KEY="your_api_key"
```

### Sync Client

```python
from everos_cloud import EverOS
import time

client = EverOS()  # uses EVEROS_API_KEY env var
# or: client = EverOS(api_key="your_api_key")

memories = client.v1.memories

# Add
memories.add(
    user_id="user_001",
    messages=[{"role": "user", "timestamp": int(time.time() * 1000), "content": "I prefer morning meetings."}],
)

# Flush
memories.flush(user_id="user_001")

# Search (hybrid is the recommended default method, top_k=5 for chat, 10 for analysis)
results = memories.search(filters={"user_id": "user_001"}, query="meeting preferences", method="hybrid", top_k=5)

# Get
profile = memories.get(filters={"user_id": "user_001"}, memory_type="profile")
```

### Async Client

```python
import asyncio
import time
from everos_cloud import AsyncEverOS

async def main():
    client = AsyncEverOS()
    memories = client.v1.memories

    # Concurrent adds
    now_ms = int(time.time() * 1000)
    tasks = [
        memories.add(user_id="user_001", messages=[{"role": "user", "timestamp": now_ms + i, "content": f"Message {i}"}])
        for i in range(5)
    ]
    await asyncio.gather(*tasks)

    # Search
    results = await memories.search(filters={"user_id": "user_001"}, query="preferences", method="hybrid", top_k=5)

asyncio.run(main())
```

### Group Memories via SDK

```python
# Assumes `client = EverOS()` as in the Sync Client example
groups = client.v1.groups
senders = client.v1.senders
group_mem = client.v1.memories.group
memories = client.v1.memories  # search is called on the top-level memories resource

# Register group and senders
groups.create(group_id="team_eng", name="Engineering Team")
senders.create(sender_id="user_alice", name="Alice")
senders.create(sender_id="user_bob", name="Bob")

# Add group messages
group_mem.add(
    group_id="team_eng",
    messages=[
        {"role": "user", "sender_id": "user_alice", "sender_name": "Alice", "timestamp": 1711900000000, "content": "Let's use PostgreSQL for the new service."},
        {"role": "user", "sender_id": "user_bob", "sender_name": "Bob", "timestamp": 1711900060000, "content": "Agreed. I'll prepare the schema by Friday."},
    ],
)

# Flush and search
group_mem.flush(group_id="team_eng")

# Search group discussions
results = memories.search(filters={"group_id": "team_eng"}, query="database decision", method="hybrid", top_k=5)

# Search individual perspective
results = memories.search(filters={"user_id": "user_bob"}, query="database decision", method="hybrid", top_k=5)
```

### Agent Memories via SDK

```python
# Assumes `client = EverOS()` as in the Sync Client example
agent = client.v1.memories.agent
memories = client.v1.memories

agent.add(
    user_id="agent_001",
    session_id="task_session_001",
    messages=[
        {"role": "user", "timestamp": 1711900000000, "content": "Deploy the staging environment"},
        {"role": "assistant", "timestamp": 1711900001000, "tool_calls": [
            {"id": "call_1", "type": "function", "function": {"name": "run_deploy", "arguments": "{\"env\": \"staging\"}"}}
        ]},
        {"role": "tool", "timestamp": 1711900010000, "tool_call_id": "call_1", "content": "Deployment successful. URL: https://staging.example.com"},
        {"role": "assistant", "timestamp": 1711900011000, "content": "Staging environment deployed successfully."},
    ],
)
agent.flush(user_id="agent_001", session_id="task_session_001")

# Retrieve learned cases and skills
cases = memories.get(filters={"user_id": "agent_001"}, memory_type="agent_case")
skills = memories.get(filters={"user_id": "agent_001"}, memory_type="agent_skill")
```

---

## Cookbook Patterns

### Personal Assistant

Store every conversation turn, retrieve context before generating responses, and use that context to personalize LLM output.

```python
from everos_cloud import EverOS
import time

client = EverOS()
memories = client.v1.memories

def chat(user_id: str, user_message: str) -> str:
    # 1. Retrieve relevant memories
    context = memories.search(
        filters={"user_id": user_id},
        query=user_message,
        method="hybrid",
        memory_types=["episodic_memory", "profile"],
        top_k=5,
    )

    # 2. Build prompt with memory context
    memory_text = "\n".join(
        getattr(ep, "episode", "") or getattr(ep, "summary", "")
        for ep in (context.data.episodes or [])
    )
    prompt = f"User memories:\n{memory_text}\n\nUser: {user_message}"

    # 3. Generate response with your LLM
    response = call_your_llm(prompt)  # your LLM call here

    # 4. Store the exchange
    now_ms = int(time.time() * 1000)
    memories.add(
        user_id=user_id,
        messages=[
            {"role": "user", "timestamp": now_ms, "content": user_message},
            {"role": "assistant", "timestamp": now_ms + 1000, "content": response},
        ],
    )
    return response
```

### Team Collaboration

Use group memories with sender attribution. EverOS generates both group-level summaries and per-participant episodes.

```python
from everos_cloud import EverOS
import time

client = EverOS()
groups = client.v1.groups
senders = client.v1.senders
group_mem = client.v1.memories.group

# Setup
groups.create(group_id="team_standup", name="Daily Standup")
for sid, name in [("alice", "Alice"), ("bob", "Bob"), ("carol", "Carol")]:
    senders.create(sender_id=sid, name=name)

# Store meeting messages
now_ms = int(time.time() * 1000)
group_mem.add(
    group_id="team_standup",
    messages=[
        {"role": "user", "sender_id": "alice", "sender_name": "Alice", "timestamp": now_ms, "content": "I finished the auth module yesterday. Today I'll work on the dashboard."},
        {"role": "user", "sender_id": "bob", "sender_name": "Bob", "timestamp": now_ms + 30000, "content": "I'm blocked on the API integration. Need Alice's auth endpoints first."},
        {"role": "user", "sender_id": "carol", "sender_name": "Carol", "timestamp": now_ms + 60000, "content": "Design review is tomorrow at 2pm. Everyone please review the mockups."},
    ],
)
group_mem.flush(group_id="team_standup")

# Search: what the team discussed
team_context = client.v1.memories.search(
    filters={"group_id": "team_standup"},
    query="blockers and action items",
    method="hybrid",
    top_k=10,
)

# Search: what Bob specifically needs
bob_context = client.v1.memories.search(
    filters={"user_id": "bob"},
    query="blockers",
    method="hybrid",
    top_k=5,
)
```

### Customer Support

Per-customer memory with session isolation per ticket. Cross-ticket search for historical context.

```python
from everos_cloud import EverOS
import time

client = EverOS()
memories = client.v1.memories


def call_your_llm(message: str, history) -> str:
    """Placeholder (not part of the SDK): replace with your own LLM call."""
    raise NotImplementedError


class SupportBot:
    def create_ticket(self, customer_id: str, ticket_id: str, subject: str):
        session_id = f"ticket_{ticket_id}"
        memories.add(
            user_id=customer_id,
            session_id=session_id,
            messages=[{"role": "assistant", "timestamp": int(time.time() * 1000),
                       "content": f"Ticket opened: {subject}"}],
        )
        return session_id

    def handle_message(self, customer_id: str, session_id: str, message: str) -> str:
        # Search across ALL customer tickets for relevant history
        history = memories.search(
            filters={"user_id": customer_id},
            query=message,
            method="hybrid",
            memory_types=["episodic_memory", "profile"],
            top_k=10,
        )

        # Generate response using history context + your LLM
        response = call_your_llm(message, history)

        # Store the exchange
        now_ms = int(time.time() * 1000)
        memories.add(
            user_id=customer_id,
            session_id=session_id,
            messages=[
                {"role": "user", "timestamp": now_ms, "content": message},
                {"role": "assistant", "timestamp": now_ms + 1000, "content": response},
            ],
        )
        return response
```

### AI Tutor

Track student progress across sessions. Use episodic memory for quiz results and profile memory for learning style.

```python
from everos_cloud import EverOS
import time

client = EverOS()
memories = client.v1.memories


class AITutor:
    def __init__(self, subject: str):
        self.subject = subject

    def record_quiz(self, student_id: str, topic: str, score: int, total: int):
        pct = (score / total) * 100
        now_ms = int(time.time() * 1000)
        memories.add(
            user_id=student_id,
            messages=[
                {"role": "user", "timestamp": now_ms, "content": f"I finished the {topic} quiz."},
                {"role": "assistant", "timestamp": now_ms + 1000,
                 "content": f"Quiz on {topic}: {score}/{total} ({pct:.0f}%). {'Needs review.' if pct < 80 else 'Well done!'}"},
            ],
        )

    def get_knowledge_gaps(self, student_id: str) -> list:
        resp = memories.search(
            filters={"user_id": student_id},
            query="struggled difficult needs review low score",
            method="vector",
            memory_types=["episodic_memory"],
            top_k=20,
        )
        return resp.data.episodes if resp.data else []

    def get_learning_context(self, student_id: str, topic: str) -> dict:
        profile = memories.search(filters={"user_id": student_id}, query=f"learning style {topic}",
                                  method="vector", memory_types=["profile"], top_k=5)
        progress = memories.search(filters={"user_id": student_id}, query=f"{topic} quiz score",
                                   method="vector", memory_types=["episodic_memory"], top_k=5)
        return {"profile": profile.data, "progress": progress.data}
```

### Batch Processing

Import existing conversation history at scale. Key considerations: 500 messages per request, timestamps in unix milliseconds, use async mode (default).

```python
from everos_cloud import EverOS
import json
import time

client = EverOS()
memories = client.v1.memories
group_mem = client.v1.memories.group


def batch_import_personal(user_id: str, conversations: list, batch_size: int = 500):
    """Import personal conversation history."""
    for i in range(0, len(conversations), batch_size):
        batch = conversations[i:i + batch_size]
        response = memories.add(user_id=user_id, messages=batch)
        print(f"Batch {i // batch_size + 1}: {response.data.status}")
        time.sleep(1)  # rate limiting
    # Flush to trigger extraction
    memories.flush(user_id=user_id)


def batch_import_group(group_id: str, messages: list, batch_size: int = 500):
    """Import group conversation history."""
    for i in range(0, len(messages), batch_size):
        batch = messages[i:i + batch_size]
        response = group_mem.add(group_id=group_id, messages=batch)
        print(f"Batch {i // batch_size + 1}: {response.data.status}")
        time.sleep(1)
    group_mem.flush(group_id=group_id)
```

**Data format for personal:**

```json
[
  {"role": "user", "timestamp": 1705312800000, "content": "Hello, I need help with my project."},
  {"role": "assistant", "timestamp": 1705312860000, "content": "Sure! What project are you working on?"}
]
```

**Data format for group:**

```json
[
  {"role": "user", "sender_id": "alice", "sender_name": "Alice", "timestamp": 1705312800000, "content": "Let's discuss the architecture."},
  {"role": "user", "sender_id": "bob", "sender_name": "Bob", "timestamp": 1705312890000, "content": "I think we should use microservices."}
]
```

---

## Open Source Deployment

EverOS OSS is a self-hosted, md-first memory framework. Storage stack: Markdown (truth) + SQLite (state) + LanceDB (index). No external services required.

**Repository:** https://github.com/EverMind-AI/EverOS

**Prerequisites:** Python 3.12+, an OpenRouter API key (chat LLM + multimodal), a DeepInfra API key (embedding + rerank).

**Setup:**

```bash
pip install everos   # or: uv pip install everos
everos init          # writes ./.env with config template

# Edit .env and fill in your API keys:
#   EVEROS_LLM__API_KEY         (OpenRouter)
#   EVEROS_MULTIMODAL__API_KEY  (OpenRouter, same key)
#   EVEROS_EMBEDDING__API_KEY   (DeepInfra)
#   EVEROS_RERANK__API_KEY      (DeepInfra, same key)

everos server start  # starts on 127.0.0.1:8000
```

**Verify:**

```bash
curl http://127.0.0.1:8000/health
# {"status": "ok"}
```

**OSS API endpoints** (HTTP only, no SDK):

| Operation | Method & Path | Key params |
|-----------|--------------|------------|
| Add messages | `POST /api/v1/memory/add` | `session_id`, `app_id`, `project_id`, `messages[]` |
| Force extraction | `POST /api/v1/memory/flush` | `session_id`, `app_id`, `project_id` |
| Search memories | `POST /api/v1/memory/search` | `user_id` or `agent_id`, `query`, `method`, `top_k` |
| Get memories | `POST /api/v1/memory/get` | `user_id` or `agent_id`, `memory_type`, `page`, `page_size` |
| Health check | `GET /health` | none |
| Metrics | `GET /metrics` | none |

**`app_id` / `project_id`:** Both default to `"default"`. They partition memory at `~/.everos///users//...`. Queries never cross scopes.
Valid chars: `a-z A-Z 0-9 _ . -`, 1–128 chars; `"."` and `".."` are rejected.

**OSS Quickstart (HTTP):**

OSS and Cloud have different API schemas, not just different path prefixes. Key differences:

- `sender_id` (not `user_id`) is the required field in each message; it becomes the index key
- `user_id` / `agent_id` are top-level params on search/get, not inside a filters object
- Path prefix: OSS `/api/v1/memory/` (singular) vs Cloud `/api/v1/memories/` (plural)

See the [OSS API Reference](https://docs.evermind.ai/open-source/api-reference) for the full schema.

```bash
# 1. Add messages
curl -X POST http://127.0.0.1:8000/api/v1/memory/add \
  -H 'Content-Type: application/json' \
  -d '{
    "session_id": "session_001",
    "app_id": "default",
    "project_id": "default",
    "messages": [
      {"sender_id": "user_001", "role": "user", "timestamp": 1700000000000, "content": "I like black coffee, no sugar."},
      {"sender_id": "assistant", "role": "assistant", "timestamp": 1700000001000, "content": "Got it, noted."}
    ]
  }'
# {"request_id": "...", "data": {"message_count": 2, "status": "accumulated"}}

# 2. Flush (force extraction)
curl -X POST http://127.0.0.1:8000/api/v1/memory/flush \
  -H 'Content-Type: application/json' \
  -d '{"session_id": "session_001", "app_id": "default", "project_id": "default"}'
# {"request_id": "...", "data": {"status": "extracted"}}

# 3. Search: user_id is top-level, not inside filters
curl -X POST http://127.0.0.1:8000/api/v1/memory/search \
  -H 'Content-Type: application/json' \
  -d '{
    "user_id": "user_001",
    "app_id": "default",
    "project_id": "default",
    "query": "coffee preference",
    "method": "hybrid",
    "top_k": 5
  }'
# Response: {"data": {"episodes": [...], "profiles": [], "agent_cases": [], "agent_skills": [], "unprocessed_messages": []}}
#   episodes[].episode: full narrative, use as LLM context
#   episodes[].summary: short summary (~200 chars)
```

**Memory is stored as plain Markdown files on disk:**

```
~/.everos/
├── default_app/default_project/
│   └── users//
│       ├── episodes/       ← episodic summaries
│       ├── atomic_facts/   ← extracted facts
│       ├── foresights/     ← time-bounded predictions
│       └── user.md         ← user profile
└── .index/                 ← SQLite + LanceDB (rebuildable from md)
```

The open-source version has the same core memory pipeline as Cloud. Cloud adds managed infrastructure, auto-scaling, dashboard, and memory visibility.

### Multimodal Support (OSS)

Multimodal parsing is an optional add-on that is not included in the base `everos` install:

```bash
pip install 'everos[multimodal]'

# For office documents (doc, docx, ppt, pptx, xls, xlsx): also install LibreOffice
brew install --cask libreoffice        # macOS
sudo apt-get install -y libreoffice    # Debian/Ubuntu
```

The multimodal LLM is configured independently from `[llm]`; `everos init` writes the template into `.env`:

```bash
EVEROS_MULTIMODAL__MODEL=google/gemini-3-flash-preview
EVEROS_MULTIMODAL__API_KEY=
EVEROS_MULTIMODAL__BASE_URL=https://openrouter.ai/api/v1
```

**No upload step.** Pass assets directly in the message `content` array:

| Field | Value |
|-------|-------|
| `uri` | `http(s)://` URL fetched server-side, or `file://` path on the server filesystem |
| `base64` | Inline bytes, plain base64 (no `data:` prefix); include `ext` hint |

```json
{"type": "image", "uri": "https://example.com/whiteboard.png"}
{"type": "pdf", "base64": "JVBERi0xLjQK...", "ext": "pdf", "name": "report.pdf"}
```

Parsed text flows into the same extraction pipeline as plain text, so search works identically across text and multimodal memories.

---

## EverOS vs Standard RAG

| Feature | Standard RAG | EverOS |
|---------|-------------|--------|
| Storage | Document chunks | Temporal facts & relationships (MemCells, MemScenes) |
| Updates | Manual, static, append-only | LLM-driven automatic with conflict resolution |
| Query | Vector similarity only | BM25, vector, hybrid (RRF), and agentic |
| Scope | Document-level | User/session-level with attribution |
| Multi-user | Flat text (profile contamination risk) | Isolated per-participant extraction |

**RAG limitations for long-term interaction:**

1. **Low signal-to-noise**: Top-k retrieval surfaces similar but useless chat segments (small talk, transitions).
2. **State conflicts**: Append-only logs accumulate contradictions. The LLM must arbitrate at runtime.
3. **Multi-user attribution failure**: Flat text sequence causes misattribution between speakers.

**Use EverOS for:** AI companions, personalized assistants, multi-user group chats, educational/coaching agents, gaming NPCs, and anything else requiring long-term consistency and user understanding.

**Use RAG for:** Static document QA, "chat with PDF", FAQ bots, code search: scenarios where the source document is the ground truth.

---

## Benchmarks

- **LoCoMo**: 93.05% overall accuracy (with GPT-4.1-mini), vs Zep 85.22%. Multi-hop: 91.84%, Temporal: 89.72%.
- **LongMemEval**: 83.00% overall, vs MemOS 77.80%.
- **PersonaMem v2**: Profile consolidation improved accuracy by 9%+.

## Quota

Subscription plans based on MemCell count. Average ratio: ~10 raw messages produce 1 MemCell (varies by semantic boundary detection).

## FAQ

**What problem does EverOS solve?**
LLMs hit a "cognitive wall": limited context windows can't hold months of history, and standard retrieval pulls isolated snippets without integration or conflict handling.
EverOS provides structured memory organization.

**How does it differ from Mem0, Zep, MemOS?**
EverOS implements a complete biological memory lifecycle (Formation -> Consolidation -> Recollection) rather than flat storage + fragment retrieval. It actively transforms dialogues into structured knowledge and dynamic profiles.

**What scenarios is it best for?**
Long-term AI companions, personalized health/lifestyle management, professional collaboration, educational agents, and any other application requiring consistency across time and deep user understanding.

**How does it handle temporal reasoning?**
Foresight memories have validity intervals. Prospection Filtering during retrieval retains only currently valid information, enabling precise temporal reasoning.

## SDK Migration (v0 to v1)

The `everos-sdk-upgrade` plugin automates migration from Python SDK v0 (`evermemos`) to v1 (`everos_cloud`), covering 13 rules: dependency updates, env vars, imports, client init, API signatures, type renames, and exception classes.

```bash
# Claude Code
/plugin marketplace add EverMind-AI/everos-plugins
/plugin install everos-sdk-upgrade@everos-plugins
/everos-sdk-upgrade

# Other AI tools (Cursor, Copilot, Codex, etc.)
npx skills add https://github.com/EverMind-AI/everos-plugins
```

Repository: https://github.com/EverMind-AI/everos-plugins
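
Returning to the FAQ answer on temporal reasoning: the idea of keeping only foresights whose validity interval covers the current time can be illustrated with a minimal sketch. This is not EverOS's implementation. The `Foresight` class, its `valid_from` / `valid_until` fields, and `filter_currently_valid` are hypothetical names introduced here, and treating both interval endpoints as inclusive is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Foresight:
    """Hypothetical stand-in for a foresight memory (not an EverOS SDK type)."""
    text: str
    valid_from: datetime
    valid_until: datetime


def utc(year: int, month: int, day: int) -> datetime:
    """Build a timezone-aware UTC datetime at midnight."""
    return datetime(year, month, day, tzinfo=timezone.utc)


def filter_currently_valid(items: list, now: datetime) -> list:
    """Keep only foresights whose validity interval contains `now` (assumed inclusive)."""
    return [f for f in items if f.valid_from <= now <= f.valid_until]


items = [
    Foresight("User will be in Paris", utc(2025, 6, 9), utc(2025, 6, 16)),
    Foresight("Design review is tomorrow", utc(2025, 5, 1), utc(2025, 5, 2)),
]

# The second foresight expired in May, so only the first survives the filter.
valid = filter_currently_valid(items, now=utc(2025, 6, 10))
print([f.text for f in valid])  # ['User will be in Paris']
```

A client-side filter like this is only useful for reasoning about the concept; according to the FAQ, EverOS applies Prospection Filtering itself during retrieval.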