Memory Manager
Long-term fact storage, ChromaDB semantic search, conversation summarisation, and background contradiction detection.
Overview
MemoryManager is the central memory layer. It combines:
- A SQLite facts table: structured, queryable, persistent facts extracted from conversations.
- A ChromaDB collection (luna_facts): embedding vectors for semantic similarity search.
- A conversation history buffer: the last N messages injected as context.
One instance is created per request and shares the same SQLAlchemy session as the router.
| File | Contents |
|---|---|
| memory_manager/manager.py | Full MemoryManager class in a single module (tightly coupled to self.db). |
| memory_manager/__init__.py | Re-exports MemoryManager. |
Instantiation
from backend.services.memory_manager import MemoryManager
from backend.models.database import SessionLocal
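db = SessionLocal()
try:
    mm = MemoryManager(db)
    # ... use mm ...
finally:
    # Always release the session, even if an error occurred above
    db.close()

In route handlers, declare db: Session = Depends(get_db) so that the session is managed by the dependency injector. Never share a MemoryManager instance across requests.

store_fact()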
Persists a fact to SQLite and embeds it into ChromaDB.
async def store_fact(
    content: str,
    category: str,               # "preference" | "goal" | "behavior" | "identity" | "context" | ...
    memory_type: str = "long",   # "long" | "short"
    importance: float = 0.7,     # 0.0 – 1.0
    confidence: float = 0.85,
    source: str = "conversation",
    expires_at: datetime | None = None,
) -> int  # returns the new fact's database ID

Categories
| Category | Example |
|---|---|
| preference | "User prefers dark mode" |
| goal | "User wants to learn Rust this year" |
| behavior | "User is frequently active late at night" |
| identity | "User is a backend engineer at a startup" |
| context | "User is working on a FastAPI project called Luna" |
| routine | "User starts work at 9 AM on weekdays" |
Usage
# Store a long-term preference
fact_id = await mm.store_fact(
    "User strongly prefers Python over JavaScript",
    category="preference",
    importance=0.9,
)

# Store a short-lived context fact with expiry
from datetime import datetime, timedelta
await mm.store_fact(
    "User is currently debugging a memory leak",
    category="context",
    memory_type="short",
    expires_at=datetime.utcnow() + timedelta(hours=2),
)

retrieve_relevant()
Returns the most semantically relevant active facts for a query using ChromaDB cosine similarity, with a SQLite text-search fallback if embeddings are unavailable.
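The retrieval strategy described above can be illustrated with a minimal, dependency-free sketch. It is not the actual implementation: `ToyFact`, `cosine_similarity`, and `retrieve_sketch` are hypothetical names, the embeddings are hand-written vectors rather than ChromaDB output, and the keyword match only stands in for whatever text search the SQLite fallback really performs.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToyFact:
    # Hypothetical stand-in for the Fact ORM object.
    content: str
    embedding: Optional[list] = None


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve_sketch(query, query_embedding, facts, limit=6):
    """Rank by cosine similarity, or fall back to keyword matching."""
    if query_embedding is not None:
        embedded = [f for f in facts if f.embedding is not None]
        ranked = sorted(
            embedded,
            key=lambda f: cosine_similarity(query_embedding, f.embedding),
            reverse=True,
        )
        return ranked[:limit]
    # Fallback path: no embedding available, so match on query words.
    words = query.lower().split()
    matches = [f for f in facts if any(w in f.content.lower() for w in words)]
    return matches[:limit]
```

Passing `query_embedding=None` exercises the fallback path, which is a convenient way to see how the two branches differ on the same fact list.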
async def retrieve_relevant(
    query: str,
    limit: int = 6,
) -> list[Fact]  # SQLAlchemy Fact ORM objects

facts = await mm.retrieve_relevant("what programming languages does the user know?")
for fact in facts:
    print(f"[{fact.category}] {fact.content} (confidence={fact.confidence:.2f})")

The default limit is set by MEMORY_RETRIEVAL_COUNT in .env (default: 6).

get_core_facts()
Returns high-importance, high-confidence facts that are always injected into the system prompt regardless of query relevance. These are the "always know" facts — things like the user's name, profession, and key goals.
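One way to picture the selection is a threshold filter followed by a sort. The sketch below is illustrative only: `ToyFact`, `core_facts_sketch`, the `is_active` flag, and the 0.8 cut-offs are all assumptions, since the actual thresholds and ordering are not documented here.

```python
from dataclasses import dataclass


@dataclass
class ToyFact:
    # Hypothetical stand-in for the Fact ORM object.
    content: str
    importance: float
    confidence: float
    is_active: bool = True


def core_facts_sketch(facts, limit=10, min_importance=0.8, min_confidence=0.8):
    """Keep active facts above both thresholds, most important first."""
    eligible = [
        f
        for f in facts
        if f.is_active
        and f.importance >= min_importance
        and f.confidence >= min_confidence
    ]
    # Sort by importance, then confidence, highest first.
    eligible.sort(key=lambda f: (f.importance, f.confidence), reverse=True)
    return eligible[:limit]
```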
def get_core_facts(limit: int = 10) -> list[Fact]

core = mm.get_core_facts()
for f in core:
    print(f.content)

get_conversation_context()
Builds the full context string that is prepended to the LLM system prompt. It combines core facts, relevant facts, and the most recent conversation summary.
async def get_conversation_context(
    query: str,
    conversation_id: int | None = None,
) -> str

What it includes
- Core facts (always-on, high importance).
- Semantically retrieved facts for the current query.
- The most recent conversation summary (if one exists).
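The three sources listed above could be assembled roughly as follows. This is a sketch under stated assumptions: `build_context_sketch`, the section labels, and the rule that a core fact is not repeated among the relevant facts are all hypothetical, and plain strings stand in for Fact objects.

```python
from typing import Optional, Sequence


def build_context_sketch(
    core_facts: Sequence[str],
    relevant_facts: Sequence[str],
    summary: Optional[str] = None,
) -> str:
    """Join the three context sources into one prompt-ready string."""
    sections = []
    if core_facts:
        sections.append("Core facts:\n" + "\n".join(f"- {c}" for c in core_facts))
    # Assumption: a fact already listed as core is not repeated.
    extra = [f for f in relevant_facts if f not in core_facts]
    if extra:
        sections.append("Relevant facts:\n" + "\n".join(f"- {f}" for f in extra))
    if summary:
        sections.append("Recent summary:\n" + summary)
    # Empty sources are skipped, so the result may be an empty string.
    return "\n\n".join(sections)
```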
context = await mm.get_conversation_context(
    query=user_message,
    conversation_id=current_conversation_id,
)
# Prepend to system prompt
full_system = context + "\n\n" + base_system_prompt

compact_facts()
Removes duplicate and contradicted facts using an LLM pass. Runs daily at 3 AM via the scheduler, but can be called manually.
async def compact_facts() -> int  # returns number of facts removed

removed = await mm.compact_facts()
print(f"Removed {removed} redundant facts")

Conversation summaries
After every N messages, the chat pipeline stores a compressed summary of the conversation as a memory record. This allows Luna to maintain context across very long sessions without exceeding the LLM's context window.
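The "every N messages" trigger can be sketched as a simple count check. The names `maybe_summarise` and `every_n` are hypothetical, the value of N is not specified here, and the `summarise` callable is a stand-in for the LLM call that would produce the real summary.

```python
from typing import Callable, Optional, Sequence


def maybe_summarise(
    messages: Sequence[str],
    every_n: int,
    summarise: Callable[[Sequence[str]], str],
) -> Optional[str]:
    """Return a summary of the latest batch when the count hits a multiple of every_n."""
    if not messages or len(messages) % every_n != 0:
        return None  # not at a batch boundary yet
    # Summarise only the most recent batch of every_n messages.
    return summarise(messages[-every_n:])
```

A caller would pass the returned string to `store_conversation_summary()` whenever it is not `None`.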
# Store a summary manually
await mm.store_conversation_summary(
    conversation_id=42,
    summary="User and Luna discussed the Rust borrow checker for 20 minutes.",
)

# Retrieve recent conversation context
recent = mm.get_recent_conversation(limit=6)
for msg in recent:
    print(f"{msg.role}: {msg.content[:80]}")

Contradiction detection
When a new fact is stored, a background task calls the LLM to check whether it contradicts any existing fact in the same category. If a contradiction is found, the older fact is deactivated and a ContradictionNote record is created.
This happens asynchronously — store_fact() returns immediately and the contradiction check runs in the background.
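The fire-and-forget pattern described above can be sketched with `asyncio.create_task`. This is an in-memory illustration, not the real class: `StoreSketch`, its dictionaries, and the `drain()` helper are hypothetical, and the `contradicts` coroutine passed to the constructor stands in for the LLM call.

```python
import asyncio


class StoreSketch:
    """In-memory stand-in showing a non-blocking contradiction check."""

    def __init__(self, contradicts):
        self.facts = {}   # fact_id -> {"content", "category", "active"}
        self.notes = []   # (old_id, new_id) pairs, like ContradictionNote rows
        self._contradicts = contradicts  # async stand-in for the LLM check
        self._tasks = set()
        self._next_id = 1

    async def store_fact(self, content, category):
        fact_id = self._next_id
        self._next_id += 1
        self.facts[fact_id] = {"content": content, "category": category, "active": True}
        # Schedule the check and return without waiting for it.
        task = asyncio.create_task(self._check(fact_id))
        self._tasks.add(task)  # keep a reference so the task is not garbage-collected
        task.add_done_callback(self._tasks.discard)
        return fact_id

    async def _check(self, new_id):
        new = self.facts[new_id]
        # Snapshot the items: other tasks may add facts while this one awaits.
        for old_id, old in list(self.facts.items()):
            if old_id >= new_id or not old["active"] or old["category"] != new["category"]:
                continue  # only older, active facts in the same category
            if await self._contradicts(old["content"], new["content"]):
                old["active"] = False
                self.notes.append((old_id, new_id))

    async def drain(self):
        """Wait for all pending checks (useful in tests and at shutdown)."""
        await asyncio.gather(*self._tasks)
```

Because the check runs later, a caller that reads the older fact straight after `store_fact()` returns may still see it as active.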
Memory types
| Type | Lifespan | Use for |
|---|---|---|
| long | Permanent (until compacted) | Preferences, goals, identity facts. |
| short | Until expires_at | Current task context, temporary state. |
Short-term facts are automatically pruned by _expire_stale_facts(), which runs every time a new fact is stored.
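The pruning rule can be sketched as a filter on memory_type and expires_at. This is an assumption-laden illustration: `ToyFact` and `expire_stale_sketch` are hypothetical names, the sketch returns the surviving facts rather than updating the database, and it uses timezone-aware datetimes, which may differ from how the real column is stored.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class ToyFact:
    # Hypothetical stand-in for the Fact ORM object.
    content: str
    memory_type: str = "long"
    expires_at: Optional[datetime] = None


def expire_stale_sketch(facts, now=None):
    """Return the facts that survive pruning at time `now`."""
    now = now or datetime.now(timezone.utc)

    def is_stale(fact):
        # Only short-term facts with a past expiry are pruned.
        return (
            fact.memory_type == "short"
            and fact.expires_at is not None
            and fact.expires_at <= now
        )

    return [f for f in facts if not is_stale(f)]
```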