Memory Manager

Long-term fact storage, ChromaDB semantic search, conversation summarisation, and background contradiction detection.

Overview

MemoryManager is the central memory layer. It combines:

  • A SQLite facts table — structured, queryable, persistent facts extracted from conversations.
  • A ChromaDB collection (luna_facts) — embedding vectors for semantic similarity search.
  • A conversation history buffer — the last N messages injected as context.

One instance is created per request and shares the same SQLAlchemy session as the router.

File                          Contents
memory_manager/manager.py     Full MemoryManager class, kept in a single module (tight self.db coupling).
memory_manager/__init__.py    Re-exports MemoryManager.

Instantiation

example.py
from backend.services.memory_manager import MemoryManager
from backend.models.database import SessionLocal

db = SessionLocal()
try:
    mm = MemoryManager(db)

    # ... use mm ...
finally:
    db.close()  # always release the session, even if an error occurs
💡 In FastAPI routes, use db: Session = Depends(get_db); the session is managed by the dependency injector. Never share a MemoryManager instance across requests.
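The dependency mentioned in the tip usually follows the common "yield, then close" generator pattern. The sketch below shows only the pattern: FakeSession is a hypothetical stand-in for a SQLAlchemy session so that the example runs without a database, and the project's real get_db may differ.

```python
from collections.abc import Iterator


class FakeSession:
    """Hypothetical stand-in for a SQLAlchemy Session (not part of Luna)."""

    def __init__(self) -> None:
        self.closed = False

    def close(self) -> None:
        self.closed = True


def get_db() -> Iterator[FakeSession]:
    """Yield one session per request and close it when the request ends."""
    db = FakeSession()  # in the real project this would be SessionLocal()
    try:
        yield db
    finally:
        db.close()  # runs even if the route handler raises
```

Because the session is closed as soon as the request finishes, a MemoryManager built on it should not outlive that request.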

store_fact()

Persists a fact to SQLite and embeds it into ChromaDB.

signature
async def store_fact(
    content: str,
    category: str,           # "preference" | "goal" | "behavior" | "identity" | "context" | ...
    memory_type: str = "long",  # "long" | "short"
    importance: float = 0.7,    # 0.0 – 1.0
    confidence: float = 0.85,
    source: str = "conversation",
    expires_at: datetime | None = None,
) -> int  # returns the new fact's database ID

Categories

Category      Example
preference    "User prefers dark mode"
goal          "User wants to learn Rust this year"
behavior      "User is frequently active late at night"
identity      "User is a backend engineer at a startup"
context       "User is working on a FastAPI project called Luna"
routine       "User starts work at 9 AM on weekdays"

Usage

example.py
# Store a long-term preference
fact_id = await mm.store_fact(
    "User strongly prefers Python over JavaScript",
    category="preference",
    importance=0.9,
)

# Store a short-lived context fact with expiry
from datetime import datetime, timedelta
await mm.store_fact(
    "User is currently debugging a memory leak",
    category="context",
    memory_type="short",
    expires_at=datetime.utcnow() + timedelta(hours=2),
)

retrieve_relevant()

Returns the most semantically relevant active facts for a query using ChromaDB cosine similarity, with a SQLite text-search fallback if embeddings are unavailable.

signature
async def retrieve_relevant(
    query: str,
    limit: int = 6,
) -> list[Fact]  # SQLAlchemy Fact ORM objects

example.py
facts = await mm.retrieve_relevant("what programming languages does the user know?")
for fact in facts:
    print(f"[{fact.category}] {fact.content}  (confidence={fact.confidence:.2f})")
📌 The retrieval count is configurable via MEMORY_RETRIEVAL_COUNT in .env (default: 6).
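The ranking and fallback behaviour described for retrieve_relevant() can be illustrated with a small, dependency-free sketch. It is not the MemoryManager implementation: cosine_similarity, rank_facts, and the (text, embedding) pairs are hypothetical stand-ins for what ChromaDB and the SQLite text search do internally.

```python
from __future__ import annotations

import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_facts(
    query: str,
    query_vector: list[float] | None,
    facts: list[tuple[str, list[float]]],
    limit: int = 6,
) -> list[str]:
    """Return up to `limit` fact texts, most relevant first.

    `facts` holds (text, embedding) pairs. When no query embedding is
    available, fall back to a case-insensitive word-overlap match, which
    stands in for the SQLite text search.
    """
    if query_vector is None:
        words = set(query.lower().split())
        matches = [text for text, _ in facts if words & set(text.lower().split())]
        return matches[:limit]

    scored = sorted(
        facts,
        key=lambda item: cosine_similarity(query_vector, item[1]),
        reverse=True,
    )
    return [text for text, _ in scored[:limit]]
```

Passing query_vector=None simulates the case where embeddings are unavailable, so the same call site works in both modes.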

get_core_facts()

Returns high-importance, high-confidence facts that are always injected into the system prompt regardless of query relevance. These are the "always know" facts — things like the user's name, profession, and key goals.

signature
def get_core_facts(limit: int = 10) -> list[Fact]

example.py
core = mm.get_core_facts()
for f in core:
    print(f.content)
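
The selection rule behind get_core_facts() is not documented here, so the following is only a sketch of one plausible filter. FactRow, its is_active flag, and both 0.8 thresholds are assumptions made for illustration, not values taken from the codebase.

```python
from dataclasses import dataclass


@dataclass
class FactRow:
    """Hypothetical stand-in for the Fact ORM object."""

    content: str
    importance: float
    confidence: float
    is_active: bool = True


def select_core_facts(
    facts: list[FactRow],
    limit: int = 10,
    min_importance: float = 0.8,  # assumed threshold
    min_confidence: float = 0.8,  # assumed threshold
) -> list[FactRow]:
    """Keep active facts above both thresholds, most important first."""
    eligible = [
        f
        for f in facts
        if f.is_active
        and f.importance >= min_importance
        and f.confidence >= min_confidence
    ]
    eligible.sort(key=lambda f: (f.importance, f.confidence), reverse=True)
    return eligible[:limit]
```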

get_conversation_context()

Builds the full context string that is injected before the LLM system prompt. It combines core facts, facts relevant to the query, and the most recent conversation summary.

signature
async def get_conversation_context(
    query: str,
    conversation_id: int | None = None,
) -> str

What it includes

  1. Core facts (always-on, high importance).
  2. Semantically retrieved facts for the current query.
  3. The most recent conversation summary (if one exists).

example.py
context = await mm.get_conversation_context(
    query=user_message,
    conversation_id=current_conversation_id,
)
# Prepend to system prompt
full_system = context + "\n\n" + base_system_prompt
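
How the three parts are joined into one string is not specified in this document. The sketch below shows one way it could be done; build_context, the section headings, and the rule of skipping relevant facts that already appear as core facts are all assumptions.

```python
from __future__ import annotations


def build_context(
    core_facts: list[str],
    relevant_facts: list[str],
    summary: str | None,
) -> str:
    """Join core facts, query-relevant facts, and a summary into one block."""
    sections: list[str] = []

    if core_facts:
        sections.append("Core facts:\n" + "\n".join(f"- {f}" for f in core_facts))

    # Assumption: avoid repeating a fact that is already listed as core.
    extra = [f for f in relevant_facts if f not in core_facts]
    if extra:
        sections.append("Relevant facts:\n" + "\n".join(f"- {f}" for f in extra))

    if summary:
        sections.append("Conversation summary:\n" + summary)

    return "\n\n".join(sections)
```

Empty sections are omitted entirely, so a new user with no stored facts yields an empty string rather than a block of bare headings.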

compact_facts()

Removes duplicate and contradicted facts using an LLM pass. Runs daily at 3 AM via the scheduler, but can be called manually.

signature
async def compact_facts() -> int  # returns number of facts removed

example.py
removed = await mm.compact_facts()
print(f"Removed {removed} redundant facts")
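
The shape of a compaction pass can be sketched without an LLM by making the redundancy judgement a pluggable function. Everything here is hypothetical: compact and is_redundant are illustrative names, the judge merely stands in for the LLM call, and keeping the earliest fact of each redundant pair is an assumption rather than documented behaviour.

```python
from typing import Callable


def compact(
    facts: list[str],
    is_redundant: Callable[[str, str], bool],
) -> tuple[list[str], int]:
    """Drop every fact the judge considers redundant with one already kept.

    Returns the surviving facts and the number removed, mirroring the
    integer that compact_facts() reports.
    """
    kept: list[str] = []
    for fact in facts:
        if any(is_redundant(existing, fact) for existing in kept):
            continue  # redundant with a fact we already kept
        kept.append(fact)
    return kept, len(facts) - len(kept)
```

In a test, a simple case-insensitive equality check can play the role of the judge; a production judge would be the LLM comparison.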

Conversation summaries

After every N messages, the chat pipeline stores a compressed summary of the conversation as a memory record. This allows Luna to maintain context across very long sessions without exceeding the LLM's context window.
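The "every N messages" trigger amounts to a simple counter check. In the sketch below, should_summarise and the default of 20 are assumptions; the actual value of N is not stated in this document.

```python
def should_summarise(message_count: int, every_n: int = 20) -> bool:
    """True when the conversation has just reached a multiple of `every_n`."""
    if every_n <= 0:
        raise ValueError("every_n must be positive")
    return message_count > 0 and message_count % every_n == 0
```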

example.py
# Store a summary manually
await mm.store_conversation_summary(
    conversation_id=42,
    summary="User and Luna discussed the Rust borrow checker for 20 minutes.",
)

# Retrieve recent conversation context
recent = mm.get_recent_conversation(limit=6)
for msg in recent:
    print(f"{msg.role}: {msg.content[:80]}")

Contradiction detection

When a new fact is stored, a background task calls the LLM to check whether it contradicts any existing fact in the same category. If a contradiction is found, the older fact is deactivated and a ContradictionNote record is created.

This happens asynchronously — store_fact() returns immediately and the contradiction check runs in the background.
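The "return immediately, check later" behaviour can be sketched with asyncio.create_task. This is an illustration under stated assumptions, not Luna's code: TinyStore, FactRow, and the contradicts callable (standing in for the LLM call) are hypothetical, and the ContradictionNote record is omitted.

```python
from __future__ import annotations

import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable


@dataclass
class FactRow:
    id: int
    content: str
    category: str
    is_active: bool = True


class TinyStore:
    """In-memory stand-in for the facts table."""

    def __init__(self, contradicts: Callable[[str, str], Awaitable[bool]]) -> None:
        self.facts: list[FactRow] = []
        self.contradicts = contradicts  # async (old, new) -> bool, replaces the LLM
        self.background: set[asyncio.Task] = set()

    async def store_fact(self, content: str, category: str) -> int:
        fact = FactRow(id=len(self.facts) + 1, content=content, category=category)
        self.facts.append(fact)
        # Schedule the check without awaiting it; keep a reference so the
        # task is not garbage-collected before it finishes.
        task = asyncio.create_task(self._check(fact))
        self.background.add(task)
        task.add_done_callback(self.background.discard)
        return fact.id

    async def _check(self, new: FactRow) -> None:
        for old in list(self.facts):
            is_older_peer = (
                old.id < new.id and old.is_active and old.category == new.category
            )
            if is_older_peer and await self.contradicts(old.content, new.content):
                old.is_active = False  # the older fact loses
```

A caller that needs the result of the check, for example in a test, can await the tasks held in background; normal request handling would not wait for them.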

Memory types

Type     Lifespan                       Use for
long     Permanent (until compacted)    Preferences, goals, identity facts.
short    Until expires_at               Current task context, temporary state.

Short-term facts are automatically pruned by _expire_stale_facts(), which runs every time a new fact is stored.
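A pruning pass of this kind might look like the sketch below. It is an assumption-laden illustration: FactRow is a hypothetical stand-in for the ORM object, and the sketch marks expired facts inactive, whereas the real _expire_stale_facts() may delete rows instead.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class FactRow:
    content: str
    expires_at: datetime | None = None  # None means the fact never expires
    is_active: bool = True


def expire_stale_facts(facts: list[FactRow], now: datetime | None = None) -> int:
    """Deactivate facts whose expiry has passed; return how many were expired."""
    now = now or datetime.now(timezone.utc)
    expired = 0
    for fact in facts:
        if fact.is_active and fact.expires_at is not None and fact.expires_at <= now:
            fact.is_active = False
            expired += 1
    return expired
```

Accepting now as a parameter keeps the function deterministic in tests. The timestamps compared must be either all timezone-aware or all naive, since Python refuses to compare the two kinds.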