Memory Manager — L.U.N.A. Docs

Overview

MemoryManager is the central memory layer. It combines:

A SQLite facts table — structured, queryable, persistent facts extracted from conversations.
A ChromaDB collection (luna_facts) — embedding vectors for semantic similarity search.
A conversation history buffer — the last N messages injected as context.

One instance is created per request and shares the same SQLAlchemy session as the router.

File	Contents
`memory_manager/manager.py`	Full `MemoryManager` class, kept in a single module because its methods are tightly coupled through `self.db`.
`memory_manager/__init__.py`	Re-exports `MemoryManager`.

Instantiation

example.py

from backend.services.memory_manager import MemoryManager
from backend.models.database import SessionLocal

db = SessionLocal()
try:
    mm = MemoryManager(db)

    # ... use mm ...
finally:
    db.close()  # always release the session, even if an error occurs

💡

In FastAPI routes, use db: Session = Depends(get_db) — the session is managed by the dependency injector. Never share a MemoryManager instance across requests.
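
The per-request pattern described above can be sketched without FastAPI as a generator that yields a session and always closes it. `FakeSession` and `FakeMemoryManager` are hypothetical stand-ins for `SessionLocal()` and `MemoryManager`, and the shape of `get_db` is an assumption based on the common FastAPI/SQLAlchemy idiom, not a copy of this project's code.

```python
from contextlib import contextmanager


class FakeSession:
    """Hypothetical stand-in for a SQLAlchemy session."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


class FakeMemoryManager:
    """Hypothetical stand-in for MemoryManager; it only keeps the session."""

    def __init__(self, db):
        self.db = db


def get_db():
    """Yield one session per request and close it afterwards (assumed shape)."""
    db = FakeSession()
    try:
        yield db
    finally:
        db.close()


def handle_request():
    """Simulate one request: new session, new manager, guaranteed cleanup."""
    with contextmanager(get_db)() as db:
        mm = FakeMemoryManager(db)
        assert mm.db is db  # the manager shares the request's session
        return db


session = handle_request()
print(session.closed)  # the session is closed once the request is done
```

Because each call to `handle_request()` builds its own session and manager, nothing is carried over from one request to the next.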

store_fact()

Persists a fact to SQLite and embeds it into ChromaDB.

signature

async def store_fact(
    content: str,
    category: str,           # "preference" | "goal" | "behavior" | "identity" | "context" | ...
    memory_type: str = "long",  # "long" | "short"
    importance: float = 0.7,    # 0.0 – 1.0
    confidence: float = 0.85,
    source: str = "conversation",
    expires_at: datetime | None = None,
) -> int  # returns the new fact's database ID

Categories

Category	Example
`preference`	"User prefers dark mode"
`goal`	"User wants to learn Rust this year"
`behavior`	"User is frequently active late at night"
`identity`	"User is a backend engineer at a startup"
`context`	"User is working on a FastAPI project called Luna"
`routine`	"User starts work at 9 AM on weekdays"

Usage

example.py

from datetime import datetime, timedelta

# Store a long-term preference
fact_id = await mm.store_fact(
    "User strongly prefers Python over JavaScript",
    category="preference",
    importance=0.9,
)

# Store a short-lived context fact with expiry
await mm.store_fact(
    "User is currently debugging a memory leak",
    category="context",
    memory_type="short",
    expires_at=datetime.utcnow() + timedelta(hours=2),
)

retrieve_relevant()

Returns the most semantically relevant active facts for a query using ChromaDB cosine similarity, with a SQLite text-search fallback if embeddings are unavailable.

signature

async def retrieve_relevant(
    query: str,
    limit: int = 6,
) -> list[Fact]  # SQLAlchemy Fact ORM objects

example.py

facts = await mm.retrieve_relevant("what programming languages does the user know?")
for fact in facts:
    print(f"[{fact.category}] {fact.content}  (confidence={fact.confidence:.2f})")

📌

The retrieval count is configurable via MEMORY_RETRIEVAL_COUNT in .env (default: 6).
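
To make the "cosine similarity with a text-search fallback" idea concrete, here is a minimal pure-Python sketch. It does not call ChromaDB or SQLite: the two-dimensional embeddings, the `retrieve` helper, and the substring fallback are all illustrative assumptions, and the real method may rank and filter differently.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query_text, query_vec, facts, limit=6):
    """Rank by cosine similarity; fall back to substring search without a vector.

    `facts` is a list of (content, embedding) pairs. Passing query_vec=None
    simulates the "embeddings unavailable" case.
    """
    if query_vec is None:
        words = query_text.lower().split()
        matches = [c for c, _ in facts if any(w in c.lower() for w in words)]
        return matches[:limit]
    ranked = sorted(
        facts,
        key=lambda fact: cosine_similarity(query_vec, fact[1]),
        reverse=True,
    )
    return [content for content, _ in ranked[:limit]]


# Toy data: made-up 2-D embeddings, for illustration only.
facts = [
    ("User prefers Python", [1.0, 0.0]),
    ("User wants to learn Rust", [0.8, 0.6]),
    ("User prefers dark mode", [0.0, 1.0]),
]
print(retrieve("languages", [1.0, 0.1], facts, limit=2))
print(retrieve("dark mode", None, facts))
```

The `limit` argument plays the role of the retrieval count: only the top-ranked entries are returned.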

get_core_facts()

Returns high-importance, high-confidence facts that are always injected into the system prompt regardless of query relevance. These are the "always know" facts — things like the user's name, profession, and key goals.

signature

def get_core_facts(limit: int = 10) -> list[Fact]

example.py

core = mm.get_core_facts()
for f in core:
    print(f.content)
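
One way to picture the "high-importance, high-confidence" selection is a threshold filter followed by a sort. The cut-off values (0.8 for both fields), the `is_active` flag, and the sort order below are assumptions made for illustration; the documentation does not state the actual thresholds.

```python
from dataclasses import dataclass


@dataclass
class Fact:
    """Minimal stand-in for the Fact ORM object."""
    content: str
    importance: float
    confidence: float
    is_active: bool = True


def core_facts(facts, limit=10, min_importance=0.8, min_confidence=0.8):
    """Keep active facts above both (assumed) thresholds, most important first."""
    eligible = [
        f for f in facts
        if f.is_active
        and f.importance >= min_importance
        and f.confidence >= min_confidence
    ]
    eligible.sort(key=lambda f: (f.importance, f.confidence), reverse=True)
    return eligible[:limit]


sample = [
    Fact("User's name is Sam", importance=0.95, confidence=0.9),
    Fact("User prefers dark mode", importance=0.6, confidence=0.9),
    Fact("User wants to learn Rust", importance=0.85, confidence=0.9),
    Fact("User lives in Oslo", importance=0.9, confidence=0.5),
]
for fact in core_facts(sample):
    print(fact.content)
```

Only the name and the Rust goal pass both thresholds here; the other two facts fail on importance or confidence.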

get_conversation_context()

Builds the full context string that is injected before the LLM system prompt. It combines core facts, semantically relevant facts, and the most recent conversation summary.

signature

async def get_conversation_context(
    query: str,
    conversation_id: int | None = None,
) -> str

What it includes

Core facts (always-on, high importance).
Semantically retrieved facts for the current query.
The most recent conversation summary (if one exists).

example.py

context = await mm.get_conversation_context(
    query=user_message,
    conversation_id=current_conversation_id,
)
# Prepend to system prompt
full_system = context + "\n\n" + base_system_prompt
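
The assembly of those three parts can be sketched as a plain string builder. The section labels, the bullet formatting, and the removal of relevant facts that already appear as core facts are assumptions for illustration, not the documented output format.

```python
def build_context(core, relevant, summary=None):
    """Join core facts, relevant facts, and a summary into one context string.

    Empty sections are skipped, and relevant facts that duplicate a core
    fact are dropped (an assumed nicety, not documented behavior).
    """
    sections = []
    if core:
        sections.append("Core facts:\n" + "\n".join(f"- {c}" for c in core))
    extra = [r for r in relevant if r not in core]
    if extra:
        sections.append("Relevant facts:\n" + "\n".join(f"- {r}" for r in extra))
    if summary:
        sections.append("Recent summary:\n" + summary)
    return "\n\n".join(sections)


context = build_context(
    core=["User is a backend engineer"],
    relevant=["User is a backend engineer", "User prefers Python"],
    summary="Discussed the Rust borrow checker.",
)
print(context)
```

Returning an empty string when nothing is known keeps the caller's prompt concatenation simple.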

compact_facts()

Removes duplicate and contradicted facts using an LLM pass. Runs daily at 3 AM via the scheduler, but can be called manually.

signature

async def compact_facts() -> int  # returns number of facts removed

example.py

removed = await mm.compact_facts()
print(f"Removed {removed} redundant facts")
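
The bookkeeping behind the returned count can be illustrated without an LLM. This sketch deliberately swaps the LLM pass for plain text normalization, so it only catches trivial duplicates and no contradictions; the dictionary layout and the choice to deactivate rather than delete are assumptions.

```python
def normalize(text):
    """Lowercase and collapse whitespace so trivially different copies match."""
    return " ".join(text.lower().split())


def compact(facts):
    """Deactivate later duplicates of an active fact; return how many were removed."""
    seen = set()
    removed = 0
    for fact in facts:  # assumed to be ordered oldest first
        if not fact["active"]:
            continue
        key = (fact["category"], normalize(fact["content"]))
        if key in seen:
            fact["active"] = False
            removed += 1
        else:
            seen.add(key)
    return removed


stored = [
    {"content": "User prefers dark mode", "category": "preference", "active": True},
    {"content": "user prefers  dark mode", "category": "preference", "active": True},
    {"content": "User wants to learn Rust", "category": "goal", "active": True},
]
removed_count = compact(stored)
print(f"Removed {removed_count} redundant facts")
```

Running the pass a second time removes nothing, since the duplicate is already inactive.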

Conversation summaries

After every N messages, the chat pipeline stores a compressed summary of the conversation as a memory record. This allows Luna to maintain context across very long sessions without exceeding the LLM's context window.

example.py

# Store a summary manually
await mm.store_conversation_summary(
    conversation_id=42,
    summary="User and Luna discussed the Rust borrow checker for 20 minutes.",
)

# Retrieve recent conversation context
recent = mm.get_recent_conversation(limit=6)
for msg in recent:
    print(f"{msg.role}: {msg.content[:80]}")
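
The "after every N messages" trigger can be sketched as a simple modulo check. The helper name `should_summarize` and the value 20 for N are hypothetical; the documentation does not state the actual interval.

```python
def should_summarize(message_count, every_n=20):
    """True when the running message count hits a multiple of every_n."""
    return message_count > 0 and message_count % every_n == 0


# Message counts at which a 60-message conversation would be summarized.
trigger_points = [n for n in range(1, 61) if should_summarize(n)]
print(trigger_points)
```

In a pipeline like the one described, a check of this kind would run after each message is saved, with a call to `store_conversation_summary()` whenever it returns true.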

Contradiction detection

When a new fact is stored, a background task calls the LLM to check whether it contradicts any existing fact in the same category. If a contradiction is found, the older fact is deactivated and a ContradictionNote record is created.

This happens asynchronously — store_fact() returns immediately and the contradiction check runs in the background.
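
The "return immediately, check in the background" behavior can be sketched with `asyncio.create_task`. Everything below is a simplified stand-in: the in-memory `facts` dictionary, the keyword-based `contradicts` rule that replaces the LLM call, and the `notes` list that plays the role of ContradictionNote records are all assumptions.

```python
import asyncio

facts = {}          # fact_id -> {"content", "category", "active"}
notes = []          # (older_id, newer_id) pairs, standing in for ContradictionNote
background = set()  # keep references so pending tasks are not garbage-collected


async def contradicts(new, old):
    """Hypothetical stand-in for the LLM check: a naive keyword rule."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    return "prefers" in new and "prefers" in old and new != old


async def check_contradictions(fact_id):
    """Compare a new fact with older active facts in the same category."""
    new = facts[fact_id]
    for other_id, other in list(facts.items()):
        if other_id >= fact_id or not other["active"]:
            continue  # only older, still-active facts are candidates
        if other["category"] != new["category"]:
            continue
        if await contradicts(new["content"], other["content"]):
            other["active"] = False  # deactivate the older fact
            notes.append((other_id, fact_id))


async def store_fact(content, category):
    """Save the fact, schedule the check, and return without waiting for it."""
    fact_id = len(facts) + 1
    facts[fact_id] = {"content": content, "category": category, "active": True}
    task = asyncio.create_task(check_contradictions(fact_id))
    background.add(task)
    task.add_done_callback(background.discard)
    return fact_id


async def main():
    first = await store_fact("User prefers tabs", "preference")
    second = await store_fact("User prefers spaces", "preference")
    active_right_after_store = facts[first]["active"]  # check has not run yet
    await asyncio.gather(*background)  # let the background checks finish
    return active_right_after_store, facts[first]["active"], facts[second]["active"]


result = asyncio.run(main())
print(result)
```

One consequence of this design is that a contradicted fact can still be visible for a short time after the newer fact is stored.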

Memory types

Type	Lifespan	Use for
`long`	Permanent (until compacted)	Preferences, goals, identity facts.
`short`	Until `expires_at`	Current task context, temporary state.

Short-term facts are automatically pruned by _expire_stale_facts(), which runs every time a new fact is stored.
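
The pruning step can be pictured as a scan over facts that compares `expires_at` with the current time. This is a sketch under assumptions: the dictionary layout is invented for the example, and whether the real `_expire_stale_facts()` deletes rows or only deactivates them is not stated in this page.

```python
from datetime import datetime, timedelta


def expire_stale(facts, now):
    """Deactivate facts whose expires_at has passed; return how many expired."""
    expired = 0
    for fact in facts:
        deadline = fact["expires_at"]
        if fact["active"] and deadline is not None and deadline <= now:
            fact["active"] = False
            expired += 1
    return expired


now = datetime(2024, 1, 1, 12, 0)  # fixed timestamp so the example is repeatable
stored = [
    {"content": "debugging a memory leak", "expires_at": now - timedelta(hours=1), "active": True},
    {"content": "on a call", "expires_at": now + timedelta(hours=1), "active": True},
    {"content": "prefers Python", "expires_at": None, "active": True},  # long-term
]
print(expire_stale(stored, now))
```

Facts with no `expires_at`, such as long-term ones, are never touched by this scan.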
