architecture walkthrough

How Engram works.

A high-level walkthrough of the architecture, data flow, and the retrieval strategies that make Engram different from a vector database.

Architecture overview

Five layers, one SQLite file

REST API
POST /v1/memories · GET /v1/recall
Vault API
remember() · recall() · consolidate()
Retrieval Engine
Entity + Topic + Vector + FTS
Consolidation Engine
LLM-powered sleep cycles
SQLite Storage
Memories · Edges · Entities · Embeddings

Every layer runs locally. No external services required for core functionality.
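In code, calling the two REST endpoints above amounts to building a JSON body and a query string. This is a minimal sketch: the field names (`content`, `tags`, `q`, `limit`) are illustrative assumptions, not Engram's documented request schema.

```python
import json
from urllib.parse import urlencode

# Hypothetical body for POST /v1/memories -- field names are
# assumptions for illustration, not Engram's actual schema.
store_body = json.dumps({
    "content": "Thomas prefers trail running",
    "tags": ["preferences"],
})

# Hypothetical query string for GET /v1/recall ("q", "limit" assumed).
recall_url = "/v1/recall?" + urlencode({"q": "Thomas", "limit": 5})
print(recall_url)
```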

Where your data lives

Your device. Your data.

~/.engram/

Everything lives in a single SQLite database on your machine. Memories, embeddings, entities, edges — all in one portable file.

SQLite + sqlite-vec for embeddings
No external vector database
Export entire vault as JSON anytime
Zero cloud dependency
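To make "all in one portable file" concrete, here is a toy sketch of the four table families named above, built with Python's standard sqlite3 module. The column layout is an illustrative assumption, not Engram's actual schema, and `:memory:` stands in for the vault file under ~/.engram/.

```python
import sqlite3

# One SQLite database holds everything; columns here are assumed
# for illustration, not Engram's real schema.
conn = sqlite3.connect(":memory:")  # stand-in for the ~/.engram/ vault file
conn.executescript("""
    CREATE TABLE memories   (id INTEGER PRIMARY KEY, content TEXT);
    CREATE TABLE entities   (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE edges      (src INTEGER, dst INTEGER, type TEXT);
    CREATE TABLE embeddings (memory_id INTEGER, vector BLOB);
""")
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name")]
print(tables)
```

In the real system the embeddings table is backed by the sqlite-vec extension rather than a plain BLOB column.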

The core product runs entirely on your machine. No accounts, no API keys for storage, no data leaving your device.

LLM key needed only for consolidation
Works offline for store & recall
Engram Cloud is optional

The memory lifecycle

From raw input to refined knowledge

Remember

Agent stores a fact, preference, or observation. Entities and topics are extracted automatically.

Store

Memory is embedded, indexed, and linked to existing entities in the knowledge graph.

Recall

Queries combine entity matching, topic matching, vector search, and spreading activation.

Consolidate

Sleep cycles distill raw episodes into structured semantic knowledge using an LLM.

Knowledge

Refined memories surface proactively. Contradictions are detected and resolved.

Stale memories decay & archive automatically
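The lifecycle above can be sketched with a toy stand-in class. It reuses the Vault API method names from the diagram (`remember`, `recall`, `consolidate`), but the signatures and behavior here are assumptions: substring matching stands in for four-signal retrieval, and string-joining stands in for LLM consolidation.

```python
class ToyVault:
    """Toy illustration of the remember -> consolidate -> recall lifecycle.

    Not Engram's implementation; method names follow the diagram,
    everything else is a simplifying assumption.
    """

    def __init__(self):
        self.episodes = []   # raw episodic memories
        self.semantic = []   # consolidated semantic knowledge

    def remember(self, text):
        self.episodes.append(text)

    def recall(self, query):
        # Naive substring match stands in for the real retrieval engine.
        pool = self.semantic + self.episodes
        return [m for m in pool if query.lower() in m.lower()]

    def consolidate(self):
        # The real engine distills episodes with an LLM; we just merge them.
        if self.episodes:
            self.semantic.append("; ".join(self.episodes))
            self.episodes.clear()

vault = ToyVault()
vault.remember("Thomas ran a marathon last week")
vault.remember("Thomas prefers trail running")
vault.consolidate()
print(vault.recall("Thomas"))
```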

Three pillars deep dive

What makes memory intelligent

Knowledge Graph

Memories aren’t isolated documents — they’re nodes in a graph connected by typed edges. When you store a new memory, Engram links it to existing knowledge automatically.

# Edge types
supports · reinforces an existing memory
contradicts · conflicts with prior knowledge
elaborates · adds detail to an existing memory
supersedes · replaces outdated information
causes / caused_by · causal relationships
part_of / instance_of · hierarchical links
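A typed edge can be modeled as a small enum plus a record type. This is a sketch of the concept, assuming a string memory ID and a default weight; it is not Engram's internal representation.

```python
from dataclasses import dataclass
from enum import Enum

class EdgeType(Enum):
    # The edge vocabulary listed above.
    SUPPORTS = "supports"
    CONTRADICTS = "contradicts"
    ELABORATES = "elaborates"
    SUPERSEDES = "supersedes"
    CAUSES = "causes"
    CAUSED_BY = "caused_by"
    PART_OF = "part_of"
    INSTANCE_OF = "instance_of"

@dataclass
class Edge:
    source: str          # ID of the new memory (shape assumed)
    target: str          # ID of the linked memory
    type: EdgeType
    weight: float = 1.0  # edge weights drive spreading activation

edge = Edge("mem_42", "mem_7", EdgeType.ELABORATES)
print(edge)
```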

Sleep Cycles

Just as your brain consolidates memories during sleep, Engram’s consolidation engine uses an LLM to distill raw episodes into structured semantic knowledge.

Input — Raw Episodes
“Thomas ran a marathon last week”
“Thomas trains every morning at 6am”
“Thomas prefers trail running”
Output — Semantic Knowledge
“Thomas is an avid runner who trains mornings at 6am, prefers trails, and recently completed a marathon”
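Mechanically, a consolidation pass hands the raw episodes to an LLM as a distillation prompt. The prompt below is purely hypothetical; Engram's actual consolidation prompt is not shown in this walkthrough.

```python
episodes = [
    "Thomas ran a marathon last week",
    "Thomas trains every morning at 6am",
    "Thomas prefers trail running",
]

# Hypothetical prompt construction -- an assumption for illustration,
# not Engram's real consolidation prompt.
prompt = (
    "Distill the following raw episodes into one concise semantic summary:\n"
    + "\n".join(f"- {e}" for e in episodes)
)
print(prompt)
```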

Spreading Activation

A query activates matching nodes, then energy spreads through the graph along weighted edges — surfacing context you didn’t know to ask for.

"Thomas"direct match
marathonentity connection
training scheduleco-occurrence
prefers morning runslinked memory

Query “Thomas” → recall cascades through the graph, surfacing “prefers morning runs” without explicitly asking about running.
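The cascade above can be sketched as energy propagating through a weighted graph with decay. The graph, weights, decay factor, and threshold below are toy values chosen for illustration; Engram's actual parameters are internal.

```python
def spread(graph, seeds, decay=0.5, threshold=0.1):
    """Propagate activation from seed nodes along weighted edges.

    graph: {node: [(neighbor, edge_weight), ...]}
    seeds: {node: initial_activation}
    Returns {node: activation} for every node that stayed above threshold.
    """
    activation = dict(seeds)
    frontier = list(seeds.items())
    while frontier:
        node, energy = frontier.pop()
        for neighbor, weight in graph.get(node, []):
            passed = energy * weight * decay  # energy fades with each hop
            if passed > threshold and passed > activation.get(neighbor, 0.0):
                activation[neighbor] = passed
                frontier.append((neighbor, passed))
    return activation

# Toy graph mirroring the example above.
graph = {
    "Thomas": [("marathon", 0.9), ("training schedule", 0.8)],
    "training schedule": [("prefers morning runs", 0.9)],
}
result = spread(graph, {"Thomas": 1.0})
print(result)
```

"prefers morning runs" surfaces even though the query never mentioned running, because activation flowed through "training schedule".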

Retrieval strategy

Four signals, one ranked result

Every recall query runs four retrieval strategies in parallel, then merges and re-ranks the results. This is why Engram outperforms pure vector search by 15+ points.

Entity Matching

Finds memories mentioning the same people, projects, or concepts as your query.

Topic Matching

Filters by topic tags extracted when memories are stored.

Vector Search

sqlite-vec embeddings find semantically similar memories even with different wording.

Spreading Activation

Follows knowledge graph edges to surface context you didn’t know to ask for.

Entity + Topic + Vector + Graph
Merge & re-rank → Top-k results
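One standard way to merge several ranked lists is reciprocal rank fusion, sketched below. Note this is an assumption: the walkthrough does not specify which re-ranking method Engram actually uses.

```python
def rrf_merge(ranked_lists, k=60, top_k=3):
    """Reciprocal rank fusion: each list contributes 1/(k + rank) per item."""
    scores = {}
    for ranking in ranked_lists:
        for rank, item in enumerate(ranking, start=1):
            scores[item] = scores.get(item, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)[:top_k]

# Toy rankings from the four signals (memory IDs are made up).
entity_hits = ["m1", "m4"]
topic_hits  = ["m2", "m1"]
vector_hits = ["m1", "m3", "m2"]
graph_hits  = ["m4", "m1"]

merged = rrf_merge([entity_hits, topic_hits, vector_hits, graph_hits])
print(merged)
```

m1 wins because all four signals agree on it, even though no single list is authoritative.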

Ready to give your agents memory?