Phase 3 — The Black Box (Memory)
Status: 🔨 3.A–3.D.5 shipped; 3.E mixed closeout
Deliverable: SQLite + FTS5 + ontological graph + semantic fusion in Go; 3.E closes auditability, decay, cross-chat synthesis, and the GONCHO-shaped session/user boundaries the future plugin layer will depend on while preserving Honcho-compatible interfaces.
Completion lane: Phase 3 is the base of Lane 2 — Goncho Memory And Honcho Compatibility.
It is not complete until Goncho can cover the Honcho-compatible public memory
contracts Gormes needs: sessions, messages, scoped search, provenance,
lineage, updates/deletes, and fixture-backed honcho_* tool/MCP names where
external compatibility requires them.
Phase 3 (The Black Box) is substantially delivered as of 2026-04-25: the SQLite + FTS5 lattice (3.A), ontological graph with async LLM extraction (3.B), lexical/FTS5 recall with <memory-context> fence injection (3.C), semantic fusion via Ollama embeddings with cosine similarity recall (3.D), and the operator-facing memory mirror (3.D.5) are all implemented. The 3.E closeout queue is now mixed: session index mirror (3.E.1), tool audit (3.E.2), transcript export (3.E.3), extraction visibility (3.E.4), the lightweight insights writer (3.E.5), the last_seen-based memory-decay closeout (3.E.6), interrupted-turn sync suppression, Honcho-compatible scope/source schemas, deny-path fixtures, and baseline host-integration fixtures are shipped; canonical user_id > chat_id > session_id metadata is landed for 3.E.7, and the core source-filtered session/message search path is landed for 3.E.8. New Honcho drift adds a narrower SillyTavern mapping gate for peer modes, session naming, enrichment modes, and group-chat participants before operator evidence can close 3.E.7. parent_session_id lineage remains the last explicit donor seam in this area. Architecturally, this is the phase where Gormes finishes the memory substrate that a GONCHO-style integration would stand on, without yet claiming full Honcho provider or plugin parity.
Phase 3 sub-status (as of 2026-04-25)
- 3.A — SQLite + FTS5 Lattice — ✅ implemented (internal/memory, SqliteStore, FTS5 triggers, fire-and-forget worker, schema v3a→v3d migrations)
- 3.B — Ontological Graph + LLM Extractor — ✅ implemented (Extractor, entity/relationship upsert, dead-letter queue, validator with weight-floor patch)
- 3.C — Neural Recall + Context Injection — ✅ implemented (RecallProvider, 2-layer seed selection, CTE traversal, <memory-context> fence matching Python’s build_memory_context_block)
- 3.D — Semantic Fusion + Local Embeddings — ✅ implemented (entity_embeddings table with L2-normalized float32 LE BLOBs; Embedder background worker calls Ollama /v1/embeddings with labeled template Entity: {Name}. Type: {Type}. Context: {Description}; in-memory vector cache with monotonic graph-version counter; semanticSeeds flat cosine scan (dot product on normalized vectors); hybrid fusion in Provider.GetContext chains lexical → FTS5 → semantic with dedup + MaxSeeds cap; opt-in via semantic_enabled=true + semantic_model="<tag>"; empty model is a complete no-op: zero HTTP calls, zero goroutine, zero cache RAM. Ship criterion proven live against Ollama: query "tell me about my projects" (no lexical match) surfaces the seeded project entity via cosine in 7s.)
- 3.D.5 — Memory Mirror (USER.md sync) — ✅ implemented (async background goroutine exports SQLite entities/rels → Markdown every 30s; configurable path; atomic writes; SQLite remains source of truth; zero impact on 250ms latency moat)
- 3.E — Decay + Cross-Chat + Operational Mirrors — 🔨 mixed closeout (3.E.1–3.E.6 are shipped; 3.E.7 and 3.E.8 are still in progress, with core code for parts of each already landed)
Phase 3.E Ledger
Phase 3.E is the final Black Box milestone. It closes four orthogonal gaps: operational visibility (session index, tool audit, transcript export, extractor status), memory decay (old facts fade), cross-chat synthesis (one user, multiple chats, one graph), and the remaining SessionDB donor seams (parent_session_id lineage plus cross-source search). Each row is a separable spec.
| Subphase | Status | Priority | Upstream reference | Deliverable |
|---|---|---|---|---|
| 3.E.1 — Session Index Mirror | ✅ shipped | P0 | None (Gormes-original) | Read-only YAML mirror of bbolt sessions.db at ~/.local/share/gormes/sessions/index.yaml; deterministic background refresh now runs from the TUI, Telegram, and shared gateway entrypoints without rewriting unchanged snapshots |
| 3.E.2 — Tool Execution Audit Log | ✅ shipped | P0 | None (exceeds Hermes) | Append-only JSONL at ~/.local/share/gormes/tools/audit.jsonl; persistent record of every tool call with timing + outcome |
| 3.E.3 — Transcript Export Command | ✅ shipped | P2 | Exceeds Hermes (no upstream equivalent) | gormes session export <id> --format=markdown renders SQLite turns as human-readable Markdown; snapshot for sharing/backup |
| 3.E.4 — Extraction State Visibility | ✅ shipped | P1 | None (debug only) | gormes memory status shows extractor queue depth, dead-letter summaries, and worker-health heuristics |
| 3.E.5 — Insights Audit Log | ✅ shipped | P3 | agent/insights.py (preview) | Local telemetry.Snapshot rollups plus append-only usage.jsonl persistence are landed |
| 3.E.6 — Memory Decay | ✅ shipped | P1 | None (Gormes-original) | Relationship freshness now tracks last_seen through a v3g schema/backfill, writer upserts advance it independently of updated_at, and recall-time attenuation uses COALESCE(NULLIF(last_seen, 0), updated_at) |
| 3.E.7 — Cross-Chat Synthesis | 🔨 in progress | P2 | agent/memory_manager.py (cross-session) + SessionDB.user_id + Honcho/SillyTavern host integration docs | internal/persistence/session persists canonical user_id > chat_id > session_id metadata, and internal/memory, internal/goncho, and internal/gonchotools now cover same-chat default fencing, opt-in user/source-filtered recall, interrupted-turn suppression, Honcho-compatible scope/source schemas, deny paths, and baseline host mappings; remaining slices are SillyTavern persona/group-chat mapping fixtures followed by operator-readable cross-chat evidence |
| 3.E.8 — Session Lineage + Cross-Source Search | 🔨 in progress | P4 | hermes_state.py (parent_session_id, search_messages, search_sessions) | Source-filtered session/message search is landed via internal/memory/session_catalog.go, and the internal GONCHO service accepts scope=user / sources[]; parent_session_id lineage, lineage-aware hits, and operator evidence still remain |
The 3.E ship criterion: the operator runs cat ~/.local/share/gormes/sessions/index.yaml and sees every active chat/session mapping in plain YAML; runs cat ~/.local/share/gormes/tools/audit.jsonl and sees a full history of tool invocations; a fact mentioned once six months ago and never again no longer dominates recall results; asking the same question across two different chats surfaces the same entity graph; and context-compressed branches no longer disappear into opaque IDs because lineage and source-filtered search are queryable.
Memory Lessons Now Imported
The strongest memory lesson is provenance. Knowledge pages store chunks, links, timeline entries, versions, and ingest logs so retrieval can explain where a claim came from. Gormes should apply that idea to the local SQLite graph without adopting an external Postgres-first storage model.
Relationship and entity writes should carry enough evidence to debug recall:
- source turn or source artifact;
- extractor version;
- evidence text hash;
- confidence;
- first seen and last seen;
- provenance kind: manual, extracted, imported, inferred, or reviewed;
- review/promote state where an edge can affect cross-chat recall.
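One way to picture the evidence fields listed above is a small Go record with a validator. Every name here (`Provenance`, `HashEvidence`, the field names, the `ReviewState` string) is hypothetical and chosen only for illustration; the source lists the concepts, not a schema.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"errors"
	"fmt"
)

// ProvenanceKind enumerates the kinds named in the list above.
type ProvenanceKind string

const (
	KindManual    ProvenanceKind = "manual"
	KindExtracted ProvenanceKind = "extracted"
	KindImported  ProvenanceKind = "imported"
	KindInferred  ProvenanceKind = "inferred"
	KindReviewed  ProvenanceKind = "reviewed"
)

// Provenance is a hypothetical per-write evidence record.
type Provenance struct {
	Source           string // source turn or source artifact
	ExtractorVersion string
	EvidenceHash     string  // hash of the evidence text, not the text itself
	Confidence       float64 // expected in [0, 1]
	FirstSeen        int64   // Unix seconds (assumed)
	LastSeen         int64   // Unix seconds (assumed)
	Kind             ProvenanceKind
	ReviewState      string // review/promote state; free-form in this sketch
}

// HashEvidence returns a hex SHA-256 digest of the evidence text.
func HashEvidence(text string) string {
	sum := sha256.Sum256([]byte(text))
	return hex.EncodeToString(sum[:])
}

// Validate rejects records that would be impossible to debug later.
func (p Provenance) Validate() error {
	switch p.Kind {
	case KindManual, KindExtracted, KindImported, KindInferred, KindReviewed:
	default:
		return fmt.Errorf("unknown provenance kind %q", p.Kind)
	}
	if p.Source == "" {
		return errors.New("source turn or artifact is required")
	}
	if p.Confidence < 0 || p.Confidence > 1 {
		return errors.New("confidence must be within [0, 1]")
	}
	if p.LastSeen < p.FirstSeen {
		return errors.New("last seen precedes first seen")
	}
	return nil
}

func main() {
	p := Provenance{
		Source:           "turn-17",
		ExtractorVersion: "v1",
		EvidenceHash:     HashEvidence("Alice maintains the billing service"),
		Confidence:       0.8,
		FirstSeen:        100,
		LastSeen:         200,
		Kind:             KindExtracted,
	}
	fmt.Println(p.Validate())
}
```

Storing a hash rather than the evidence text keeps the graph row small while still letting an operator confirm which snippet produced an edge.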
The default product rule is same-chat recall first, opt-in user-scope widening second. Cross-chat synthesis must be visible and auditable because memory bugs become privacy bugs once Telegram, Discord, Slack, email, and API sessions can share a peer identity.
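The rule "same-chat first, opt-in user scope second" can be expressed as a small decision function that always returns a human-readable reason, which is what makes widening auditable. This is a sketch under assumptions: only `scope=user` appears in the source, so the `chat` scope name, the `RecallRequest` and `Decision` types, and the reason strings are hypothetical.

```go
package main

import "fmt"

// Scope selects how wide recall may reach.
type Scope string

const (
	ScopeChat Scope = "chat" // hypothetical name for the same-chat default
	ScopeUser Scope = "user" // matches the scope=user value mentioned in the docs
)

// RecallRequest is a hypothetical, trimmed-down recall input.
type RecallRequest struct {
	UserID string
	ChatID string
	Scope  Scope
}

// Decision records both the outcome and why it was reached.
type Decision struct {
	Allowed   bool
	CrossChat bool
	Reason    string
}

// Decide applies same-chat-by-default and only widens on explicit opt-in.
func Decide(req RecallRequest) Decision {
	switch req.Scope {
	case "", ScopeChat:
		if req.ChatID == "" {
			return Decision{Reason: "denied: same-chat recall needs a chat_id"}
		}
		return Decision{Allowed: true, Reason: "allowed: same-chat default"}
	case ScopeUser:
		if req.UserID == "" {
			return Decision{Reason: "denied: scope=user needs a canonical user_id"}
		}
		return Decision{Allowed: true, CrossChat: true, Reason: "allowed: opt-in user scope"}
	default:
		return Decision{Reason: fmt.Sprintf("denied: unknown scope %q", req.Scope)}
	}
}

func main() {
	requests := []RecallRequest{
		{ChatID: "telegram:42"},
		{ChatID: "telegram:42", Scope: ScopeUser},
		{ChatID: "telegram:42", UserID: "u-1", Scope: ScopeUser},
	}
	for _, req := range requests {
		fmt.Printf("%+v\n", Decide(req))
	}
}
```

Note that an empty scope resolves to the narrow default, so a caller that forgets to set it can never widen recall by accident.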
Retrieval Evaluation And Degraded Health
Hermes and the Go donor references both degrade when optional capability is missing. Gormes should degrade visibly. Phase 3 docs and tests should define an operator-facing health contract for:
- semantic recall disabled because no embedding model is configured;
- extractor queue depth and dead-letter groups;
- stale graph extraction age;
- relationship decay/freshness behavior;
- cross-chat deny paths;
- source-filtered search and lineage gaps.
Before adding more recall layers, add retrieval fixtures:
- seed conversations and entity facts;
- define expected recall snippets or entity IDs;
- run lexical, FTS, graph, semantic, and fused modes;
- include negative tests for cross-chat leakage;
- report why each selected seed entered the prompt fence.
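The last fixture step, reporting why each seed entered the fence, fits naturally with the lexical → FTS5 → semantic fusion order described for 3.D. The sketch below fuses layers in priority order with dedup and a cap, keeping the reason of the first layer that surfaced each entity. The `Seed` type and `fuse` function are hypothetical names, not the `Provider.GetContext` internals.

```go
package main

import "fmt"

// Seed is a hypothetical recall seed tagged with the layer that found it.
type Seed struct {
	EntityID int64
	Reason   string // e.g. "lexical", "fts5", "semantic"
}

// fuse walks layers in priority order, drops entities already selected,
// and stops once maxSeeds entries are chosen. A non-positive cap yields nil.
func fuse(maxSeeds int, layers ...[]Seed) []Seed {
	seen := make(map[int64]bool)
	var out []Seed
	for _, layer := range layers {
		for _, s := range layer {
			if len(out) >= maxSeeds {
				return out
			}
			if seen[s.EntityID] {
				continue // an earlier, higher-priority layer already explained this seed
			}
			seen[s.EntityID] = true
			out = append(out, s)
		}
	}
	return out
}

func main() {
	lexical := []Seed{{EntityID: 1, Reason: "lexical"}}
	fts := []Seed{{EntityID: 1, Reason: "fts5"}, {EntityID: 2, Reason: "fts5"}}
	semantic := []Seed{{EntityID: 3, Reason: "semantic"}, {EntityID: 4, Reason: "semantic"}}

	for _, s := range fuse(3, lexical, fts, semantic) {
		fmt.Printf("entity %d selected via %s\n", s.EntityID, s.Reason)
	}
}
```

A retrieval fixture can then assert on both the selected entity IDs and their reasons, so a regression that silently shifts a seed from lexical to semantic recall shows up as a test failure.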
TDD Priority Queue
The Phase 3 queue is not one flat backlog. The order matters because later memory features need operator visibility and stable identity seams before they can be debugged safely.
- P3 — SillyTavern persona and group-chat mapping fixtures
  Honcho upstream expanded the SillyTavern integration contract. Gormes should not port the Node/browser extension, but it should fixture-lock the shared concepts it depends on: peer modes, auto/per-character/custom session naming, reset/orphan behavior, group-chat one-peer-per-character mapping, lazy-added characters, and context/reasoning/tool-call enrichment modes while keeping the internal package named goncho.
- P2 — 3.E.7 operator-readable cross-chat evidence
  The user_id merge rules, same-chat recall fence, opt-in user-scope/source-filtered recall, interrupted-turn suppression, Honcho-compatible tool schemas, deny paths, and baseline host compatibility fixtures are pinned. The remaining closeout is status/dry-run evidence that explains allowed, denied, and degraded cross-chat recall decisions, including SillyTavern host-mapping gaps.
- P4 — 3.E.8 parent_session_id lineage closeout
  Source-filtered session/message search is now landed; the remaining donor gap with Hermes SessionDB is compression lineage plus lineage-aware search/evidence, which still pairs naturally with later context-compression work and should come after the operator-facing mirrors are stable.
Execution blueprint (2026-04-22)
The delivery sequence is frozen in docs/superpowers/plans/memory-identity/2026-04-22-gormes-phase3-identity-lineage-execution-plan.md, but the remaining cross-chat closeout is now tracked as smaller slices:
3.E.7 SillyTavern persona/group-chat mapping fixtures -> 3.E.7 operator evidence -> 3.E.8 parent_session_id -> 3.E.8 lineage-aware hits/evidence
The original identity-lineage dependency backbone is now mostly shipped through 3.E.7 schema exposure, 3.E.7 deny-path fixtures, and baseline 3.E.7 host compatibility fixtures. The new SillyTavern fixture is the remaining host-specific gate before 3.E.7 can be called fully closed.
That order is intentional even though some enabling code is already landed:
- 3.E.6.1 is now landed via schema v3g, relationship last_seen backfill, writer freshness updates, and recall fallback coverage.
- 3.E.7 interrupted-turn sync suppression, schema exposure, deny paths, and baseline host compatibility are now landed, so new work should not reopen those contracts.
- 3.E.7 SillyTavern persona/group-chat mapping fixtures go next because Honcho’s latest host docs add peer/session/group semantics that can widen recall if mapped incorrectly.
- 3.E.7 operator evidence closes the cross-chat audit surface only after the SillyTavern host-specific mapping gap is pinned or explicitly degraded.
- 3.E.8 parent_session_id adds lineage semantics after the recall fence is proven safe.
- 3.E.8 lineage-aware hits/evidence closes the remaining lineage-aware session search and operator evidence work last.
Current code is ahead of the old narrative in internal/memory/recall.go, internal/memory/session_catalog.go, internal/goncho/service.go, and internal/gonchotools/honcho_tools.go; the ledger stays conservative until the new SillyTavern host mapping, lineage metadata, and operator-auditable surfaces all line up.
Identity + lineage architecture freeze (2026-04-22)
Before this plan, 3.E.7 and 3.E.8 were only coarse placeholders in the ledger and this page: current code had chat_id plus session_id, but no durable user_id or parent_session_id contract, and internal/memory/recall.go still allowed exact-name recall to cross chat boundaries when an entity was named directly. The original design remains documented in docs/superpowers/plans/memory-identity/2026-04-22-gormes-phase3-identity-lineage-plan.md. The user_id, recall-fence, Honcho-compatible scope/source schema, deny-path, and baseline host-integration halves are now landed via internal/persistence/session, internal/memory, internal/goncho, and internal/gonchotools; the remaining implementation target is SillyTavern-specific host mapping plus operator evidence, followed by parent_session_id lineage and adjacent cross-chat consumers. Current code is split across three layers: internal/persistence/session owns canonical bindings, internal/memory owns scoped recall/search, and the internal GONCHO service plus honcho_* tools accept scope=user and sources[].
The frozen contract is:
- Canonical GONCHO identity hierarchy is user_id > chat_id > session_id.
- Recall stays same-chat default, opt-in cross-chat.
- parent_session_id is append-only lineage metadata for compression/fork descendants; roots remain null.
- Source-filtered search runs across sessions for one canonical user_id, not by flattening all chats into one undifferentiated stream.
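The lineage rule in the contract (parent_session_id is append-only metadata, roots remain null) implies that any session can be traced back to its root by following parent pointers. A minimal sketch of that walk, assuming an in-memory map and hypothetical `Session` and `lineage` names rather than the real session store:

```go
package main

import (
	"errors"
	"fmt"
)

// Session is a hypothetical view of one row; ParentID is nil for roots,
// mirroring "roots remain null" in the frozen contract.
type Session struct {
	ID       string
	ParentID *string
}

// lineage returns the chain from id up to its root, child first.
// It fails on unknown sessions and on cycles, which append-only
// lineage should never produce but corrupted data might.
func lineage(sessions map[string]Session, id string) ([]string, error) {
	seen := make(map[string]bool)
	var chain []string
	for {
		if seen[id] {
			return nil, errors.New("cycle in session lineage at " + id)
		}
		s, ok := sessions[id]
		if !ok {
			return nil, errors.New("unknown session " + id)
		}
		seen[id] = true
		chain = append(chain, id)
		if s.ParentID == nil {
			return chain, nil
		}
		id = *s.ParentID
	}
}

func main() {
	root, compressed := "root", "compressed-1"
	sessions := map[string]Session{
		"root":         {ID: "root"},
		"compressed-1": {ID: "compressed-1", ParentID: &root},
		"compressed-2": {ID: "compressed-2", ParentID: &compressed},
	}
	chain, err := lineage(sessions, "compressed-2")
	fmt.Println(chain, err)
}
```

Lineage-aware search can use such a chain to show an operator which original session a context-compressed descendant came from instead of an opaque ID.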
This matters because the current memory substrate is already strong enough to make a bad identity decision expensive: once facts, conclusions, and tool-visible context start spanning multiple chats, any ambiguity around “who” a chat belongs to becomes a correctness and privacy problem rather than a mere schema nuisance.
Go donor pointers
Before writing a new Phase 3 slice, route through the gormes-references
skill (docs/development-skills/gormes-references/SKILL.md) to find the
donor file that already shapes the seam.
| Phase 3 problem | Donor file | Notes |
|---|---|---|
| 3.A SQLite + FTS5 store with DDL/indexes/migrations | engram/internal/persistence/store/store.go | MIT; adapt with attribution |
| 3.B relation/conflict vocabulary (related, conflicts_with, supersedes, compatible, scoped, not_conflict) | engram/internal/persistence/store/relations.go | Provenance-rich edge schema |
| 3.E.2 append-only tool/audit JSONL with redaction | engram/internal/mcp/activity.go | Already mirrors Gormes’ audit shape |
| 3.F Goncho serialized write queue (mutex-serialized, cancel-before-start) | engram/internal/mcp/write_queue.go | Foundation for Goncho conflict-detection writes |
| 3.E recall budget tracker (bounded recall context) | axe/internal/budget/budget.go | Per-turn counter + overflow signal |
| 3.E.3 transcript export — sanitized artifact paths | axe/internal/artifact/tracker.go | Path-traversal guard, append-only registry |
Engram is MIT-licensed and is the densest donor for memory work. Always add
// Adapted from engram/...::Symbol provenance comments and convert types so
no engram_* symbol leaks into Gormes’ public API.
Pre-Phase 4 E2E Gate (Hermes still running)
Before starting Phase 4 implementation work, run and freeze a hybrid end-to-end baseline while Hermes is still the upstream brain (api_server) and Gormes owns runtime/gateway/memory surfaces.
Why this gate exists
- It gives a parity reference before the Brain Transplant introduces new failure modes.
- It separates “bridge baseline regressions” from “native orchestrator regressions.”
- It locks operator-facing contracts (routing, tool loop, memory fence shape, delivery semantics) before replacing the Python core path.
Required E2E scenarios
- Gateway routing path: inbound platform event -> session resolution -> kernel turn -> outbound delivery.
- Tool-call loop path: model requests tool(s) -> tool execution -> tool result continuation -> final assistant output.
- Delegation path: delegate_task child execution, allowlist/blocked-tool enforcement, terminal result envelope.
- Memory path: recall injection fence present (<memory-context>), expected seed/fact shape, no leak across fenced scope.
- Operator visibility path: session index/tool audit/transcript export surfaces produce deterministic artifacts.
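The memory-path scenario can be turned into a golden-style check: find the fence in the assembled prompt, then assert that nothing from another chat appears inside it. The sketch below assumes the fence is closed with a matching `</memory-context>` tag, which the source does not confirm; `extractFence` and `leaks` are hypothetical helper names.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

const (
	fenceOpen  = "<memory-context>"
	fenceClose = "</memory-context>" // ASSUMPTION: closing tag shape is not stated in the docs
)

// extractFence returns the trimmed body of the single memory-context fence.
// Zero fences or more than one fence are both treated as contract violations.
func extractFence(prompt string) (string, error) {
	if strings.Count(prompt, fenceOpen) != 1 || strings.Count(prompt, fenceClose) != 1 {
		return "", errors.New("expected exactly one memory-context fence")
	}
	start := strings.Index(prompt, fenceOpen) + len(fenceOpen)
	end := strings.Index(prompt, fenceClose)
	if end < start {
		return "", errors.New("memory-context fence closes before it opens")
	}
	return strings.TrimSpace(prompt[start:end]), nil
}

// leaks reports which out-of-scope markers appear inside the fence body.
func leaks(body string, foreign []string) []string {
	var found []string
	for _, marker := range foreign {
		if strings.Contains(body, marker) {
			found = append(found, marker)
		}
	}
	return found
}

func main() {
	prompt := "system rules\n<memory-context>\nfact: project Atlas\n</memory-context>\nuser: hi"
	body, err := extractFence(prompt)
	if err != nil {
		panic(err)
	}
	fmt.Printf("fence body: %q, leaks: %v\n", body, leaks(body, []string{"fact: other-chat secret"}))
}
```

Substring matching is deliberately crude here; a real suite would more likely compare the fence against a stored golden file, as the exit criteria below describe.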
Exit criteria (must pass)
- E2E suite is green in CI and locally against Hermes-backed runtime.
- Golden outputs are stored for key contract surfaces (memory fence + export format + delivery envelope).
- Known acceptable divergences are documented explicitly (none implied by omission).
- A “Phase 4 can start” note is added to the sprint/plan artifact with exact command set used.
This gate is a prerequisite for Phase 4 powertrain work, not optional polish.
GONCHO architecture as the internal reference model
Internally, Gormes refers to the local memory-service seam as GONCHO. The exported tool surface keeps Honcho-compatible interfaces (honcho_*) so callers do not lose the upstream mental model while the Go substrate hardens.
GONCHO is useful here not just as a future plugin target, but as a clean reference architecture for what an agent-memory system needs to separate:
- Workspace — namespace and tenancy boundary for all memory objects
- Peer — the durable identity being modeled (human, agent, or other participant)
- Session — the temporal boundary of one conversation/thread/import
- Background derivation — async pipelines that convert raw messages into representations and summaries
- Two read paths — a fast session-context path for prompt assembly and a slower dialectic path for natural-language introspection over a peer’s memory
That decomposition matters because it keeps three concerns distinct that are easy to blur in a simpler local-memory design:
- Who is being modeled (peer)
- Where an interaction happened (session)
- How the system reconstructs useful context later (representation, summary, dialectic)
Gormes Phase 3 is converging toward that same separation in Go. The local SQLite lattice, graph extractor, and semantic recall already cover most of the “representation substrate” layer. The remaining 3.E work closes the gaps around session visibility, cross-chat identity, and decay so that future GONCHO / Honcho parity is a thin integration layer rather than a redesign of the memory core.
GONCHO-to-Gormes mapping
| GONCHO concept | Role in the Honcho-compatible model | Phase 3 implication for Gormes |
|---|---|---|
| Workspace | Top-level namespace containing peers, sessions, and derived memory | Today this is effectively the local Gormes data root plus config scope. Full provider-facing workspace semantics remain a later integration concern, but Phase 3 must keep schemas and mirrors partitionable at that boundary. |
| Peer | Durable participant identity whose representation evolves over time | The closest Phase 3 equivalent is the entity/relationship graph plus USER.md mirror. Phase 3 builds the durable facts; explicit Honcho-style peer objects and peer-management UX are deferred. |
| Session | Bounded conversation/import container with messages and summary slots | 3.E.1 and 3.E.3 exist largely to make this boundary visible: operators need an inspectable session index and export path instead of opaque storage. |
| Shared peer across many sessions | One user can appear in many sessions/channels while preserving a single long-lived representation | 3.E.7 is the direct analogue: introduce a user_id concept above chat_id so Gormes can unify facts across Telegram, Discord, Slack, and future gateways without flattening all sessions into one stream. |
| Background representation + summary pipeline | Messages are written once, then async workers derive peer representations and session summaries | 3.B and 3.D already match the async side of this architecture via extractor and embedder workers. 3.E.4 adds the missing observability so operators can see queue depth and failures. |
| session.context() path | Fast, prompt-facing retrieval of summary + recent messages scoped to a session | Gormes already injects <memory-context> fences from lexical/graph/semantic recall. What is still missing is clearer session-boundary visibility and, if desired later, a fuller summary-oriented context contract. |
| peer.chat() dialectic path | Higher-level natural-language reasoning over a peer’s learned representation | Not a Phase 3 deliverable. Phase 3 supplies the graph, decay model, and cross-chat identity that such a layer would query later, whether via Honcho plugin parity or a Gormes-native dialectic surface. |
| Observation topology | Asymmetric peer observation (observe_me, observe_others), dynamic agent peers, subagent hierarchies | Important for Honcho integration design, but outside the local memory-core mandate of Phase 3. That belongs with gateway/plugin parity once the substrate is stable. |
Learning-loop import map
The learning loop has three different evidence streams, and Phase 3 should keep those streams separate:
- Hermes contract — hermes-agent/README.md lines 15 and 22 define the product promise: curated memory nudges, skill creation and improvement, FTS5 session search, and Honcho-backed user modeling. hermes-agent/run_agent.py imports build_memory_context_block and StreamingContextScrubber, then wires background memory/skill review near the main turn loop. hermes-agent/toolsets.py exposes memory plus skills_list, skill_view, and skill_manage as the restricted procedural/declarative memory tool lane.
- Honcho/GONCHO model — ../honcho/README.md frames memory as workspace, peer, session, representation, summary, search, and dialectic context. The local Go usage surface is already source-backed in internal/gonchotools: honcho_profile, honcho_search, honcho_context, honcho_reasoning, and honcho_conclude are registered tools, while internal/memory/goncho_v1.go and its tests pin markdown/MCP contract versions, scopes, tombstones, checksums, and private/shared recall boundaries.
- OpenClaw donor evidence — OpenClaw active memory is not the parity contract, but it supplies useful operator-facing safeguards: bounded hidden recall, memory_search/memory_get QA scenarios, graceful no-plugin degradation, per-conversation allow/deny filters, timeout partial summaries, lazy QMD startup, and memory-status diagnostics. Those ideas should become Gormes-owned enhancements only when they preserve Hermes-visible behavior.
The immediate planner rule is: if a future slice changes recall behavior,
background review, or honcho_* tool semantics, cite the Hermes/Honcho source
above as the contract and cite OpenClaw only as an optional hardening donor. A
valid implementation slice should name the exact Gormes seam first: extractor,
recall provider, session catalog, Goncho V1 markdown/MCP contract, or
internal/gonchotools tool adapter.
Scope guard
Phase 3 should therefore be read as GONCHO-aligned substrate work, not “port all of Honcho now.”
- If the work is about local persistence, graph formation, semantic recall, session inspectability, cross-chat identity, or decay, it belongs in Phase 3.
- If the work is about Honcho API parity, peer-management commands, plugin wiring, dialectic tools, or remote workspace/session orchestration, it belongs later with the provider/plugin surface.
This boundary is deliberate. Phase 3 makes Gormes memory structurally compatible with the Honcho-style architecture without paying the full provider-integration cost before the local Go memory core is finished.