Implementation Roadmap
Purpose: This is the single planning entry point for building gormes-agent. It tells you where we are, what is blocked, what comes next, and which document to read for the details. If you are a planner, builder, or reviewer, start here before touching progress.json or writing code.

Relationship to other docs:
- architecture_plan/progress.json = the canonical execution queue (machine-readable)
- Completion Plan = the finish line definition
- Lane Roadmap = lane ownership and exit gates
- Must-Have Features = feature catalogue from 12+ upstream projects
- This document = the human-readable plan that ties them all together
Current State at a Glance
As of May 5, 2026
| Phase | Status | Shipped | In Progress | Planned | Open Rows |
|---|---|---|---|---|---|
| Phase 1 — Dashboard | ✅ Complete | 4/4 | 0 | 0 | 0 |
| Phase 2 — Gateway | ✅ Complete | 21/21 | 0 | 0 | 0 |
| Phase 3 — Memory | ✅ Complete | 15/15 | 0 | 0 | 0 |
| Phase 4 — Brain Transplant | 🔨 Mostly Complete | 12/13 | 1 | 0 | 1 |
| Phase 5 — Final Purge | 🔨 Large Backlog | 5/22 | 16 | 1 | 47 |
| Phase 6 — Learning Loop | 🔨 In Progress | 6/12 | 2 | 4 | 14 |
| Phase 7 — Paused Channels | 🔨 Backlog | 2/5 | 3 | 0 | 9 |
Overall: 65/92 subphases shipped · 22 in progress · 5 planned · 71 open rows
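The overall line can be recomputed from the phase table rather than maintained by hand. The sketch below is illustrative only: `PhaseRow`, `Totals`, `Rollup`, and `TableRows` are hypothetical names, not part of the progress.json schema or the gormes codebase. It sums the rows and also checks that each phase's shipped, in-progress, and planned counts add up to its subphase total.

```go
package main

import "fmt"

// PhaseRow mirrors one row of the status table. The type and field names
// are illustrative assumptions, not the progress.json schema.
type PhaseRow struct {
	Name       string
	Shipped    int
	Total      int
	InProgress int
	Planned    int
	OpenRows   int
}

// Totals holds the figures printed on the "Overall" line.
type Totals struct {
	Shipped, Total, InProgress, Planned, OpenRows int
}

// Rollup sums all rows. It returns an error when a row's shipped,
// in-progress, and planned counts do not add up to its subphase total.
func Rollup(rows []PhaseRow) (Totals, error) {
	var sum Totals
	for _, r := range rows {
		if r.Shipped+r.InProgress+r.Planned != r.Total {
			return Totals{}, fmt.Errorf("%s: %d+%d+%d does not equal %d",
				r.Name, r.Shipped, r.InProgress, r.Planned, r.Total)
		}
		sum.Shipped += r.Shipped
		sum.Total += r.Total
		sum.InProgress += r.InProgress
		sum.Planned += r.Planned
		sum.OpenRows += r.OpenRows
	}
	return sum, nil
}

// TableRows returns the figures from the table above, as of May 5, 2026.
func TableRows() []PhaseRow {
	return []PhaseRow{
		{"Phase 1 Dashboard", 4, 4, 0, 0, 0},
		{"Phase 2 Gateway", 21, 21, 0, 0, 0},
		{"Phase 3 Memory", 15, 15, 0, 0, 0},
		{"Phase 4 Brain Transplant", 12, 13, 1, 0, 1},
		{"Phase 5 Final Purge", 5, 22, 16, 1, 47},
		{"Phase 6 Learning Loop", 6, 12, 2, 4, 14},
		{"Phase 7 Paused Channels", 2, 5, 3, 0, 9},
	}
}

func main() {
	sum, err := Rollup(TableRows())
	if err != nil {
		fmt.Println("inconsistent table:", err)
		return
	}
	fmt.Printf("Overall: %d/%d subphases shipped · %d in progress · %d planned · %d open rows\n",
		sum.Shipped, sum.Total, sum.InProgress, sum.Planned, sum.OpenRows)
}
```

Run against the table as written, the roll-up reproduces the overall line: 65/92 shipped, 22 in progress, 5 planned, 71 open rows.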
First closure target: Python-free normal agent turn with local Goncho memory and tested tool-call continuation. This is a dogfood gate, not a reduced finish line.
Fleet Operational Patterns Borrow List
These slices come from the Fleet Operational Patterns analysis of the 6-agent sages-openclaw ecosystem plus the OpenClaw 2026.3.28 platform. They are Gormes-owned operational quality slices: not Hermes parity blockers, but foundational for production fleet operation.
| Slice | Target | Priority | Phase |
|---|---|---|---|
| Blocker policy | Fleet-standard blocker classification (access/infra/dependency/decision/bug/unknown), dual-record, auto-pivot, surfaced in gormes status. | P0 | 5.N |
| Session health monitoring | gormes health with session size tiers (500KB/2MB), heartbeat freshness (45min/90min), Goncho queue depth. | P0 | 5.N |
| Evidence-before-claims quality gate | Doctor output uses computed exact counts (pass/fail/skip), never hardcoded narrative. | P0 | 5.O |
| Git delivery contract enforcement | Builder skill enforces clean tree + pushed commits before declaring row complete. | P1 | 5.N |
| QMD hybrid search | gormes search with BM25 + optional vector search across workspace docs. | P1 | 5.N |
| Session rollover automation | gormes session rollover at configurable threshold (default 1500KB) with handoff summary. | P1 | 5.N |
| Sandbox policy explain | gormes sandbox explain showing effective trust class, allowlist, and scope. | P1 | 5.B |
| ACP bridge | Session-based agent communication protocol for interoperability. | P1 | 5.H |
| First-run setup/readiness | gormes setup plus gormes doctor --offline --target terminal --json: model → provider → auth → gateway → browser → skills → dashboard. | P1 | 5.O |
| Agent hooks registry | gormes hooks with list/enable/disable/check/info at runtime. | P2 | 5.I |
| Plugin marketplace + doctor | ClawHub-compatible marketplace, plugin load reporting, WASM sandbox for third-party. | P2 | 5.I |
| Logs command | gormes logs with follow mode (-f) and level filtering. | P2 | 5.O |
Full implementation plan: Fleet Integration Plan.
PicoClaw Product Hardening Borrow List
These are Gormes-owned follow-up slices from the current PicoClaw comparison. They are not Hermes parity blockers, but they matter for distribution quality and constrained-machine adoption.
| Slice | Target |
|---|---|
| Pre-compiled binaries | Tag-driven GitHub Release workflow emits static Linux/macOS/Windows amd64+arm64 archives with SHA-256 checksums; signing and package-manager manifests remain follow-up release-hardening work. |
| Setup/readiness wizard | Keep public first-run work on gormes setup and machine-readable readiness on gormes doctor --offline --target terminal --json, covering model/provider, auth, gateway, browser/CDP, skills, and dashboard launch. |
| Hardware matrix | Maintain using-gormes/hardware as the tested-device matrix for x86_64, ARM64, Raspberry Pi-class boards, low-memory Linux hosts, and Android/Termux-style environments, with binary size and steady-state RSS recorded per release. |
| Lite build profiles | Keep the default parity build feature-complete, and keep -tags gormes_lite / -tags slim green as documented constrained-target builds that can exclude audio, dashboard extras, and optional channel adapters. |
| Browser launcher | Keep gormes dashboard as the local launcher path and extend it with first-run CDP checks, Chrome install guidance, and an explicit headless/no-open mode for servers. |
| Skill marketplace | Design a ClawHub-like community skill source for Gormes that keeps bundled system skills separate from third-party taps, trust metadata, credential prerequisites, and review state. |
Document Map
The building-gormes/ directory contains 13 entries. Here is how they relate:
implementation-roadmap.md (this file) ──► decision tree + state + horizons
│
├── must-have-features.md ──► feature catalogue from 12+ upstream projects
│   └── cross-project-feature-map.md ──► detailed per-project matrix
│
├── architecture_plan/ ──► phases, completion plan, lanes, progress.json
│   ├── completion-plan.md ──► finish line definition
│   ├── lane-roadmap.md ──► 6 lanes with exit gates
│   ├── progress.json ──► canonical machine-readable queue
│   ├── phase-1-dashboard.md ──► phase intent and boundaries
│   ├── phase-2-gateway.md
│   ├── phase-3-memory.md
│   ├── phase-4-brain-transplant.md
│   ├── phase-5-final-purge.md
│   ├── phase-6-learning-loop.md
│   ├── phase-7-paused-channel-backlog.md
│   ├── hermes-honcho-feature-map.md ──► upstream → Go package mapping
│   ├── hermes-honcho-go-runtime-plan.md ──► reconciled implementation plan
│   ├── upstream-coverage-ledger.md ──► source-class completeness audit
│   ├── swarm-feature-parity-audit.md ──► sub-agent gap register
│   └── ... (20+ more reference docs)
│
├── builder-loop/ ──► execution mechanics
│   ├── agent-queue.md ──► generated: builder-ready rows
│   ├── next-slices.md ──► generated: ranked shortlist
│   ├── blocked-slices.md ──► generated: blocked rows with unblock conditions
│   ├── umbrella-cleanup.md ──► generated: rows needing split
│   ├── builder-loop-handoff.md ──► skill entrypoint + candidate policy
│   └── progress-schema.md ──► row schema reference
│
├── core-systems/ ──► stable runtime model
│   ├── gateway.md ──► platform adapters, command policy, session routing
│   ├── memory.md ──► recall, graph, search, mirrors, Goncho
│   ├── tool-execution.md ──► operation registry, schema, trust classes
│   └── learning-loop.md ──► skill detection, distillation, feedback
│
├── gateway-donor-map/ ──► per-channel adaptation patterns
│   ├── shared-adapter-patterns.md
│   └── 15 channel dossiers (telegram, discord, slack, whatsapp, ...)
│
├── goncho_honcho_memory/ ──► memory subsystem deep-dive
│   ├── 01-prompts.md
│   ├── 02-tool-schemas.md
│   ├── 03-honcho-docs-study.md
│   ├── 04-agent-work-packets.md
│   └── 05-operator-playbook.md
│
├── upstream-lessons.md ──► durable contracts from Hermes
├── what-hermes-gets-wrong.md ──► why Gormes exists
├── fleet-operational-patterns.md ──► cross-fleet ecosystem analysis (sages + OpenClaw)
├── fleet-integration-plan.md ──► mapping fleet patterns to phases and progress.json rows
├── agent-zero-feature-analysis.md ──► agent0ai/agent-zero architecture study
├── openclaw-platform-parity-audit.md ──► OpenClaw 2026.3.28 full feature surface audit
├── contract-readiness.md ──► row-level handoff contract
├── porting-a-subsystem.md ──► contribution path for upstream ports
└── testing.md ──► test strategy and fixture classes

Rule: If a document contradicts progress.json, progress.json wins. If progress.json contradicts this roadmap, this roadmap wins until a planner updates progress.json.
Decision Tree: What Should I Work On?
Q1: Are you a planner or a builder?
Planner → Go to Agent Queue. If it is empty, go to Next Slices. If that is also empty, sharpen a planned row using the gormes-planner skill.
Builder → Pick one row from Agent Queue. Read its contract, write_scope, test_commands, acceptance, and done_signal. Use gormes-builder + gormes-tdd-slice skills. Do not invent work outside the row.
Q2: Is the row blocked?
Check Blocked Slices. If your row is listed, read the unblock condition. If you can satisfy it, do so. If not, pick a different row.
Q3: Is the row an umbrella?
Check Umbrella Cleanup. Umbrella rows are inventory only. Split them into small/medium/large rows before building. Use gormes-planner for splitting.
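Q1 through Q3 amount to a filter over queue rows: a builder takes the first row that is neither blocked nor an umbrella and that carries every field a builder is told to read. The sketch below shows that filter under stated assumptions. The field names follow the row fields named above (contract, write_scope, test_commands, acceptance, done_signal), but their Go types, the `Blocked` and `Umbrella` flags, and the functions `Buildable` and `FirstBuildable` are hypothetical and do not reflect the real progress.json schema.

```go
package main

import "fmt"

// Row is a hypothetical in-memory view of one queue row.
type Row struct {
	ID           string
	Contract     string
	WriteScope   []string
	TestCommands []string
	Acceptance   string
	DoneSignal   string
	Blocked      bool // assumed flag: row is listed in Blocked Slices
	Umbrella     bool // assumed flag: row is listed in Umbrella Cleanup
}

// Buildable reports whether a builder may pick the row. When it may not,
// the second result gives the reason.
func Buildable(r Row) (bool, string) {
	switch {
	case r.Blocked:
		return false, "blocked: satisfy the unblock condition or pick another row"
	case r.Umbrella:
		return false, "umbrella: split into smaller rows first"
	case r.Contract == "":
		return false, "missing contract"
	case len(r.WriteScope) == 0:
		return false, "missing write_scope"
	case len(r.TestCommands) == 0:
		return false, "missing test_commands"
	case r.Acceptance == "":
		return false, "missing acceptance"
	case r.DoneSignal == "":
		return false, "missing done_signal"
	}
	return true, ""
}

// FirstBuildable returns the first row, in queue order, that passes Buildable.
func FirstBuildable(queue []Row) (Row, bool) {
	for _, r := range queue {
		if ok, _ := Buildable(r); ok {
			return r, true
		}
	}
	return Row{}, false
}

func main() {
	queue := []Row{
		{ID: "row-a", Blocked: true},
		{ID: "row-b", Umbrella: true},
		{
			ID:           "row-c",
			Contract:     "example contract",
			WriteScope:   []string{"internal/example"},
			TestCommands: []string{"go test ./... -count=1"},
			Acceptance:   "example acceptance",
			DoneSignal:   "example done signal",
		},
	}
	if r, ok := FirstBuildable(queue); ok {
		fmt.Println("pick", r.ID)
	}
}
```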
Q4: Do you know which lane the row belongs to?
Use the Lane Roadmap lane crosswalk. Each lane has an exit gate. Know the gate before you start.
Q5: Do you know the upstream contract?
Read Upstream Lessons for durable contracts. Read Hermes And Honcho Feature Map for the upstream → Go package mapping. Read Porting a Subsystem for the contribution path.
Q6: Is the Go shape unclear?
Use the gormes-interface-designer skill. Read Go Donor Reference Map for donor file patterns.
Q7: Are you doing memory work?
Read Core Systems: Memory first. Then read Goncho Honcho Memory for the deep-dive.
Q8: Are you doing gateway/channel work?
Read Core Systems: Gateway first. Then read the relevant Gateway Donor Map dossier.
Q9: Are you doing tool/security work?
Read Core Systems: Tool Execution first. Then read Must-Have Features §8.
Execution Horizons
These horizons are derived from the Must-Have Features gap analysis and mapped to progress.json phases/subphases.
Horizon 1: Safety + Provider Completion (Next 30 Days)
Goal: Close the two biggest blockers to a safe, Python-free agent turn.
| Week | Target | progress.json Rows | Lane | Why |
|---|---|---|---|---|
| 1-2 | Complete xAI/Grok, LM Studio, DeepSeek/Kimi providers | 4.A (Bedrock runtime binding, Gemini, OpenRouter, Google Code Assist, Codex) | Lane 1 | Python-free turn requires all major providers |
| 1-2 | Tool descriptor layer (OperationSpec with trust classes) | 5.A (tool descriptor, toolsets) | Lane 3 | Every tool must declare who can call it |
| 2-3 | Prompt builder assembly closeout (skills snapshot, memory guidance) | 4.C (system+memory+tools+history, toolset-aware skills) | Lane 1 | Complete the prompt assembly pipeline |
| 3-4 | Shell blocklist + filesystem scoping | 5.J (dangerous action gating, Tirith/path/URL policy) | Lane 3 | Critical safety gap |
| 3-4 | Permission approval UX (inline y/n/always) | 5.J (approval workflow) | Lane 3 | Critical safety gap |
| 3-4 | Trust-class enforcement in shared tool executor | 5.A (tool descriptor enforcement) | Lane 3 | Critical safety gap |
Exit criterion: gormes doctor reports zero trust-class violations in default registry. All major providers have Go adapters. First Hermes-compatible normal turn runs without Python.
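To make "zero trust-class violations in default registry" concrete, the sketch below shows one way a descriptor could declare its permitted callers and how a registry audit and an executor check might use that declaration. Everything here is an assumption for illustration: the trust class names (`operator`, `gateway`, `sub-agent`), the `OperationSpec` fields, and the functions `Violations` and `Execute` are hypothetical, and a "violation" is taken to mean a tool that declares no caller at all.

```go
package main

import "fmt"

// TrustClass names who is calling a tool. The values below are
// hypothetical; the roadmap does not list the real trust classes.
type TrustClass string

const (
	Operator TrustClass = "operator"
	Gateway  TrustClass = "gateway"
	SubAgent TrustClass = "sub-agent"
)

// OperationSpec is a hypothetical tool descriptor: every tool declares
// which trust classes may call it.
type OperationSpec struct {
	Name    string
	Callers []TrustClass
}

// Allows reports whether the caller's trust class is declared on the spec.
func (s OperationSpec) Allows(c TrustClass) bool {
	for _, allowed := range s.Callers {
		if allowed == c {
			return true
		}
	}
	return false
}

// Violations returns the names of specs that declare no caller.
// Treating an empty declaration as the violation is an assumption.
func Violations(registry []OperationSpec) []string {
	var names []string
	for _, s := range registry {
		if len(s.Callers) == 0 {
			names = append(names, s.Name)
		}
	}
	return names
}

// Execute runs the tool body only when the spec admits the caller.
func Execute(spec OperationSpec, caller TrustClass, run func() string) (string, error) {
	if !spec.Allows(caller) {
		return "", fmt.Errorf("%s: caller %q is not in a permitted trust class", spec.Name, caller)
	}
	return run(), nil
}

func main() {
	registry := []OperationSpec{
		{Name: "read_file", Callers: []TrustClass{Operator, Gateway, SubAgent}},
		{Name: "shell", Callers: []TrustClass{Operator}},
		{Name: "undeclared_tool"},
	}
	fmt.Println("violations:", Violations(registry))

	_, err := Execute(registry[1], SubAgent, func() string { return "ran" })
	fmt.Println("shell from sub-agent:", err)
}
```

Enforcing the check in one shared `Execute` path, rather than inside each tool, is what lets an audit of the registry stand in for an audit of every call site.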
Horizon 2: Production Hardening (Next 90 Days)
Goal: Make Gormes safe and reliable for real-world operation.
| Week | Target | progress.json Rows | Lane | Why |
|---|---|---|---|---|
| 5-6 | Context compression complete | 4.B (long session management, manual feedback, kernel callback binding) | Lane 1 | Long sessions degrade without compression |
| 5-6 | Loop detection (5 types) | 5.J (loop detection) | Lane 3 | Runaway loops are a real production problem |
| 7-8 | Token budget system + auto-concise | 5.N (token accounting) | Lane 5 | Cost control for production deployments |
| 7-8 | Docker sandbox backend | 5.B (Docker backend) | Lane 3 | Sandboxed execution for untrusted code |
| 9-10 | Browser daemon lifecycle + doctor | 5.C (browser daemon, profile, doctor) | Lane 3 | Browser tools need production lifecycle |
| 9-10 | Code execution mode policy | 5.R (execution-mode resolver) | Lane 3 | Safe defaults for code execution |
| 11-12 | CLI closeout (backup, logs, diagnostics) | 5.O (backup, logs, diagnostics CLI) | Lane 5 | Operator needs visibility and recovery |
| 11-12 | Packaging closeout (install.sh, install.ps1, install.cmd) | 5.P (Unix/Windows installers) | Lane 5 | Frictionless installation |
Exit criterion: A new user can curl install.sh | bash, run gormes doctor, see green checks for all configured providers, and start a safe normal turn. Loop detection fires on runaway sessions. Token budget prevents surprise bills.
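The roadmap does not enumerate the five loop types, so the sketch below shows only one plausible detector as an illustration: it fires when the same tool is called with identical arguments a set number of times in a row. `Call`, `RepeatDetector`, and the limit of 3 are hypothetical and are not taken from the planned implementation.

```go
package main

import "fmt"

// Call is a hypothetical record of one tool invocation.
type Call struct {
	Tool string
	Args string // assumed to be a canonical encoding of the arguments
}

// RepeatDetector flags a run of identical consecutive calls.
type RepeatDetector struct {
	limit int
	last  Call
	run   int
}

// NewRepeatDetector returns a detector that fires after limit identical
// calls in a row.
func NewRepeatDetector(limit int) *RepeatDetector {
	return &RepeatDetector{limit: limit}
}

// Observe records a call and reports true once the same call has been
// seen limit times in a row. A different call resets the run.
func (d *RepeatDetector) Observe(c Call) bool {
	if d.run > 0 && c == d.last {
		d.run++
	} else {
		d.last, d.run = c, 1
	}
	return d.run >= d.limit
}

func main() {
	d := NewRepeatDetector(3)
	calls := []Call{
		{"read_file", `{"path":"a.txt"}`},
		{"read_file", `{"path":"a.txt"}`},
		{"read_file", `{"path":"a.txt"}`},
		{"read_file", `{"path":"b.txt"}`},
	}
	for i, c := range calls {
		fmt.Printf("call %d loop=%v\n", i+1, d.Observe(c))
	}
}
```

In this run only the third call reports a loop; the fourth call has different arguments and resets the counter.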
Horizon 3: Memory Differentiation (Next 6 Months)
Goal: Make Goncho meaningfully better than session-based memory.
| Month | Target | progress.json Rows | Lane | Why |
|---|---|---|---|---|
| 3 | Typed memory categories + confidence scoring | 6.C (skill storage), 6.D (retrieval) | Lane 2 | Structured memory is major UX improvement |
| 3 | Zero-LLM knowledge graph wiring | 6.D (source-aware retrieval) | Lane 2 | Reduces LLM calls for entity resolution |
| 4 | Brain-first lookup (5-step before external API) | 6.D (retrieval eval) | Lane 2 | Significantly reduces LLM calls |
| 4 | Retrieval eval harness (precision@k, recall@k) | 6.D (retrieval eval) | Lane 2 | Turns “memory feels better” into testable contract |
| 5 | Metadata-driven skill placement | 6.C (portable SKILL.md format) | Lane 6 | More granular skill activation control |
| 5 | Soul/personality system (soul.md, persona.md, taste.md, heartbeat.md) | 6.A (complexity detector), 6.B (skill extractor) | Lane 6 | Planned for Phase 6 |
| 6 | Channel adapters (Matrix, Mattermost, LINE, IRC) | 7.C, 7.E | Lane 4 | Complete the long-tail channel surface |
Exit criterion: Goncho recall quality is measurably better than Hermes’s default memory. Retrieval eval harness runs on every memory change. Skills activate with context-aware metadata. Personality files are operator-editable markdown.
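For reference, the two metrics named for the retrieval eval harness have standard definitions: precision@k is the share of the top k results that are relevant, and recall@k is the share of all relevant items that appear in the top k. The sketch below implements those definitions. The function names and the string-ID representation are assumptions, not the harness's actual interface.

```go
package main

import "fmt"

// hitsAtK counts relevant IDs among the first k entries of ranked.
// Callers guarantee k > 0; ranked is assumed to contain no duplicates and
// every key in relevant is assumed to map to true.
func hitsAtK(ranked []string, relevant map[string]bool, k int) int {
	if k > len(ranked) {
		k = len(ranked)
	}
	hits := 0
	for _, id := range ranked[:k] {
		if relevant[id] {
			hits++
		}
	}
	return hits
}

// PrecisionAtK is hits in the top k divided by k. When fewer than k
// results were returned, the denominator stays k.
func PrecisionAtK(ranked []string, relevant map[string]bool, k int) float64 {
	if k <= 0 {
		return 0
	}
	return float64(hitsAtK(ranked, relevant, k)) / float64(k)
}

// RecallAtK is hits in the top k divided by the number of relevant items.
func RecallAtK(ranked []string, relevant map[string]bool, k int) float64 {
	if k <= 0 || len(relevant) == 0 {
		return 0
	}
	return float64(hitsAtK(ranked, relevant, k)) / float64(len(relevant))
}

func main() {
	ranked := []string{"a", "b", "c", "d", "e"} // retrieval order, best first
	relevant := map[string]bool{"a": true, "c": true, "f": true, "g": true}

	fmt.Printf("precision@2=%.2f recall@2=%.2f\n",
		PrecisionAtK(ranked, relevant, 2), RecallAtK(ranked, relevant, 2))
}
```

With these inputs one of the top two results is relevant, so precision@2 is 0.50, and one of the four relevant items was retrieved, so recall@2 is 0.25.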
Horizon 4: Capstone Features (Next 12 Months)
Goal: Features that make Gormes uniquely valuable beyond Hermes parity.
| Month | Target | progress.json Rows | Lane | Why |
|---|---|---|---|---|
| 7-8 | Learning loop (skill extraction, feedback, scoring) | 6.A-6.F | Lane 6 | The feature Hermes doesn’t have |
| 8-9 | Web dashboard (TypeScript/React) | 5.Q (API server + TUI gateway) | Lane 5 | Hermes has 191K-line TUI gateway |
| 9-10 | Code Cathedral II (call-graph edges, two-pass retrieval) | 6.D (Code Cathedral II) | Lane 6 | Code-aware agent capabilities |
| 10-11 | Multi-memory backends (Turbopuffer, LanceDB, Redis) | 3.* (future memory work) | Lane 2 | Scale beyond single-node SQLite |
| 11-12 | Three-agent memory loop (Deriver/Dialectic/Dreamer) | 3.* (future memory work) | Lane 2 | Honcho’s unique memory paradigm |
| 11-12 | Mixture of agents (multi-model coordination) | 5.M | Lane 3 | Agent ensemble capabilities |
Exit criterion: Gormes is not just “Hermes in Go” — it is a demonstrably better agent runtime with compounding skills, proven memory quality, and operator-visible intelligence.
Risk Register
Features or dependencies that could derail the plan:
| Risk | Impact | Mitigation | Owner |
|---|---|---|---|
| Security gaps persist (no shell blocklist, no filesystem scoping) | Critical — unsafe to operate | Horizon 1 priority #1 | Lane 3 |
| Provider parity stalls (Bedrock, Gemini, OpenRouter gaps) | High — Python still required | Weekly provider-audit pass | Lane 1 |
| Context compression never completes | High — long sessions degrade | Tight scope: only manual feedback + model-switch recalc | Lane 1 |
| Loop detection missing | High — production runaway loops | Port Mercury’s 200-line TypeScript detector | Lane 3 |
| progress.json drifts from reality | Medium — wrong work gets built | go run ./cmd/progress validate on every PR | Lane 0 |
| Channel expansion outruns core agent | Medium — shallow adapters | Phase 7 rule: build only fixture-ready slices | Lane 4 |
| Learning loop scope creep | Medium — never ships | Hard gate: only after skill storage + resolver evals are reliable | Lane 6 |
| Memory backend abstraction too early | Low — SQLite-first promise broken | Keep Postgres behind interface; default remains SQLite | Lane 2 |
Weekly Cadence (Recommended)
Monday: Review Agent Queue and Blocked Slices. Pick 1-2 rows.
Tuesday-Thursday: Build rows using gormes-builder + gormes-tdd-slice. Run go test ./... -count=1 and go run ./cmd/progress validate before claiming done.
Friday: Review done signals. Update progress.json evidence. If queue is empty, run gormes-planner pass to sharpen planned rows.
End of Month: Run gormes-parity-auditor pass against one Hermes/Honcho subsystem. Update Cross-Project Feature Map if gaps changed.
Success Metrics
| Horizon | Date Target | Metric | Current |
|---|---|---|---|
| H1: Safety + Providers | May 30, 2026 | Provider parity >90%, zero trust-class violations | ~70% parity, 0 security hardening |
| H2: Production Hardening | Jul 30, 2026 | gormes doctor all-green, loop detection shipped, token budget active | doctor partial, no loop detection, no budget |
| H3: Memory Differentiation | Oct 30, 2026 | Retrieval eval harness running, typed memories shipped, 15+ channels | no eval harness, no typed memories, 10 channels |
| H4: Capstone Features | Apr 30, 2027 | Learning loop extracting skills, web dashboard live, multi-model coordination | none started |
| Final: Hermes in Go | Apr 30, 2027 | 80%+ feature parity, all foundational + production gaps closed, differentiators shipping | ~30-40% parity |
Quick Reference: Skill → Document Mapping
| Skill | Primary Document | Secondary Documents |
|---|---|---|
| gormes-skill-manager | Skill Builder Handoff | Contract Readiness |
| gormes-planner | Completion Plan | Lane Roadmap, Must-Have Features |
| gormes-builder | Agent Queue | Next Slices, Contract Readiness |
| gormes-tdd-slice | Testing | Porting a Subsystem |
| gormes-parity-auditor | Cross-Project Feature Map | Hermes And Honcho Feature Map, Upstream Coverage Ledger |
| gormes-interface-designer | Go Donor Reference Map | Core Systems |
| gormes-provider-parity | GO-HERMES-PORTS-FORKS.md | Upstream Lessons |
| gormes-browser-harness | Gateway Donor Map | Core Systems: Gateway |
| gormes-dev-runtime | Start here | Why Gormes |
| gormes-references | Go Donor Reference Map | references/go-agent-os/ |
| gormes-readme | README.md | Why Gormes |
| gormes-landing-web | www.gormes.ai/ | Why Gormes |
Generated: April 30, 2026
Source: progress.json v2.0, must-have-features.md, lane-roadmap.md, completion-plan.md
Update rule: Refresh this document when the major state in progress.json changes or when must-have-features.md is updated.