System Design — agent-mesh + mem7 + agent7
May 2026. Living document.
The stack in one sentence
agent-mesh enforces what agents can do. mem7 remembers what happened. agent7 shows it all and lets humans intervene.
Design principles
- Each project works alone. agent-mesh without mem7 still enforces policies. mem7 without agent-mesh is a standalone memory server. agent7 without the others is a governance dashboard. Integration is opt-in, not required.
- Open-source runtime, product on top. agent-mesh and mem7 are MIT/Apache — free, adoptable, no lock-in. agent7 is the management plane that makes them manageable at scale. That's where the business model sits.
- Decisions are facts. When a human approves an action or a supervisor auto-resolves a request, that decision is stored as a queryable fact in mem7. It doesn't vanish into a log file.
- Write path matters more than read path. The system becomes valuable when events automatically flow between components — not when a human manually checks dashboards.
Components
agent-mesh (runtime / data plane)
Go binary. Sidecar proxy between agents and their tools.
| What it does | How |
|---|---|
| Policy enforcement | YAML rules: allow, deny, human_approval per tool per agent |
| Approval queue | Pending requests, resolve via API or Claude Code prompt |
| Rate limiting + loop detection | Per-agent, per-tool, configurable |
| Tracing | JSONL trace files, OTEL export, session tracking |
| Grants | Temporary sudo-like bypass for specific agents |
| Supervisor protocol | External process polls pending approvals, auto-resolves routine ones |
Transports: MCP stdio (Claude Code, Cursor) · MCP Streamable HTTP at POST /mcp (Anthropic Managed Agents, remote clients) · HTTP REST (POST /tool/{name})
Current state: v0.9.0, stable, docs polished.
mem7 (memory substrate)
Go binary. MCP server for persistent, searchable, governed memory.
| What it does | How |
|---|---|
| Store/recall/search/forget | 7 MCP tools, markdown source of truth + SQLite FTS5 index |
| Hybrid search | BM25 + dense cosine (Ollama/OpenAI) + LLM reranking |
| Structured recall | memory_context returns JSON for SDK consumption |
| Tag-scoped access | Any agent reads/writes its own observations via tags |
| Temporal range queries | since / until filters on RFC3339 timestamps |
| Python SDK | pip install mem7, provider-agnostic, wraps all tools via HTTP |
Current state: v0.4.1, 71% LoCoMo benchmark, SDK + SSE transport + daemon mode shipped.
agent7 (management plane / product)
Next.js 16 + FastAPI + PostgreSQL. Dashboard, governance engine, supervisor UI.
| What it does | How |
|---|---|
| Agent catalog | Registry of declared agents with metadata, scoring, lifecycle |
| Governance engine | Rules, scoring (3-axis), severity escalation, validation verdicts |
| Trace viewer | Ingests agent-mesh JSONL, aggregates stats, detects ghosts/orphans |
| Memory viewer | Reads mem7 via SDK, displays stored facts and decisions |
| Human approval UI | Shows pending approvals from agent-mesh, human clicks approve/reject |
| Dependency graph | Declared (YAML) + inferred (traces), impact analysis |
| Diff engine | Breaking change detection on agent config changes |
Current state: Early stage, dashboard + traces + sessions + memory debug view working. No version tag yet.
The integrated flow
Today: tool call with human approval
Developer → Claude Code → agent-mesh → gmail.send_email
│
policy: human_approval
│
Claude Code shows permission prompt
│
Developer says "yes"
│
agent-mesh resolves → email sent
agent-mesh writes trace → JSONL
│
Decision is gone. Next time, same question.
Target: tool call with governed memory
Developer → Claude Code → agent-mesh → gmail.send_email
│
policy: human_approval
│
┌──────────────────┼──────────────────┐
│ │ │
Supervisor (L1) Human (L2) agent7 UI
in agent-mesh via Claude Code shows pending
│ or agent7 UI │
│ │ │
checks mem7: │ │
"has Marc approved │ │
gmail.send before?" │ │
│ │ │
if yes (3+ times) ───► auto-approve │
if no ────────────► escalate to human ◄───────┘
│
human says "yes"
│
agent-mesh resolves → email sent
agent-mesh writes to mem7:
key: "decision.gmail.send_email.20260507"
value: "approved by Marc, recipient: X, subject: Y"
tags: [decision, approval, gmail]
agent: "agent-mesh"
│
agent7 displays decision in audit trail
What changes
| Step | Before | After |
|---|---|---|
| Approval source | Always human | Supervisor checks mem7 first, escalates if unsure |
| Decision storage | Lost in terminal history | Stored as fact in mem7, queryable |
| Learning | None — same prompt every time | Supervisor learns from past decisions |
| Audit | Grep JSONL logs | agent7 dashboard, filterable, searchable |
| Visibility | Only the developer who approved | Any team member via agent7 |
Data flow between components
agent7 (management plane)
┌────────────────────────────┐
│ dashboard, approval UI, │
│ governance, audit trail │
└──────┬──────────┬──────────┘
│ │
reads via │ │ reads via
HTTP API │ │ Python SDK
│ │
▼ ▼
┌─────────────────┐ ┌─────────────────┐
│ agent-mesh │───►│ mem7 │
│ (runtime) │ │ (memory) │
│ │ │ │
│ • traces JSONL │ │ • facts │
│ • approvals │ │ • decisions │
│ • policies │ │ • observations │
└────────┬────────┘ └─────────────────┘
│ ▲
│ writes decisions │
└─────────────────────┘
(opt-in, if mem7 configured)
Write paths (who writes what where)
| Writer | Target | What | When |
|---|---|---|---|
| Any agent | mem7 | Observations, facts, context | During execution, via MCP store |
| agent-mesh | mem7 | Approval/rejection decisions | On approval resolve (opt-in) |
| agent-mesh | JSONL | Tool call traces | Every tool call (always) |
| Supervisor | mem7 | Auto-resolve rationale | On auto-approve (opt-in) |
| Human via agent7 | agent-mesh API | Approve/reject action | Clicking in UI |
| agent7 | PostgreSQL | Aggregated stats, governance scores | On trace ingestion, on sync |
Read paths (who reads what from where)
| Reader | Source | What | When |
|---|---|---|---|
| Supervisor | mem7 | Past decisions for same pattern | Before deciding to auto-approve |
| Any agent | mem7 | Stored facts, context, history | During execution, via MCP search |
| agent7 | agent-mesh API | Pending approvals | Polling for approval UI |
| agent7 | agent-mesh JSONL | Trace history | On ingestion (CLI or API push) |
| agent7 | mem7 (SDK) | Stored decisions, facts | For memory viewer, audit trail |
| agent7 | PostgreSQL | Scores, rules, lifecycle | For governance dashboard |
Approval architecture (detailed)
Trust hierarchy
Level 0: Policy engine (agent-mesh)
Static rules. Instant. No judgment.
allow reads, deny deletes, require approval for sends.
Level 1: Built-in supervisor (in agent-mesh, Go)
mem7 lookup. ~100ms. Pattern matching only.
Checks past decisions: 3+ approvals, 0 rejections → auto-approve.
Escalates unknowns to Level 1+.
Level 1+: External supervisor (in agent7, Python)
Rule engine + Ollama LLM. ~2s rules, ~20s LLM. Bounded judgment.
Handles novel cases, complex conditions, injection detection.
Escalates unknowns to Level 2.
Level 2: Human
Full judgment. Slow. Expensive attention.
Sees pending in Claude Code prompt OR agent7 UI.
Decision written back to mem7 via agent-mesh.
Two supervisor layers. Level 1 is built into agent-mesh — a
MemoryReaderthat queries mem7 before the approval queue. It's a pre-filter, not a full resolver. Level 1+ is the external Python supervisor in agent7 that implements the supervisor protocol — it polls pending approvals and resolves them with rule evaluation + LLM. Thesupervisor/package inside agent-mesh handles content redaction and injection detection on the protocol's outbound payloads (RedactParams,DetectInjection) — a separate concern from both layers.
Supervisor decision logic (pseudocode)
def evaluate(request, mem7_client):
# Check mem7 for similar past decisions
past = mem7_client.context(
f"{request.tool} {request.agent}",
tags=["decision"],
limit=10
)
approved_count = sum(1 for m in past if "approved" in m.value)
rejected_count = sum(1 for m in past if "rejected" in m.value)
# Pattern: consistently approved → auto-approve
if approved_count >= 3 and rejected_count == 0:
return AutoApprove(reason=f"approved {approved_count} times before")
# Pattern: recently rejected → auto-deny
recent_reject = [m for m in past if "rejected" in m.value and is_recent(m, days=7)]
if recent_reject:
return AutoDeny(reason="rejected recently, escalate")
# Unknown pattern → escalate to human
return Escalate(reason="no clear precedent")
Where the approval UI lives
Claude Code terminal — the developer gets the prompt inline. This is the current flow, works for solo use.
agent7 web UI — for team use, overnight runs, or when multiple agents generate approvals faster than one human can handle. agent7 polls agent-mesh's pending queue, displays context, human clicks. agent7 POSTs back to agent-mesh /approval/resolve.
Both are thin clients of the agent-mesh approval API. If agent7 goes down, Claude Code still works. If both go down, requests queue in agent-mesh until someone resolves them (fail-safe, not fail-open).
Deployment
Solo developer (current setup)
laptop
├── agent-mesh (Go binary, sidecar)
│ ├── config.yaml (policies)
│ └── traces/ (JSONL)
├── mem7 (Go binary, MCP stdio via agent-mesh)
│ └── ~/.mem7/ (markdown + SQLite)
└── Claude Code / Cursor (agent)
No agent7 needed. agent-mesh + mem7 provide full governance + memory. The developer is the human-in-the-loop via terminal prompts.
Small team
shared server
├── agent-mesh (central, HTTP mode)
├── mem7 serve (HTTP, shared memory)
└── agent7 (dashboard + supervisor)
└── PostgreSQL
developer laptops
└── agents connect to shared agent-mesh
agent7 adds value: team visibility, approval UI for shared agents, governance scoring, audit trail.
Enterprise (future)
agent7 SaaS (hosted)
├── governance engine
├── approval UI
├── audit + compliance
└── policy push → customer agent-mesh
customer infra
├── agent-mesh (sidecar per agent)
├── mem7 (per-team or central)
└── agents (any framework)
Anthropic Managed Agents (cloud harness + governed tools)
Anthropic cloud (harness, sandbox, multi-agent orchestration)
├── Managed Agent coordinator (Opus)
│ ├── sub-agents (reviewer, tester, security)
│ └── mcp_toolset → MCP Streamable HTTP
│
└── POST /mcp ──────────────────────────────►
your infra
├── agent-mesh (policies, approval, traces)
│ └── POST /rpc ──► mem7 (decisions)
└── upstream MCP servers (ollama, arch7, ...)
What Anthropic provides: cloud sandbox, session durability, multi-agent coordination (threads), outcome grading (rubrics), vault credential management.
What agent-mesh adds: fine-grained deny policies (Managed Agents only have allow/ask), per-agent rate limiting, temporal grants, decision persistence to mem7, OTEL traces, governed memory cross-session.
Integration point: agent-mesh POST /mcp endpoint serves MCP Streamable HTTP. Managed Agent's MCP connector discovers tools via tools/list, calls them via tools/call. Policies apply transparently. Vault injects Authorization: Bearer agent:<id> for per-agent policy evaluation.
agent7 role: same as for any deployment — dashboard, approval UI, governance scoring, audit trail. The Managed Agent sessions generate traces and decisions that flow to agent7 like any other agent.
Implementation priorities
Phase 1: Decision write path (agent-mesh → mem7) ✓
Shipped in agent-mesh v0.8.7.
When an approval resolves in agent-mesh, the decision is stored in mem7 if configured.
- Config field:
memory.url+ optionalmemory.token - On resolve: async POST to mem7
/rpcwithmemory_store - Key format:
decision.<tool>.<approval_id> - Tags:
[decision, approved|denied, <tool>, agent:<id>] - Value: human-readable summary — "approved by X — agent:Y tool:Z reason:..."
- Metrics:
agent_mesh_mem7_writes_{attempted,succeeded,failed}_total - Graceful degradation: failing mem7 never blocks approvals
Phase 2: Built-in supervisor reads mem7 ✓
Shipped in agent-mesh v0.9.1.
agent-mesh queries mem7 before submitting to the approval queue. This is the built-in Level 1 supervisor — a pre-filter that handles routine patterns.
MemoryReaderqueries mem7memory_searchwith tool name + agent + tags=["decision"]- Counts past approvals/rejections from search results
- Auto-approve if >=
min_approvals(default 3) with 0 rejections - Escalate if ambiguous, rejected, or mem7 is down
- Auto-approved decisions traced as
supervisor:mem7and written back to mem7 - Config:
supervisor.auto_approve(default true),supervisor.min_approvals(default 3)
Complements the external Python supervisor (in agent7): the built-in handles routine patterns (~100ms); the external supervisor handles novel cases with rule engine + Ollama LLM evaluation (~20s). Both escalate unknowns to humans.
Phase 3: agent7 reads mem7 via SDK
Replace the current ad-hoc memory debug view with proper SDK integration.
agent7 changes:
- pip install mem7 in backend dependencies
- Backend service: Mem7Client wrapping the SDK, configured via env var
- API routes: /api/v1/memory/search, /api/v1/memory/decisions
- Frontend: memory viewer page, decision audit trail with filters
Estimated effort: ~500 lines Python + frontend.
Phase 4: agent7 approval UI as thin client
agent7 displays pending approvals from agent-mesh and lets humans resolve them.
agent7 changes:
- Poll agent-mesh /approval/pending (API already exists)
- Display: tool call details, agent identity, past decisions from mem7 (context)
- Action: approve/reject button → POST agent-mesh /approval/resolve
- agent-mesh handles the mem7 write (phase 1), agent7 doesn't write to mem7 directly
Estimated effort: ~400 lines Python + frontend.
What this enables (end state)
-
Adaptive governance. Policies start strict (human approval for everything). As the system collects approved patterns in mem7, the supervisor auto-approves routine actions. Governance gets less intrusive over time without getting less safe.
-
Cross-agent memory. Agent A stores an observation. Agent B searches and finds it. The supervisor checks if a decision was already made. All through the same mem7 store, scoped by tags.
-
Audit without effort. Every decision is a fact in mem7. Every tool call is a trace in agent-mesh. agent7 joins them: "this agent called this tool, it was approved because of this past decision, here's the full chain." Compliance teams query, they don't grep.
-
Progressive adoption. Day 1: install agent-mesh, get policies and tracing. Day 30: add mem7, get persistent memory. Day 60: add agent7, get visibility and team governance. Each step is independently valuable.
Three projects, one system. Independent by default, powerful together.