docs(plan5): companion chat design spec
Scope B (knowledge assistant + drafting via pending_changes approval chain), lean Anthropic-SDK runtime (supersedes the top-level spec's Mastra wording), extensible shared tool registry (search/read/propose_change/context), per-Space ambient companion, SSE turn lifecycle, inline draft card synced with the Inbox, structural prompt-injection containment.

Ignore .superpowers/ brainstorm dir.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@@ -0,0 +1,178 @@
# Void 2.0 — Plan 5: Companion Chat (Design Spec)

**Date:** 2026-06-01
**Status:** Approved for planning
**Builds on:** Plans 1–4 (foundation, API + shell, capture, workers). Version baseline `2.0.0-alpha.4`.

## 1. Purpose

Turn the right-rail stub (`public/components/rightrail.js`) into a working, always-visible
**companion chat**: an AI assistant scoped to the current Space that can

1. answer questions about the Void's knowledge (search + read), and
2. **propose changes** (create task, edit page, add ref, …) that flow through the
   existing `pending_changes` approval chain — never direct writes.

This is **scope B** of the agent model: knowledge assistant **+** drafting through approval.
It is deliberately a vertical slice that exercises the locked agent model end-to-end
(suggest tier → `pending_changes` → audit) while leaving multi-persona/local-model work
(**scope C**) as a clean follow-on.

### Spec reconciliation

The top-level design spec (`2026-05-31-void-v2-design.md`) describes the agent runtime as
*"Claude subprocess + Ollama via Mastra."* That is **superseded** by this plan: the Plan 5
runtime is a **lean runtime built directly on the Anthropic SDK — no Mastra, no subprocess.**
The per-agent `model` field preserves the option to route specific agents to local Ollama
models later (scope C).

## 2. Scope

### In scope (v1)

- Lean agent runtime on the Anthropic SDK with a tool-use loop and token streaming.
- A shared, extensible tool registry with four tools: `search`, `read`, `propose_change`, `context`.
- Chat API: one endpoint that persists the user message and one SSE endpoint that streams the
  response, both over the existing `conversations`/`messages` repos.
- One **ambient companion conversation per Space**; current view passed as ephemeral context.
- Right-rail chat UI: the approved turn rendering, live tool-activity chips, streamed answer,
  inline draft card (a second view onto a `pending_changes` row), collapse-to-tab + drag-resize.
- Anthropic API key resolved via the existing `vault_path` env/file resolver.
- Tests: tool handlers, runtime loop (mocked Anthropic client), SSE/API integration, security.

### Out of scope (deferred)

- **Scope C — multi-agent personas** and per-persona local (Ollama) models.
- The **MCP server** surface (the tool registry is designed so MCP re-exposes it later).
- First-class, attachable **`Conversation` entities** bound to a Task/Project/Resource.
- **Vaultwarden** secrets swap (tracked separately; env/file stopgap is intentional here).
- Adding tools beyond the four (the registry makes this a drop-in later).

## 3. Architecture

```
Browser (right rail)
   │  POST /api/conversations/:id/messages          (persist user turn)
   │  GET  /api/conversations/:id/stream   (SSE)    (run turn, stream tokens + tool events)
   ▼
void-server (Node / Express)
   ├─ lib/api/routes/chat.js        SSE endpoint + message persistence
   ├─ lib/ai/agent/runtime.js       Anthropic SDK tool-use loop, streaming
   ├─ lib/ai/agent/registry.js      shared {name, schema, handler} tool registry
   ├─ lib/ai/agent/tools/*.js       search · read · propose_change · context
   └─ existing: repos (conversations, messages, pending_changes, agents),
                capability check, audit log, hybrid search, vault_path resolver
   ▼
Anthropic API (default) · Postgres (CT 310) · Ollama (embeddings/search, existing)
```

### 3.1 Agent runtime (`lib/ai/agent/`)

- **Lean loop, Anthropic SDK directly.** Build the message list (system prompt, prior turns, and
  the new user turn), call Claude with the registered tool schemas, execute any returned
  `tool_use` blocks via the registry, append `tool_result`s, and loop until Claude returns a
  final text answer. Stream text deltas to the caller as they arrive.
- **Model resolution.** The model id comes from the agent record's `model` field and defaults to a
  current Claude model when unset. (This is the seam for scope-C local models.)
- **API key.** Resolved through the existing `vault_path` resolver (`env:ANTHROPIC_API_KEY` or
  `file:/path`) — never hard-coded, never placed in the prompt.
- **Cost guardrails.** Per-turn `max_tokens` cap; context assembled from *relevant* material
  (recent turns + tool results) rather than the whole Space; prompt caching applied at
  implementation time (use the `claude-api` skill).
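
The loop described above can be sketched roughly as follows. This is a minimal, non-streaming illustration, not the planned implementation: `runTurn`, `MAX_TOOL_ITERATIONS`, and the `registry.schemas()` / `registry.dispatch()` methods are hypothetical names, and `client` is assumed to be any object exposing the Anthropic SDK's `messages.create()` shape (which also makes it easy to mock). Token streaming and prompt caching are left out for brevity.

```javascript
// Hypothetical loop guard; the spec only says tool-call iterations are bounded.
const MAX_TOOL_ITERATIONS = 8;

// Assumed signature: `client` mimics the Anthropic SDK, `registry` is the
// shared tool registry, `ctx` is passed through to every handler.
async function runTurn({ client, registry, model, system, messages, ctx, maxTokens = 1024 }) {
  const history = [...messages];
  const trace = [];

  for (let i = 0; i < MAX_TOOL_ITERATIONS; i++) {
    const response = await client.messages.create({
      model,
      max_tokens: maxTokens, // cost guardrail: caps output tokens for each model call
      system,
      messages: history,
      tools: registry.schemas(),
    });

    const toolUses = response.content.filter((block) => block.type === 'tool_use');
    if (toolUses.length === 0) {
      // No tool requests left: the text blocks form the final answer.
      const text = response.content
        .filter((block) => block.type === 'text')
        .map((block) => block.text)
        .join('');
      return { text, trace, usage: response.usage };
    }

    // Echo the assistant turn, then answer every tool_use with a tool_result.
    history.push({ role: 'assistant', content: response.content });
    const results = [];
    for (const use of toolUses) {
      let content;
      let isError = false;
      try {
        content = JSON.stringify(await registry.dispatch(use.name, use.input, ctx));
      } catch (err) {
        content = String(err.message);
        isError = true;
      }
      trace.push({ tool: use.name, status: isError ? 'error' : 'ok' });
      results.push({ type: 'tool_result', tool_use_id: use.id, content, is_error: isError });
    }
    history.push({ role: 'user', content: results });
  }

  throw new Error('loop guard: too many tool iterations');
}
```

A failed handler is reported back to the model as an error result rather than aborting the turn, so the model can explain the failure in its final answer.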

### 3.2 Shared tool registry (`lib/ai/agent/registry.js`)

Each tool is `{ name, description, input_schema (JSON Schema), handler(args, ctx) }`. The runtime
consumes the registry now; a future MCP server re-exposes the same definitions verbatim. The
registry is **extensible**: adding a tool means registering a new entry, with no runtime changes.

`ctx` carries the acting agent, the current Space/view, and an actor for audit. Handlers enforce
the capability tier; this is the single place where mutation policy lives.
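
A registry of this shape could look something like the sketch below. Only the entry shape (`name`, `description`, `input_schema`, `handler(args, ctx)`) comes from the spec; `createRegistry`, `register`, `schemas`, and `dispatch` are illustrative names, and the `search` handler is a stub rather than the real hybrid search.

```javascript
// Minimal registry sketch (assumed API, not the actual module).
function createRegistry() {
  const tools = new Map();
  return {
    register(tool) {
      for (const key of ['name', 'description', 'input_schema', 'handler']) {
        if (tool[key] === undefined) throw new Error(`tool is missing "${key}"`);
      }
      if (tools.has(tool.name)) throw new Error(`duplicate tool: ${tool.name}`);
      tools.set(tool.name, tool);
    },
    // Definitions without handlers: the part handed to the model (or, later, MCP).
    schemas() {
      return [...tools.values()].map(({ name, description, input_schema }) => ({
        name,
        description,
        input_schema,
      }));
    },
    async dispatch(name, args, ctx) {
      const tool = tools.get(name);
      if (!tool) throw new Error(`unknown tool: ${name}`);
      return tool.handler(args, ctx);
    },
  };
}

// Usage: adding a tool is one register() call; the runtime loop is untouched.
const registry = createRegistry();
registry.register({
  name: 'search',
  description: 'Space-scoped search (stub handler for illustration).',
  input_schema: {
    type: 'object',
    properties: { query: { type: 'string' } },
    required: ['query'],
  },
  handler: async (args, ctx) => ({ spaceId: ctx.spaceId, query: args.query, hits: [] }),
});
```

Keeping `schemas()` separate from `dispatch()` is what would let a later MCP surface reuse the definitions without touching the handlers.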

**v1 tools**

| Tool | Purpose | Mutates? |
|------|---------|----------|
| `search` | Hybrid FTS+vector search (wraps existing `/api/search` logic), Space-scoped | no |
| `read` | Fetch a page/ref/task/conversation by id for grounding | no |
| `context` | Resolve the current view's entity (type + id → summary) | no |
| `propose_change` | Emit a structured draft → one `pending_changes` row | **only via approval** |

`propose_change` **never applies** a change. It writes a `pending_changes` row (reusing
`applyPendingChange`'s entity/op vocabulary) and returns its id. Application happens only on
explicit user approval, through the existing dispatch + audit path.
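
To make the "draft only" contract concrete, here is a hedged sketch of a `propose_change` tool entry. Everything beyond the tool name is an assumption for illustration: the in-memory `createPendingChangesRepo` stands in for the real `pending_changes` repo, and the column names (`space_id`, `agent_id`, `entity`, `op`, `payload`, `summary`, `status`) and `ctx` fields are guesses, not the actual schema.

```javascript
// Hypothetical in-memory stand-in for the pending_changes repo.
function createPendingChangesRepo() {
  const rows = [];
  return {
    rows,
    async insert(row) {
      const saved = { id: rows.length + 1, status: 'pending', ...row };
      rows.push(saved);
      return saved;
    },
  };
}

// Builds a registry entry whose handler can only write a draft row.
function makeProposeChange(pendingChanges) {
  return {
    name: 'propose_change',
    description: 'Draft a change for the owner to approve; never applies it.',
    input_schema: {
      type: 'object',
      properties: {
        entity: { type: 'string' },
        op: { type: 'string' },
        payload: { type: 'object' },
        summary: { type: 'string' },
      },
      required: ['entity', 'op', 'payload'],
    },
    async handler(args, ctx) {
      // Only a draft row is written. There is deliberately no apply call here.
      const row = await pendingChanges.insert({
        space_id: ctx.spaceId,
        agent_id: ctx.agent.id,
        entity: args.entity,
        op: args.op,
        payload: args.payload,
        summary: args.summary ?? `${args.op} ${args.entity}`,
      });
      return { pending_change_id: row.id, status: row.status };
    },
  };
}
```

Because the handler holds no reference to the apply path, even a fully compromised prompt can at most add rows that wait for approval.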

### 3.3 Chat API (`lib/api/routes/chat.js`)

- `POST /api/conversations/:id/messages` — persist the user message (role=`user`), validate the
  Space scope, return the message id.
- `GET /api/conversations/:id/stream?messageId=…` — **SSE**. Runs the agent runtime for the latest
  user turn and emits events:
  - `tool` — `{ tool, args_summary, status }` for the activity chips
  - `delta` — streamed assistant text
  - `draft` — `{ pending_change_id, summary }` when `propose_change` fires
  - `done` — `{ assistant_message_id, usage }`
  - `error` — `{ message }`
- On completion, persist **one** `assistant` message whose `metadata` holds the tool-call trace,
  any draft ids, the model id, and token usage. No `role=tool` rows.
- Conversation resolution: one ambient conversation per Space (find-or-create by `space_id` +
  default agent). The current view (`{entityType, entityId}`) is passed per request and exposed to
  the agent via the `context` tool / system prompt — it is **not** persisted per message beyond the
  trace.
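
The event list above maps naturally onto named Server-Sent Events. The sketch below shows one way the frames might be serialized; `formatSseEvent` and `createSseEmitter` are hypothetical helpers, `res` is assumed to be a Node/Express-style response, and wrapping `delta` text as `{ text }` is an assumption, since the spec does not fix that payload shape.

```javascript
// One SSE frame: an `event:` line, a `data:` line, and a blank line that
// terminates the frame. JSON.stringify keeps the payload on a single line.
function formatSseEvent(event, data) {
  return `event: ${event}\ndata: ${JSON.stringify(data)}\n\n`;
}

// Hypothetical emitter bound to a response object.
function createSseEmitter(res) {
  res.writeHead(200, {
    'Content-Type': 'text/event-stream',
    'Cache-Control': 'no-cache',
    Connection: 'keep-alive',
  });
  const send = (event, data) => res.write(formatSseEvent(event, data));
  return {
    tool: (tool, args_summary, status) => send('tool', { tool, args_summary, status }),
    delta: (text) => send('delta', { text }),
    draft: (pending_change_id, summary) => send('draft', { pending_change_id, summary }),
    done: (assistant_message_id, usage) => {
      send('done', { assistant_message_id, usage });
      res.end();
    },
    error: (message) => {
      send('error', { message });
      res.end();
    },
  };
}
```

Serializing the payload as JSON matters here: raw assistant text can contain newlines, which would otherwise split a frame.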

### 3.4 Right-rail UI (`public/components/rightrail.js`)

- **Turn rendering (approved):** label-led (`YOU` / agent name) with conversational left/right
  alignment and a thin colored accent edge per side; tool steps render as dim, monospace,
  left-aligned ledger lines (the live activity chips).
- **Streaming:** consume SSE; append `delta` text into the in-progress assistant turn; render
  `tool` events as chips as they arrive.
- **Draft card:** on a `draft` event (or when rendering history with draft ids), show an inline
  approve/reject card bound to the `pending_changes` row. Approving/rejecting hits the existing
  pending-changes endpoint; the same row also appears in the Inbox view, and state stays in sync.
- **Chrome:** collapse to a slim tab; drag-handle resize; per-Space default agent shown in the
  header.
- **Safety:** all DOM via `dom.js` (`el()`/`mount()`/`safeHref()`); assistant markdown rendered
  only through the sanitized `html:` path (marked + DOMPurify). Never `innerHTML` from API data.
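
One way to keep the streaming logic separate from DOM code is to fold SSE events into plain turn state and render from that. The reducer below is only a sketch: `emptyTurn` and `applyStreamEvent` are hypothetical names, the `{ text }` shape for `delta` is assumed, and appending every `tool` event as a new chip is a simplification (a real rail might update an existing chip's status instead).

```javascript
// State for the in-progress assistant turn.
const emptyTurn = () => ({ status: 'streaming', text: '', chips: [], drafts: [] });

// Pure reducer: (turn, event) -> new turn. `event` is { type, data }.
function applyStreamEvent(turn, event) {
  switch (event.type) {
    case 'tool':
      return { ...turn, chips: [...turn.chips, event.data] };
    case 'delta':
      return { ...turn, text: turn.text + event.data.text };
    case 'draft':
      return { ...turn, drafts: [...turn.drafts, event.data] };
    case 'done':
      return { ...turn, status: 'done', messageId: event.data.assistant_message_id };
    case 'error':
      return { ...turn, status: 'error', error: event.data.message };
    default:
      return turn; // ignore unknown event types
  }
}

// In the browser, named SSE events arrive via EventSource listeners, e.g.:
//   source.addEventListener('delta', (e) => {
//     turn = applyStreamEvent(turn, { type: 'delta', data: JSON.parse(e.data) });
//   });
```

Rendering would then read `turn.text`, `turn.chips`, and `turn.drafts` and build nodes through the existing `dom.js` helpers, so no API data reaches `innerHTML`.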

## 4. Security

- **Prompt-injection containment is structural, not heuristic.** The companion reads untrusted
  content (Karakeep imports, fetched URLs, PDFs/OCR). Defense: `propose_change` cannot auto-apply;
  tool handlers enforce the agent's capability tier (default `suggest` → `pending_changes`);
  read/search tools cannot mutate. Worst case from a poisoned document is a draft the owner must
  explicitly approve.
- **Secrets** never enter the system prompt or tool output; the API key lives only in the
  resolver-backed config.
- **Capability enforcement** is centralized in the registry handlers and reuses the existing
  capability check + audit emission (`actor_kind`, `agent_id`, diff).
- **Cost/abuse:** per-turn token cap and bounded tool-call iterations (loop guard) to prevent
  runaway loops.

## 5. Error handling

| Failure | Behavior |
|---------|----------|
| Anthropic API error/timeout | `error` SSE event, clean message in rail, **no partial mutation**; the user turn is already persisted, so the turn is retryable |
| Tool handler error | failed-status `tool` chip; loop continues or ends gracefully with an explanatory answer |
| SSE disconnect mid-turn | user message persisted; assistant turn can be re-requested |
| Ollama down (search/embeddings) | existing FTS-only fallback in hybrid search |
| Capability denied | tool returns a denial result; agent explains it can only suggest |
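
The first row of the table can be read as a wrapper around the turn: persist the assistant message only on success, and otherwise emit a single `error` event. The following is a sketch under assumptions; `streamTurn`, `runTurn`, `emit`, and `saveAssistantMessage` are hypothetical injected functions, and the wording of the user-facing message is made up.

```javascript
// Runs one turn and maps the outcome onto `done` / `error` SSE events.
// Returns true on success so the caller knows whether a retry is needed.
async function streamTurn({ runTurn, emit, saveAssistantMessage }) {
  try {
    const result = await runTurn();
    // Persist exactly one assistant message, and only after the turn succeeded.
    const message = await saveAssistantMessage(result);
    emit('done', { assistant_message_id: message.id, usage: result.usage });
    return true;
  } catch (err) {
    // Nothing was persisted for the assistant; the user turn already exists,
    // so the client can simply request the stream again.
    emit('error', { message: 'The companion could not finish this turn. Please retry.' });
    return false;
  }
}
```

Sending a generic message rather than `err.message` keeps provider details and any internal paths out of the rail.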

## 6. Testing

- **Tool handlers:** unit tests per tool; `propose_change` writes the correct `pending_changes`
  row and **never** applies; capability tier enforced (suggest-tier agent cannot direct-write).
- **Runtime loop:** tests against a **mocked Anthropic client** returning deterministic
  `tool_use` / text fixtures — verify multi-step tool loops, loop guard, final message
  persistence + metadata shape.
- **API/SSE:** integration test that posts a user message and consumes the SSE stream
  (`tool` → `delta` → `draft` → `done`), asserting persisted assistant message + draft row.
- **Security:** injected-content test proving a "delete everything" instruction yields only a
  draft requiring approval.
- **UI smoke:** per the existing matrix — render a turn, stream, approve a draft, confirm Inbox
  sync, collapse/resize.
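
For the runtime-loop tests, a fixture-driven mock could be as small as the sketch below. `makeMockClient`, `toolUseFixture`, and `textFixture` are hypothetical helper names; the response objects only imitate the parts of an Anthropic Messages response that a tool loop would read (`content`, `stop_reason`, `usage`).

```javascript
// Returns queued fixtures in order and records every request it receives.
function makeMockClient(fixtures) {
  const queue = [...fixtures];
  const requests = [];
  return {
    requests,
    messages: {
      async create(request) {
        // Snapshot the request so later mutation of the history is not observed.
        requests.push(JSON.parse(JSON.stringify(request)));
        if (queue.length === 0) throw new Error('mock client: no fixtures left');
        return queue.shift();
      },
    },
  };
}

// Fixture builders for the two response kinds the loop cares about.
const toolUseFixture = (id, name, input) => ({
  content: [{ type: 'tool_use', id, name, input }],
  stop_reason: 'tool_use',
  usage: { input_tokens: 0, output_tokens: 0 },
});

const textFixture = (text) => ({
  content: [{ type: 'text', text }],
  stop_reason: 'end_turn',
  usage: { input_tokens: 0, output_tokens: 0 },
});

// Example: a two-step turn (one search call, then the final answer).
const mockClient = makeMockClient([
  toolUseFixture('tu_1', 'search', { query: 'quarterly plan' }),
  textFixture('Here is what I found.'),
]);
```

Throwing when the queue runs dry makes an unexpected extra model call fail loudly, which is also a convenient way to exercise the loop guard.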

## 7. Future upgrades (explicit hooks)

- **Scope C personas** — distinct Cradle personas with their own `model` (local Ollama via the
  same runtime seam) and capability scopes, switchable per view.
- **More tools** — the extensible registry accepts new `{name, schema, handler}` entries with no
  runtime changes.
- **MCP server** — re-exposes the shared tool registry to external agents (Open WebUI, OpenClaw).
- **Vaultwarden** — swap the env/file `vault_path` for Vaultwarden item ids (pointer change only).
- **First-class `Conversation` entities** — explicit threads attachable to Task/Project/Resource,
  alongside the ambient per-Space companion.