Agent surface

C+ apps can be built to be driven by an agent, not just by a person. An app exposes a controllable surface; an external agent can then describe it (what is here?), act on it (click this, set that text), and observe it (what changed?) — all through a consent gate the app owns. Nothing is reachable that the app did not deliberately expose.

This is the systems counterpart to the rest of C+: where the manual is written so a model can read the code, the agent surface lets a model operate the running app, under explicit authorization.

For a concrete proof recipe, see AppKit agent surface: the checked docs/examples/recipes/appkit_agent recipe from the C+ source tree. It exposes a native AppKit window through agent_appkit and agent_mcp, with stable agent ids, curated describe_ui output, authorized actions, stale text rejection, and consent refusal.

The three pieces

Package	Role
agent_core	The framework-agnostic authorization brain. Headless and fully tested.
agent_appkit	The macOS GUI backend: turns a live AppKit window into a controllable surface.
agent_mcp	The bridge: exposes the surface to an external agent over JSON-RPC 2.0 / MCP.

agent_core holds the rules and never touches a UI framework; agent_appkit binds those rules to a real NSView tree; agent_mcp carries them over the wire. Swapping in another UI backend later means reimplementing only the middle layer.
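The layering can be illustrated with a small Python sketch (Python, not C+). Every name in it (Backend, Core, RecordingBackend) is hypothetical and is not the agent_core or agent_appkit API; it only shows the rules living in one place while the backend behind them stays swappable.

```python
from typing import Protocol


class Backend(Protocol):
    """What the rules layer needs from any UI backend (hypothetical interface)."""

    def describe(self) -> list[str]: ...
    def perform(self, node: str, action: str) -> None: ...


class Core:
    """Framework-agnostic rules: decides whether an action may run at all."""

    def __init__(self, backend: Backend, allowed: dict[str, set[str]]) -> None:
        self.backend = backend
        self.allowed = allowed

    def act(self, node: str, action: str) -> bool:
        if action not in self.allowed.get(node, set()):
            return False  # refused before the backend is ever touched
        self.backend.perform(node, action)
        return True


class RecordingBackend:
    """Stand-in for a real GUI backend; it only records what it was asked to do."""

    def __init__(self) -> None:
        self.performed: list[tuple[str, str]] = []

    def describe(self) -> list[str]:
        return ["ok_button"]

    def perform(self, node: str, action: str) -> None:
        self.performed.append((node, action))


backend = RecordingBackend()
core = Core(backend, allowed={"ok_button": {"click"}})
core.act("ok_button", "click")     # allowed, forwarded to the backend
core.act("ok_button", "set_text")  # refused by the rules layer
```

Replacing RecordingBackend with another class that satisfies the same interface leaves Core unchanged, which is the property the three-package split is aiming for.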

describe → act → observe

Describe. agent_appkit's open(window) walks the live NSView tree into a Surface, and describe_ui returns a snapshot (Vec[UiNode]) of just the nodes the app exposed. Each exposed node carries a build-time-stable agent id (set_agent_id), so an agent can refer to "the same button" across snapshots.
Act. Authorized click / set_text / scroll_to run through the agent_core authorization brain. Text edits use optimistic-concurrency versioning, so a stale edit is rejected rather than clobbering a newer value.
Observe. App notifications are translated into verbs and delivered as bubbling events; an agent subscribes by {node, verb, role}.
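Stale-edit rejection and filtered, bubbling events can be sketched as follows. This is Python rather than C+, and a model of the idea rather than the real API: the Node, Event and Surface classes, the expected_version parameter, the text_changed verb and the exact bubbling rule (a subscription on a node also sees events from its descendants) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional


class StaleEdit(Exception):
    """The edit was based on a version of the text that is no longer current."""


@dataclass
class Node:
    agent_id: str
    role: str
    text: str = ""
    version: int = 0
    parent: Optional["Node"] = None


@dataclass(frozen=True)
class Event:
    node: str  # agent id of the node where the event originated
    verb: str
    role: str


class Surface:
    def __init__(self) -> None:
        self.nodes: dict[str, Node] = {}
        self.subs: list[tuple[dict, Callable[[Event], None]]] = []

    def add(self, node: Node) -> Node:
        self.nodes[node.agent_id] = node
        return node

    def subscribe(self, callback, node=None, verb=None, role=None) -> None:
        # None means "match anything" for that field.
        self.subs.append(({"node": node, "verb": verb, "role": role}, callback))

    def set_text(self, agent_id: str, text: str, expected_version: int) -> int:
        node = self.nodes[agent_id]
        if expected_version != node.version:
            raise StaleEdit(
                f"{agent_id}: expected v{expected_version}, current v{node.version}"
            )
        node.text = text
        node.version += 1
        self._emit(node, "text_changed")
        return node.version

    def _emit(self, origin: Node, verb: str) -> None:
        # Bubble: the origin and all of its ancestors are on the event's path.
        path = []
        current: Optional[Node] = origin
        while current is not None:
            path.append(current.agent_id)
            current = current.parent
        event = Event(origin.agent_id, verb, origin.role)
        for flt, callback in self.subs:
            if flt["node"] is not None and flt["node"] not in path:
                continue
            if flt["verb"] is not None and flt["verb"] != verb:
                continue
            if flt["role"] is not None and flt["role"] != origin.role:
                continue
            callback(event)


surface = Surface()
form = surface.add(Node("form", "group"))
field = surface.add(Node("name_field", "text_field", parent=form))
seen: list[Event] = []
surface.subscribe(seen.append, node="form", verb="text_changed")

version = surface.set_text("name_field", "Ada", expected_version=0)  # accepted
try:
    surface.set_text("name_field", "Bob", expected_version=0)  # stale: now at v1
except StaleEdit:
    pass  # the newer value "Ada" is left untouched
```

The second edit fails because it still names version 0 after the first edit moved the field to version 1. An agent in that position would re-describe the surface, pick up the current version, and retry.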

Consent, not capability

Authorization is all-or-none and app-controlled. An AuthGate consent check guards every request, and an affordance ceiling bounds what an exposed node will ever permit, so exposure can never escalate past what the app intended. agent_mcp speaks JSON-RPC 2.0 (describe_ui / actions / events) over Unix-domain sockets (serve_uds / serve_fd), with that gate in front of every call.
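A rough Python model of that ordering (gate first, then ceiling, then the action) is shown below. It is a sketch under assumptions, not the agent_mcp implementation: the handle function, the shape of AuthGate, the use of click as a method name, the params layout and both error codes are hypothetical. Only the JSON-RPC 2.0 envelope (jsonrpc, id, method, params, result, error) follows the published specification.

```python
import json

# Hypothetical error codes, chosen from JSON-RPC 2.0's implementation-defined
# server-error range (-32000 to -32099).
CONSENT_REFUSED = -32000
NOT_PERMITTED = -32001


class AuthGate:
    """All-or-none consent switch owned by the app (hypothetical shape)."""

    def __init__(self) -> None:
        self.granted = False


def handle(raw: str, gate: AuthGate, ceilings: dict[str, set[str]]) -> dict:
    request = json.loads(raw)
    request_id = request.get("id")

    def error(code: int, message: str) -> dict:
        return {"jsonrpc": "2.0", "id": request_id,
                "error": {"code": code, "message": message}}

    # The gate is checked first, for every method, including describe_ui.
    if not gate.granted:
        return error(CONSENT_REFUSED, "consent refused")

    method = request["method"]
    if method == "describe_ui":
        return {"jsonrpc": "2.0", "id": request_id, "result": sorted(ceilings)}

    # Any other method is treated as an action on params["node"].
    node = request.get("params", {}).get("node")
    if method not in ceilings.get(node, set()):
        return error(NOT_PERMITTED, f"{method} not permitted on {node}")
    return {"jsonrpc": "2.0", "id": request_id, "result": {"ok": True}}


gate = AuthGate()
ceilings = {"save_button": {"click"}, "name_field": {"set_text"}}
click = json.dumps({"jsonrpc": "2.0", "id": 1, "method": "click",
                    "params": {"node": "save_button"}})

refused = handle(click, gate, ceilings)   # consent not yet granted
gate.granted = True
allowed = handle(click, gate, ceilings)   # consent granted, within the ceiling
```

Because the ceiling is looked up per node, granting consent does not widen what any node permits: a set_text request against save_button is still rejected after the gate opens.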

The appkit_agent recipe in the compiler repo shows the whole flow end to end.