Agent surface
C+ apps can be built to be driven by an agent, not just by a person. An app exposes a controllable surface; an external agent can then describe it (what is here?), act on it (click this, set that text), and observe it (what changed?) — all through a consent gate the app owns. Nothing is reachable that the app did not deliberately expose.
This is the systems counterpart to the rest of C+: where the manual is written so a model can read the code, the agent surface lets a model operate the running app, under explicit authorization.
For a concrete proof recipe, see AppKit agent surface, the checked docs/examples/recipes/appkit_agent recipe in the C+ source tree. It exposes a native AppKit window through agent_appkit and agent_mcp, and demonstrates stable agent ids, curated describe_ui output, authorized actions, stale text rejection, and consent refusal.
The three pieces
| Package | Role |
|---|---|
| agent_core | The framework-agnostic authorization brain. Headless and fully tested. |
| agent_appkit | The macOS GUI backend: turns a live AppKit window into a controllable surface. |
| agent_mcp | The bridge: exposes the surface to an external agent over JSON-RPC 2.0 / MCP. |
agent_core holds the rules and never touches a UI framework; agent_appkit
binds those rules to a real NSView tree; agent_mcp carries them over the wire.
Swapping in another UI backend later means reimplementing only the middle layer.
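The layering can be illustrated with a small sketch. It is written in Python rather than C+, and every name in it (Backend, Core, FakeBackend) is hypothetical, not the agent_core or agent_appkit API. It only shows the shape of the split: the rules live in one place and reach the UI through a narrow interface, so a different backend can be swapped in without touching the rules.

```python
from dataclasses import dataclass
from typing import Protocol


class Backend(Protocol):
    """Hypothetical middle layer: the only part tied to a UI framework."""

    def exposed_ids(self) -> list[str]: ...
    def click(self, agent_id: str) -> None: ...


@dataclass
class Core:
    """Hypothetical stand-in for the framework-agnostic rules layer."""

    backend: Backend
    consent: bool = False  # assumption: a single on/off consent flag

    def click(self, agent_id: str) -> str:
        # The consent check comes first, before anything is looked up.
        if not self.consent:
            return "refused: no consent"
        # Only nodes the backend deliberately exposed are reachable.
        if agent_id not in self.backend.exposed_ids():
            return "refused: not exposed"
        self.backend.click(agent_id)
        return "ok"


class FakeBackend:
    """An in-memory backend standing in for a real view tree."""

    def __init__(self) -> None:
        self.clicked: list[str] = []

    def exposed_ids(self) -> list[str]:
        return ["save_button"]

    def click(self, agent_id: str) -> None:
        self.clicked.append(agent_id)
```

Because Core depends only on the Backend protocol, the same rules can be exercised headlessly against FakeBackend, which mirrors how a framework-agnostic core can be tested without a window.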
describe → act → observe
- Describe. agent_appkit's open(window) walks the live NSView tree into a Surface, and describe_ui returns a snapshot (Vec[UiNode]) of just the nodes the app exposed. Each exposed node carries a build-time-stable agent id (set_agent_id), so an agent can refer to "the same button" across snapshots.
- Act. Authorized click / set_text / scroll_to run through the agent_core authorization brain. Text edits use optimistic-concurrency versioning, so a stale edit is rejected rather than clobbering a newer value.
- Observe. App notifications are translated into verbs and delivered as bubbling events; an agent subscribes by {node, verb, role}.
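Two ideas from the list above, stale-edit rejection and subscription by {node, verb, role}, can be sketched in a few lines. This is an illustration in Python, not the C+ API: TextNode, StaleEdit, Event, and Subscription are hypothetical names, the integer version counter is an assumed representation, and "bubbling" is assumed to mean that a subscription on an ancestor node also sees events from its descendants.

```python
from dataclasses import dataclass
from typing import Optional


class StaleEdit(Exception):
    """Raised when an edit is based on a version the node has moved past."""


@dataclass
class TextNode:
    agent_id: str
    text: str = ""
    version: int = 0  # hypothetical counter, bumped on every accepted edit

    def set_text(self, new_text: str, expected_version: int) -> int:
        # Optimistic concurrency: accept the edit only if the caller
        # saw the current version; otherwise reject instead of clobbering.
        if expected_version != self.version:
            raise StaleEdit(
                f"{self.agent_id}: edit based on v{expected_version}, "
                f"node is at v{self.version}"
            )
        self.text = new_text
        self.version += 1
        return self.version


@dataclass(frozen=True)
class Event:
    path: tuple  # agent ids from the root down to the source node
    verb: str
    role: str    # role of the source node


@dataclass(frozen=True)
class Subscription:
    node: Optional[str] = None  # None means "any"
    verb: Optional[str] = None
    role: Optional[str] = None

    def matches(self, event: Event) -> bool:
        # Assumed bubbling rule: a node filter matches the source node
        # or any of its ancestors on the event's path.
        return (
            (self.node is None or self.node in event.path)
            and (self.verb is None or self.verb == event.verb)
            and (self.role is None or self.role == event.role)
        )
```

In this sketch an agent that read a field at version 0 and then lost a race gets a StaleEdit on its second write and has to describe the surface again before retrying, which is the behavior the versioning is meant to force.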
Consent, not capability
Authorization is all-or-none and app-controlled: an AuthGate consent check
guards every request, and an affordance ceiling bounds what an exposed
node will ever permit, so exposure can never escalate past what the app
intended. agent_mcp speaks JSON-RPC 2.0 (describe_ui / actions / events)
over Unix-domain sockets (serve_uds / serve_fd), with that gate in front of
every call.
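As a rough illustration of a gate sitting in front of every call, here is a minimal JSON-RPC 2.0 handler in Python. It is a sketch under assumptions, not the agent_mcp implementation: the handle function, the wire-level parameter name "node", the shape of the ceiling map, and the two custom error codes are all hypothetical. Only the envelope fields (jsonrpc, id, method, params, result, error) and the -32601 "Method not found" code come from the JSON-RPC 2.0 specification. Transport over a Unix-domain socket is left out.

```python
import json

CONSENT_REFUSED = -32001   # hypothetical, in JSON-RPC's server-error range
NOT_PERMITTED = -32002     # hypothetical, same range
METHOD_NOT_FOUND = -32601  # defined by JSON-RPC 2.0

ACTIONS = ("click", "set_text", "scroll_to")


def handle(raw: str, consent_granted: bool, ceiling: dict) -> str:
    """Answer one request. `ceiling` maps agent id -> set of allowed actions."""
    req = json.loads(raw)

    def reply(key: str, value) -> str:
        return json.dumps({"jsonrpc": "2.0", "id": req.get("id"), key: value})

    def error(code: int, message: str) -> str:
        return reply("error", {"code": code, "message": message})

    # The consent gate runs before any method is even looked at.
    if not consent_granted:
        return error(CONSENT_REFUSED, "consent refused")

    method = req.get("method")
    params = req.get("params", {})

    if method == "describe_ui":
        # Only exposed nodes appear, each with its permitted affordances.
        nodes = [
            {"id": node, "affordances": sorted(allowed)}
            for node, allowed in sorted(ceiling.items())
        ]
        return reply("result", nodes)

    if method in ACTIONS:
        node = params.get("node")
        # Affordance ceiling: unexposed nodes and unlisted actions are refused.
        if method not in ceiling.get(node, set()):
            return error(NOT_PERMITTED, f"{method} not permitted on {node}")
        return reply("result", {"ok": True})

    return error(METHOD_NOT_FOUND, "Method not found")
```

With a ceiling of {"save_button": {"click"}}, a click on save_button succeeds only when consent is granted, and a set_text on the same node is refused regardless, since the ceiling never listed it.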
The appkit_agent recipe in the compiler repo shows the whole flow end to end.