agent_gtk
The Linux/BSD GUI backend for the agent surface: it binds
the framework-neutral agent_core rules to a live
GTK 4 window. It is the sibling of agent_appkit
(macOS) — agent_core is shared unchanged, and only this thin bridge is
GTK-specific.
- open(window) walks the live GtkWidget tree into a Surface, the controllable model an agent sees. The walk is a DFS over the GTK 4 child chain (gtk_widget_get_first_child / gtk_widget_get_next_sibling), classifying each widget by its GObject type into an agent_core::Role. Surface::describe returns a live snapshot of the exposed nodes (Vec[UiNode]). A node is exposed by tagging its widget with set_agent_id; untagged widgets are still walked for tree completeness but are not actionable.
- Authorized actions: click, set_text, and focus run through the agent_core authorization brain. The real I/O (gtk_widget_activate, gtk_editable_set_text, gtk_widget_grab_focus) only runs on Allowed. Text edits use optimistic-concurrency versioning, so a stale write is rejected.
- Events: Surface::emit translates a fired widget (an app-installed GObject signal handler) into an agent_core verb offered to a Subscriber.
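The walk described above can be pictured with a small Python model. This is only a sketch under stated assumptions, not the agent_gtk implementation: the Widget and Node classes are hypothetical stand-ins for GtkWidget and UiNode, the first_child / next_sibling attributes mimic gtk_widget_get_first_child / gtk_widget_get_next_sibling, and using -1 as the root's parent index is an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Widget:
    """Hypothetical stand-in for a GtkWidget: only the links the walk needs."""
    name: str
    agent_id: Optional[str] = None          # set when the widget has been tagged
    first_child: Optional["Widget"] = None
    next_sibling: Optional["Widget"] = None


@dataclass
class Node:
    """Hypothetical flat record produced by the walk."""
    name: str
    agent_id: Optional[str]
    parent: int                             # index into the flat list; -1 for the root (assumed)


def walk(root: Widget) -> List[Node]:
    """Pre-order DFS over the first-child / next-sibling chain."""
    nodes: List[Node] = []
    stack = [(root, -1)]
    while stack:
        widget, parent = stack.pop()
        index = len(nodes)
        nodes.append(Node(widget.name, widget.agent_id, parent))
        # Collect children in sibling order, then push them reversed so the
        # first child is the next one popped.
        children = []
        child = widget.first_child
        while child is not None:
            children.append(child)
            child = child.next_sibling
        for child in reversed(children):
            stack.append((child, index))
    return nodes


def exposed(nodes: List[Node]) -> List[Node]:
    """Every widget is walked, but only tagged ones are exposed."""
    return [n for n in nodes if n.agent_id is not None]


# window -> box -> (button, label, entry); only button and entry are tagged.
entry = Widget("entry", agent_id="user_field")
label = Widget("label", next_sibling=entry)
button = Widget("button", agent_id="btn_login", next_sibling=label)
box = Widget("box", first_child=button)
window = Widget("window", first_child=box)

all_nodes = walk(window)
visible = exposed(all_nodes)
```

In this model the untagged label is still visited, which keeps the parent indexes consistent, but it never reaches the exposed list.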
Curating the surface
Only widgets tagged with set_agent_id are exposed. The id is held as GObject
data (g_object_set_data does not copy), so pass a stable NUL-terminated string
literal.
import "agent_gtk/agent_gtk" as agent;
// Tag the widgets the agent may see and act on.
agent::set_agent_id(button.raw(), #str_ptr("btn_login\0"));
agent::set_agent_id(entry.raw(), #str_ptr("user_field\0"));
Read, then act
open snapshots the window; describe reads each node's current frame, text,
and enabled state, so it reflects state now — including after a set_text.
Each write resolves the agent-id, asks agent_core first, and returns a
surface::Outcome (Allowed / NotFound / NotExposed / NotActionable /
VersionConflict).
import "agent_gtk/agent_gtk" as agent;
import "agent_core/surface" as surface;
let surf: agent::Surface = agent::open(window.raw());
// READ — the curated UiNode list.
let nodes: agent::Vec[agent::UiNode] = surf.describe();
// WRITE — authorized through agent_core.
let _ = surf.click("btn_login");
// Optimistic concurrency: read a version, then write against it.
let v = surf.text_version("user_field");
let oc = surf.set_text("user_field", "alice", v);
if surface::outcome_eq(oc, surface::Outcome::VersionConflict) {
    // A racing edit landed first: re-read and retry.
}
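To make the versioned write easier to reason about, here is a toy Python model of a gated set_text. It is a sketch, not the agent_core logic: the ModelSurface and Field classes, the integer version counter, the order of the checks, and the set_text_with_retry helper are all assumptions made for illustration. Only the outcome names come from the text above.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Dict, Optional


class Outcome(Enum):
    ALLOWED = auto()
    NOT_FOUND = auto()
    NOT_EXPOSED = auto()
    NOT_ACTIONABLE = auto()
    VERSION_CONFLICT = auto()


@dataclass
class Field:
    """Hypothetical text field state tracked by the model."""
    text: str = ""
    version: int = 0            # assumed: bumped on every accepted write
    exposed: bool = True
    actionable: bool = True


class ModelSurface:
    """Toy model: every check runs before the write touches the field."""

    def __init__(self, fields: Dict[str, Field]) -> None:
        self.fields = fields

    def text_version(self, agent_id: str) -> Optional[int]:
        field = self.fields.get(agent_id)
        return None if field is None else field.version

    def set_text(self, agent_id: str, text: str, expected: int) -> Outcome:
        field = self.fields.get(agent_id)
        if field is None:
            return Outcome.NOT_FOUND
        if not field.exposed:
            return Outcome.NOT_EXPOSED
        if not field.actionable:
            return Outcome.NOT_ACTIONABLE
        if field.version != expected:
            return Outcome.VERSION_CONFLICT     # stale write: nothing is changed
        field.text = text
        field.version += 1
        return Outcome.ALLOWED


def set_text_with_retry(surf: ModelSurface, agent_id: str, text: str,
                        attempts: int = 3) -> Outcome:
    """Re-read the version and retry while the write keeps losing the race."""
    outcome = Outcome.VERSION_CONFLICT
    for _ in range(attempts):
        version = surf.text_version(agent_id)
        if version is None:
            return Outcome.NOT_FOUND
        outcome = surf.set_text(agent_id, text, version)
        if outcome is not Outcome.VERSION_CONFLICT:
            return outcome
    return outcome


surf = ModelSurface({"user_field": Field()})
stale = surf.text_version("user_field")
first = surf.set_text("user_field", "alice", stale)      # accepted, bumps the version
second = surf.set_text("user_field", "bob", stale)       # rejected: version is stale
retried = set_text_with_retry(surf, "user_field", "bob")
missing = surf.set_text("no_such_id", "x", 0)
```

Bounding the retry count is a design choice in this sketch: a caller that keeps losing the race eventually gets VersionConflict back rather than looping forever.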
A UiNode carries { id, role, class_name, frame, is_hidden, text, actionable, parent }. frame is a Rect of f32 fields (GTK lays out in graphene floats,
so the coordinates stay faithful, no lossy cast), and parent indexes back into
the returned list so the flat snapshot reconstructs the tree.
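As an illustration of how a flat snapshot with parent indexes can be turned back into a tree, the Python sketch below rebuilds an indented outline. The snapshot data is invented for the example (only btn_login and user_field appear in this document), and representing a root's parent as None is an assumption; the real UiNode encoding may differ.

```python
from typing import Dict, List, Optional, Tuple

# (id, parent index) pairs in snapshot order; None marks a root (assumed encoding).
snapshot: List[Tuple[str, Optional[int]]] = [
    ("login_form", None),
    ("user_field", 0),
    ("buttons", 0),
    ("btn_login", 2),
    ("btn_cancel", 2),
]


def children_of(nodes: List[Tuple[str, Optional[int]]]) -> Dict[Optional[int], List[int]]:
    """Group node indexes by their parent index, preserving snapshot order."""
    kids: Dict[Optional[int], List[int]] = {}
    for index, (_node_id, parent) in enumerate(nodes):
        kids.setdefault(parent, []).append(index)
    return kids


def render(nodes: List[Tuple[str, Optional[int]]]) -> List[str]:
    """Return one indented line per node, depth-first from each root."""
    kids = children_of(nodes)
    lines: List[str] = []

    def visit(index: int, depth: int) -> None:
        lines.append("  " * depth + nodes[index][0])
        for child in kids.get(index, []):
            visit(child, depth + 1)

    for root in kids.get(None, []):
        visit(root, 0)
    return lines


outline = render(snapshot)
```

One pass builds the child map and one traversal renders it, so the reconstruction stays linear in the number of nodes.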
Platform notes
- GTK 4 only. Roles are decided with g_type_check_instance_is_a, which is ancestry-aware: a GtkToggleButton answers a GtkButton query, and any GtkEditable is an Input. This is the GTK analogue of AppKit's isKindOfClass:.
- Single-threaded by contract. Unlike the AppKit backend, there is no main-thread marshaling helper: GTK is not thread-safe, so an app that drives the surface off the GTK main thread must hop threads itself, the same rule all GTK code lives under.
- Links the GTK 4 stack (gtk-4, gobject-2.0, gio-2.0, glib-2.0). On Debian/Ubuntu install libgtk-4-dev; pango/cairo/graphene resolve transitively. See Targets for cross-build details.
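The ancestry-aware role check from the first note can be pictured with Python's isinstance, which also answers for subclasses and implemented interfaces. This is an analogy only, not the agent_gtk code: the classes below are stand-ins mirroring a small slice of the GTK 4 hierarchy, and the Role variants other than Input are assumed names.

```python
from enum import Enum


class Role(Enum):
    BUTTON = "Button"    # assumed variant name
    INPUT = "Input"
    OTHER = "Other"      # assumed variant name


# Stand-ins mirroring a slice of the GTK 4 type hierarchy.
class Widget:
    pass


class Editable:
    """Interface stand-in, playing the part of GtkEditable."""


class Button(Widget):
    pass


class ToggleButton(Button):
    pass


class Entry(Widget, Editable):
    pass


class Label(Widget):
    pass


def classify(widget: Widget) -> Role:
    """An ancestry-aware check: subclasses and implementers match too."""
    if isinstance(widget, Editable):
        return Role.INPUT
    if isinstance(widget, Button):
        return Role.BUTTON
    return Role.OTHER


roles = [classify(w).value for w in (Button(), ToggleButton(), Entry(), Label())]
```

An exact type comparison would miss ToggleButton here, which is why an ancestry-aware query suits role classification.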
agent_gtk also exposes a backend-neutral mcp_backend() vtable, the seam an
MCP bridge plugs into to serve this surface to an external agent — see
agent_mcp. For the shared rules underneath both
backends, see agent_core.