Typed operations & engines: spine, 6 engines, plans, models, facades (#689) #690
tony wants to merge 108 commits into
Codecov Report
@@ Coverage Diff @@
## master #690 +/- ##
===========================================
+ Coverage 51.89% 74.22% +22.33%
===========================================
Files 25 214 +189
Lines 3623 12563 +8940
Branches 733 1671 +938
===========================================
+ Hits 1880 9325 +7445
- Misses 1439 2586 +1147
- Partials 304 652 +348
why: Record the experimental operations/engines layer for the upcoming release so the unreleased section tracks what landed.
what:
- Add a "What's new" deliverable under the unreleased 0.59.x section for the experimental operations and engines layer (#690)
- Defer the release lead paragraph until the version is cut
Code review
Found 2 issues:
- libtmux/src/libtmux/experimental/ops/plan.py Lines 81 to 97 in e115eaf
- libtmux/src/libtmux/experimental/ops/_ops/save_buffer.py Lines 38 to 41 in e115eaf
Code review
Found 1 issue:
- libtmux/src/libtmux/experimental/ops/plan.py Lines 214 to 219 in 2e0b112
Code review
No issues found. Checked for bugs and CLAUDE.md compliance.
why: Operationalizes the typed-operations/engines architecture
(issues 688, 689) with the pure substrate that was absent from every
prototype branch: an inert, statically-typed operation value that
renders tmux commands, carries its result type, and serializes without
a live tmux server. Engines stay transport-agnostic over it. None of
this touches or changes existing public APIs.
what:
- Add libtmux.experimental.{ops,engines} packages (experimental, not
under the versioning policy)
- ops: frozen Operation[ResultT] with class-level metadata as the
single source of truth; pure render() with declarative version gating
(LooseVersion); build_result() adapting raw output to typed results
- ops: typed Result base + raise_for_status() (CPython/requests
precedent), SplitWindowResult/CapturePaneResult payloads
- ops: closed Target sum (PaneId/WindowId/SessionId/ClientName/NameRef/
IndexRef/Special/SlotRef) with fail-closed validation
- ops: fail-closed OperationRegistry keyed by kind, with OpSpec views
and predicate listing; stdlib dict serialization with round-trips
- ops: four seed operations (split-window, capture-pane, send-keys,
select-layout) registered via @register
- engines: TmuxEngine/AsyncTmuxEngine protocols, CommandRequest/
CommandResult, EngineSpec; run()/arun() execute bridge sharing one
render/build path (sync vs await is the only divergence)
- tests: 111 pure, fixture-parametrizable unit tests + doctests, all
runnable without a tmux server
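The shape described above (an inert, frozen operation that renders argv and adapts raw output into a typed result) can be sketched roughly as follows. This is a minimal illustration, not the code in libtmux.experimental.ops: the field names, the `build_result()` signature, and the `CapturePane` defaults are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import ClassVar, Generic, TypeVar


@dataclass(frozen=True)
class Result:
    """Hypothetical typed result: a tmux failure is data, not an exception."""

    returncode: int
    stdout: tuple[str, ...] = ()
    stderr: tuple[str, ...] = ()

    def raise_for_status(self) -> None:
        # Opt-in error path, in the spirit of requests' raise_for_status().
        if self.returncode != 0:
            raise RuntimeError("\n".join(self.stderr) or "tmux command failed")


@dataclass(frozen=True)
class CapturePaneResult(Result):
    @property
    def lines(self) -> tuple[str, ...]:
        return self.stdout


ResultT = TypeVar("ResultT", bound=Result)


@dataclass(frozen=True)
class Operation(Generic[ResultT]):
    """Inert value: renders argv and builds results, never runs tmux itself."""

    # Class-level metadata acts as the single source of truth (assumed names).
    kind: ClassVar[str] = ""
    result_type: ClassVar[type[Result]] = Result

    def render(self) -> tuple[str, ...]:
        raise NotImplementedError

    def build_result(self, returncode: int, stdout: list[str], stderr: list[str]) -> ResultT:
        return self.result_type(returncode, tuple(stdout), tuple(stderr))  # type: ignore[return-value]


@dataclass(frozen=True)
class CapturePane(Operation[CapturePaneResult]):
    kind: ClassVar[str] = "capture-pane"
    result_type: ClassVar[type[Result]] = CapturePaneResult

    target: str = "%0"

    def render(self) -> tuple[str, ...]:
        return ("capture-pane", "-p", "-t", self.target)


op = CapturePane(target="%1")
argv = op.render()
ok = op.build_result(0, ["$ echo hi", "hi"], [])
failed = op.build_result(1, [], ["can't find pane: %1"])
```

Because neither `render()` nor `build_result()` touches a server, the whole round trip can be unit-tested offline, which is the property the commit relies on.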
why: Proves the operation/result contract is transport-agnostic -- the same typed result whether produced by a real tmux subprocess or an in-memory simulator -- and provides the offline engine that lets ops doctests and tests run without a tmux server (issue 689 phases 2-3).
what:
- engines.subprocess: classic SubprocessEngine mirroring tmux_cmd (has-session stderr fold, backslashreplace, trailing-blank strip; tmux failure returned as data, only missing binary raises), with for_server() deriving -L/-S/-f/-2 flags from a live Server
- engines.concrete: deterministic in-memory engine (fabricated pane/window/session ids, canned capture lines) for tests and docs
- engines.registry: name-keyed engine registry (register/create/available), seeded with subprocess + concrete
- tests/experimental/contract: engine-agnostic operation contract run offline via concrete, plus classic-vs-concrete parity against a real tmux server (same result type + argv, payload may differ)
why: Completes the sync/async-symmetric execution story plus the deferred-execution and documentation mechanisms from issue 689 (phase 5 + docs), still without touching any existing API.
what:
- engines.asyncio: real AsyncSubprocessEngine on create_subprocess_exec (terminates the child on cancellation; not a thread wrapper), mirroring the classic engine's output handling so it returns the same typed result
- ops.plan: LazyPlan records operations without touching tmux and resolves SlotRef forward refs at execute time via a sans-I/O generator; sync execute() and async aexecute() share one resolution core (run vs await arun is the only divergence); whole-plan serialization round-trips
- ops.catalog: registry-driven CatalogEntry list (scope, version gates, effects, safety, result type, summary) -- the single source a docs domain renders, so runtime and docs cannot drift
- tests: lazy resolution sync+async, plan serialization, catalog coverage, async-vs-sync classic parity against a real tmux server
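One way to picture the sans-I/O generator core: the generator yields resolved argv and receives each step's output, and two thin pumps (sync and async) differ only in `run` versus `await arun`. The sketch below is an assumption-laden illustration; this `LazyPlan`, its `add()` method, and the string outputs are simplified stand-ins for the real typed operations and results.

```python
import asyncio
from dataclasses import dataclass


@dataclass(frozen=True)
class SlotRef:
    """Forward reference to the output of an earlier plan step."""

    index: int


class LazyPlan:
    def __init__(self):
        self._steps = []

    def add(self, *argv):
        # Recording only: nothing is dispatched here.
        self._steps.append(argv)
        return SlotRef(len(self._steps) - 1)

    def _drive(self):
        # Sans-I/O core: yield resolved argv, receive that step's output.
        outputs = []
        for argv in self._steps:
            resolved = tuple(
                outputs[arg.index] if isinstance(arg, SlotRef) else arg
                for arg in argv
            )
            output = yield resolved
            outputs.append(output)
        return outputs

    def execute(self, run):
        gen = self._drive()
        try:
            request = next(gen)
            while True:
                request = gen.send(run(request))
        except StopIteration as stop:
            return stop.value

    async def aexecute(self, arun):
        gen = self._drive()
        try:
            request = next(gen)
            while True:
                request = gen.send(await arun(request))  # the only divergence
        except StopIteration as stop:
            return stop.value


def make_fake_run():
    """Hypothetical in-memory engine: fabricates pane ids for split-window."""
    calls = []

    def run(argv):
        calls.append(argv)
        return f"%{len(calls)}" if argv[0] == "split-window" else ""

    return run, calls


plan = LazyPlan()
new_pane = plan.add("split-window", "-t", "%0", "-P", "-F", "#{pane_id}")
plan.add("send-keys", "-t", new_pane, "echo hi", "Enter")

run, sync_calls = make_fake_run()
sync_outputs = plan.execute(run)

inner, async_calls = make_fake_run()


async def arun(argv):
    return inner(argv)


async_outputs = asyncio.run(plan.aexecute(arun))
```

The second step's `-t` argument is a `SlotRef` at record time and only becomes `%1` once the first step has produced its id.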
why: Proves control mode is just another engine returning the same typed result (issue 689 phase 4) -- an operation run over a persistent tmux -C connection is indistinguishable, at the result level, from one run via fork-per-call subprocess.
what:
- engines.control_mode: ControlModeEngine over one persistent tmux -C connection; run_batch pipelines commands and parses each command's %begin/%end/%error block into a CommandResult; selectors-based nonblocking reads with timeout; startup-ACK discard; lifecycle via close()/context manager (lock-guarded teardown)
- engines.control_mode: I/O-free ControlModeParser, unit-testable without tmux, adapted from the chain runner + protocol-engines parser
- register control_mode in the engine registry and export it
- tests: pure parser tests + real-tmux contract (split creates a real pane, batched commands, control-vs-concrete parity)
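To make the I/O-free parsing idea concrete, here is a small sketch of a control-mode block parser. It assumes well-formed `%begin <time> <number> <flags>` guard lines as tmux documents them; the class and attribute names are hypothetical and the real ControlModeParser may be structured differently.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    number: int          # command number from the %begin guard line
    ok: bool             # True when closed by %end, False when closed by %error
    lines: list = field(default_factory=list)


class ControlModeParser:
    """Feed text in arbitrary chunks; completed blocks accumulate in .blocks."""

    def __init__(self):
        self._buffer = ""
        self._open = None
        self.blocks = []
        self.notifications = []

    def feed(self, data):
        self._buffer += data
        while "\n" in self._buffer:
            line, self._buffer = self._buffer.split("\n", 1)
            self._handle(line)

    def _handle(self, line):
        fields = line.split()
        if self._open is None:
            if line.startswith("%begin ") and len(fields) >= 3:
                self._open = Block(number=int(fields[2]), ok=True)
            elif line.startswith("%"):
                # Bare %-line outside a block: an asynchronous notification.
                self.notifications.append(line)
            return
        closes = (
            len(fields) >= 3
            and fields[0] in ("%end", "%error")
            and fields[2] == str(self._open.number)
        )
        if closes:
            self._open.ok = fields[0] == "%end"
            self.blocks.append(self._open)
            self._open = None
        else:
            self._open.lines.append(line)


parser = ControlModeParser()
parser.feed("%begin 1700000000 7 1\nfoo\n%e")            # chunk ends mid-line
parser.feed("nd 1700000000 7 1\n%window-add @3\n")
parser.feed("%begin 1700000001 8 1\nno such pane\n%error 1700000001 8 1\n")
```

Keeping the parser free of sockets and subprocesses is what lets it be unit-tested without tmux, as the commit notes.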
why: Demonstrates the "mode lives in the type" model from issue 689 -- EagerPane.split() returns a live EagerPane while LazyPane.split() returns a deferred LazyPane, each a single statically-known return type, both backed by the same SplitWindow operation. One Pane class with a runtime-bound engine could not type these return values distinctly.
what:
- facade.pane.EagerPane: executes immediately, returns live handles (split -> EagerPane), typed results for capture/send_keys
- facade.pane.LazyPane: records into a LazyPlan, returns deferred handles (split -> LazyPane bound to the new pane's SlotRef), chainable
- seed of the wider Server/Session/Window/Pane/Client x mode matrix
- tests: eager live handles, lazy deferral + forward-ref resolution, and same-operation-backs-both-facades parity
why: Closes the two async gaps from issue 689: control mode and concrete had no async sibling. The async control engine is the one async engine that earns its place -- it adds an event stream subprocess cannot -- and prior libtmux/mux control-mode work (surfaced across agent histories via agentgrep, plus the asyncio-2 branches) shaped its correlation design.
what:
- engines.async_control_mode: AsyncControlModeEngine over a persistent tmux -C (create_subprocess_exec + one reader task). FIFO future correlation with skip-when-empty so unsolicited %begin blocks (hook-triggered commands and the startup ACK) never desync results; the startup ACK is consumed synchronously in start() to close the correlation race our whole-block parser would otherwise have. DEAD state fails pending commands on reader EOF/error. Cancellation via asyncio.wait_for (3.10 floor: no asyncio.timeout/TaskGroup). Bounded subscribe() notification stream with drop-counting. for_server() helper
- engines.control_mode: ControlModeParser now surfaces bare %-notification lines via notifications() (additive; the sync engine ignores them)
- engines.concrete: AsyncConcreteEngine sibling over shared simulation; removes the async test shim
- ControlNotification typed event value
- tests: parser notification/drain; async control vs real tmux (split, pipelined batch, concrete parity, live event stream, lifecycle)
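The FIFO correlation with skip-when-empty can be illustrated with a queue of futures: each sent command enqueues a future, each completed block resolves the oldest one, and a block that arrives while nothing is pending is treated as unsolicited. The `Correlator` class below is a hypothetical reduction of that idea, not the engine's actual code.

```python
import asyncio
from collections import deque


class Correlator:
    def __init__(self):
        self._pending = deque()
        self.unsolicited = 0

    def expect(self):
        """Register one in-flight command; returns the future for its block."""
        future = asyncio.get_running_loop().create_future()
        self._pending.append(future)
        return future

    def on_block(self, block):
        # Skip-when-empty: no command is waiting, so this block (a startup
        # ACK or a hook-triggered command) must not be attributed to anyone.
        if not self._pending:
            self.unsolicited += 1
            return
        future = self._pending.popleft()
        if not future.done():
            future.set_result(block)

    def fail_all(self, exc):
        # DEAD state: reader hit EOF/error, so every pending command fails.
        while self._pending:
            future = self._pending.popleft()
            if not future.done():
                future.set_exception(exc)


async def demo():
    correlator = Correlator()
    correlator.on_block("startup-ack")          # nothing pending: skipped
    first, second = correlator.expect(), correlator.expect()
    correlator.on_block("a")
    correlator.on_block("b")
    third = correlator.expect()
    correlator.fail_all(ConnectionError("reader EOF"))
    try:
        await third
        dead = False
    except ConnectionError:
        dead = True
    return await first, await second, correlator.unsolicited, dead


outcome = asyncio.run(demo())
```

Pure FIFO matching cannot distinguish an unsolicited block that lands while commands are in flight; telling those apart would need correlation on the command number.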
why: Many tmux commands print nothing (rename-window, kill-pane, select-window, ...). tmux returns CMD_RETURN_NORMAL on success or calls cmdq_error on failure, framed in control mode as %end vs %error (see tmux cmd-queue.c) -- they never cmdq_print. They still need a typed result that records success/failure without inventing a payload.
what:
- results.AckResult: a typed acknowledgement (no payload) whose raise_for_status() still surfaces the error path; documents the tmux success/error mapping
- retarget send-keys and select-layout to AckResult (both print nothing)
- add no-output ops: rename-window (mutating), kill-window and kill-pane (destructive) -- exercising AckResult across scopes and safety tiers
- export AckResult and the new ops; refresh the catalog doctest
- tests: render + AckResult success/failure across the no-output ops and destructive safety metadata; update classic/control parity assertions
why: A neo-like read model is useful, but neo.Obj is one flat ~200-field class fused to the query/dispatch pipeline. The experimental namespace lets us try a decoupled, immutable, serializable snapshot layer without any risk to the shipped ORM APIs.
what:
- libtmux.experimental.models: frozen PaneSnapshot / WindowSnapshot / SessionSnapshot / ServerSnapshot, each a typed core plus the full raw tmux-format tail in .fields (nothing tmux reported is lost)
- from_format() builds one node from a format mapping; ServerSnapshot.from_pane_rows() groups a flat "list-panes -a -F" row set into an ordered session/window/pane tree
- to_dict()/from_dict() round-trip the whole tree as plain data, with no live objects
- pure tests (no tmux): value coercion, tree grouping/order, round-trip
why: The list/show read commands overlap neo's reader. Rather than touch the ORM, add a parallel typed read surface in experimental.ops that yields immutable models snapshots. The render version must thread into result parsing first, because the -F template is version-gated and the parser must split against the same fields it was rendered with.
what:
- operation: thread `version` through build_result -> _make_result so payload parsing matches the version-gated render (backward compatible; existing overrides accept and ignore it); execute.run/arun pass it
- ops._read: re-export neo.get_output_format / parse_output and formats.FORMAT_SEPARATOR as the single source of truth (no copies)
- list-panes / list-windows / list-sessions ops (readonly, chainable=False) render the same -F template neo builds and parse rows into models snapshots
- ListPanesResult/.../ store JSON-friendly rows and derive typed views (.panes/.server/.windows/.sessions) via properties, so results serialize and round-trip with no special-casing
- tests: -F parity with neo, snapshot-tree build, serialize round-trip, and live list-panes/sessions/windows against a real tmux server
why: The operation catalog is registry-derived data, so rendering it in docs keeps the operation reference from drifting from the code -- and the docs gate then exercises catalog() on every build.
what:
- docs/_ext/tmuxop.py: an in-repo Sphinx directive `tmuxop-catalog` that walks libtmux.experimental.ops.catalog() and emits a table, with :scope:/:safety:/:primitive-only: filters; warns (not raises) on empty
- conf.py: add docs/_ext to sys.path and 'tmuxop' to extra_extensions
- docs/experimental.md: an experimental ops/engines overview embedding the catalog (full + readonly + destructive views), in the index toctree
why: The sync control engine skipped tmux's startup ACK with a fragile one-shot flags==0 heuristic and had no defense against hook-emitted %begin/%end blocks, so a stray block could desync request->result alignment. The async engine already handles this; backport the approach.
what:
- consume the startup ACK synchronously at connect (_consume_startup), dropping the one-shot _startup_ack_pending heuristic, so the startup block can never be conflated with a command's result block
- drain buffered unsolicited blocks before each batch (_drain_unsolicited), so a hook-triggered command's block left over from a prior call is not mis-attributed to the next command
- drain notifications during reads to keep the parser buffer bounded
- regression test: many sequential commands stay aligned (first result is real; each call drains before reading its own block)
note: a hook firing mid-pipelined-batch still needs per-command number correlation to disambiguate; single-command run() is robust.
why: The chainable-commands prototype folds independent commands into one "tmux a ; b" dispatch. Our typed-op model is a better host for it -- the Operation already carries a `chainable` classvar and the result Status already reserves `skipped` for exactly the chain-drop case. So yes, lazy mode can adopt the prototype's chainability.
what:
- mark output/creation ops non-chainable (capture-pane, split-window; list-* already were) so a fold never drops captured data or an id
- ops._chain: render_chain (join chainable ops with standalone ';', escaping a trailing-';' arg), ensure_chainable (fail closed), and attribute -- splitting one merged ';'-chain result into a typed result per op (success -> all complete; failure -> first failed, rest skipped, matching tmux cmd-queue.c cmdq_remove_group); plus OpChain with >>/then
- Operation.__rshift__/then compose into an OpChain; result_with_status() builds a result with an explicit status (skipped/failed attribution)
- LazyPlan.execute/aexecute gain fold=False (opt-in): maximal runs of chainable, resolved ops dispatch once via engine.run; the sans-I/O _drive yields _Single or _Chain so sync and async share the core; add_chain() records an OpChain
- tests: >> composition, render_chain, fold=one dispatch, fold-off=N dispatches, failure attribution, creators stay unfolded, add_chain
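The two chain helpers named above can be approximated in a few lines. This is a sketch under assumptions: the functions take plain argv tuples rather than typed operations, the status strings are illustrative, and the failure rule simply follows the wording of the commit message (first op failed, the rest skipped).

```python
def render_chain(commands):
    """Join several argv tuples into one argv using a standalone ';'."""
    argv = []
    for position, command in enumerate(commands):
        if position:
            argv.append(";")
        for arg in command:
            # tmux reads a trailing ';' on an argument as a command separator,
            # so a literal one is escaped as '\;'.
            argv.append(arg[:-1] + "\\;" if arg.endswith(";") else arg)
    return tuple(argv)


def attribute(op_count, returncode):
    """Split one merged chain outcome into a status per operation."""
    if op_count == 0:
        return []
    if returncode == 0:
        return ["complete"] * op_count
    return ["failed"] + ["skipped"] * (op_count - 1)


chain = render_chain(
    [
        ("rename-window", "-t", "@1", "build;"),
        ("select-layout", "-t", "@1", "tiled"),
    ]
)
```

The fold saves round-trips at the cost of attribution detail, which is why ops that capture output or a new id stay out of chains.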
why: Extend the mode-in-the-type facades beyond the pane seed so a typed return value distinguishes eager/lazy/async across scopes -- and add the few creation ops the cross-scope navigation needs.
what:
- ops: NewWindow / NewSession (CreateResult, capture the new id), KillSession, RenameSession; generalize binding capture via Result.created_id (base None; SplitWindowResult -> new_pane_id; CreateResult -> new_id) so lazy plans bind window/session creations too
- facade: eager Server -> Session -> Window -> Pane navigation (EagerServer/EagerSession/EagerWindow); LazyWindow (records into a plan); AsyncPane / AsyncWindow (await arun) -- all over the same ops. Control mode stays an engine choice, not a separate facade family
- EagerServer.for_server() binds the classic engine to a live Server
- tests: offline navigation across scopes/modes (concrete engine), and a live eager Server -> Session -> Window -> Pane build against real tmux with cleanup
why: The native binary peer-protocol engine is the strongest proof the
operation/result contract is transport-agnostic -- the same typed
CommandResult whether produced by a subprocess, tmux -C, or by speaking
tmux's imsg protocol directly. Research confirmed it is pure-stdlib and
CI-verifiable; the prototype it is ported from only ever tested against a
fake socketpair server, never real tmux.
what:
- port engines/imsg/{types,v8,base}.py from libtmux-protocol-engines:
ImsgEngine over AF_UNIX + sendmsg/recvmsg + SCM_RIGHTS fd-passing, and
ProtocolV8Codec (=IIII header, IMSG_FD_MARK high bit of len,
peerid=PROTOCOL_VERSION 8, IDENTIFY -> COMMAND -> WRITE_* -> EXIT
handshake); posix_spawn local fallback for attach / start-server /
no-server-running
- adapt to the experimental tuple CommandResult (drop the process field);
add imsg.exc (ImsgError / ImsgProtocolError / UnsupportedProtocolVersion)
and select the v8 codec directly; keep the version-mismatch retry
- register as the opt-in "imsg" engine; import-safe everywhere (AF_UNIX
is only touched at runtime; tests skip without it)
- tests: v8 codec round-trip + MSG_COMMAND framing (no tmux), plus the
live parity test the prototype lacked -- ImsgEngine vs SubprocessEngine
return identical stdout/returncode for read-only commands against a
real tmux server (runs across the CI tmux matrix)
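As a rough illustration of the `=IIII` framing mentioned above, the snippet below packs and unpacks a four-word header with the fd marker carried in the high bit of the length. The field order (type, len, peerid, pid), the assumption that `len` counts the header itself, and the example message type number are assumptions for this sketch and should be checked against the codec before relying on them.

```python
import struct

IMSG_HEADER = struct.Struct("=IIII")   # four native-order uint32 fields
IMSG_FD_MARK = 0x80000000              # high bit of len: a descriptor is attached
PROTOCOL_VERSION = 8                   # carried in the peerid field


def pack_header(msg_type, payload_len, *, pid, has_fd=False):
    length = IMSG_HEADER.size + payload_len   # assumed: len includes the header
    if has_fd:
        length |= IMSG_FD_MARK
    return IMSG_HEADER.pack(msg_type, length, PROTOCOL_VERSION, pid)


def unpack_header(data):
    msg_type, length, peerid, pid = IMSG_HEADER.unpack_from(data)
    return {
        "type": msg_type,
        "len": length & ~IMSG_FD_MARK,
        "has_fd": bool(length & IMSG_FD_MARK),
        "peerid": peerid,
        "pid": pid,
    }


# 200 is only an illustrative message type number here.
header = pack_header(200, 12, pid=4321, has_fd=True)
decoded = unpack_header(header)
```

A codec round-trip test of this kind needs no tmux server, which matches the first half of the test plan in the commit.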
why: Finish the mode-in-the-type matrix so every tmux scope has eager/lazy/async facades, and add the client-scoped ops a Client facade needs. The matrix is now 5 scopes x 3 modes, all over the shared spine.
what:
- ops: detach-client, refresh-client, switch-client (AckResult, client scope; switch-client renders -c/-t rather than the generic target)
- facade: LazyServer/AsyncServer, LazySession/AsyncSession, and the new client scope (EagerClient/LazyClient/AsyncClient); AsyncServer.for_server binds the async engine to a live Server
- tests: a lazy full Server->Session->Window->pane plan, async navigation, and eager/lazy/async client methods
why: The pre-commit gate now runs `uv run ty check`, so ty must be a configured dev tool. Brings the ty setup from the add-ty-type-checker branch and makes the experimental tree ty-clean.
what:
- add `ty` to the dev dependency group (uv.lock updated)
- add [tool.ty] (environment py3.10, src=src/tests) with the documented rule ignores for known ty false positives, ported verbatim
- fixes ty surfaced in experimental: Target is now a real union (ty rejects an implicit two-string type alias); OperationRegistry.list -> select so the `-> list[OpSpec]` return annotation is not shadowed by the method name
why: tmuxp's window_index config key places a window at a chosen
session index; the builder always appended, ignoring it.
what:
- ir: Window.window_index (threaded through analyze/to_dict)
- compiler: a created window (2..N) with window_index targets
new-window at `session:N` by suffixing the session SlotRef (":N"),
so the captured window-id binding is preserved -- zero Core change
- test: window_index renders new-window -t $1:5 and still binds the id
note: window 0 reuses the session's implicit window and keeps the base
index; append-into-existing-session mode (tmuxp load -a) is deferred as
a follow-up -- it restructures the build flow (no new-session, all
windows created) and the fresh-session reuse model is faithful for the
common case.
why: the async-first control-mode server lacked the Declarative tier -- build_workspace was sync-only -- so an agent on an async engine could not build a whole workspace in one call (a documented asymmetry).
what:
- plan_tools: abuild_workspace, the async sibling over analyze(spec).abuild(engine)
- fastmcp_adapter: register an async build_workspace on the async server, backed by abuild_workspace (mirrors execute_plan's conditional-variant type:ignore)
- export abuild_workspace from the mcp package
- test: the async server lists + calls build_workspace offline
why: porting libtmux-mcp's safety surface into the core adapter needs a single source of truth for the safety tiers and the agent-correctable error type, ahead of the middleware and the tag-gate.
what:
- _safety.py: TAG_readonly/mutating/destructive, VALID_SAFETY_LEVELS, _TIER_LEVELS, resolve_safety_level (None->mutating, valid->verbatim, invalid->warn+readonly fail-safe), ExpectedToolError(ToolError) (log_level=WARNING default + suggestion) -- fastmcp+logging deps only, off the framework-agnostic import path
- tests: resolver defaults/fail-safe-with-warning + ExpectedToolError
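The resolver rule spelled out above (None -> mutating, valid -> verbatim, invalid -> warn and fall back to readonly) is small enough to sketch directly. The constant spellings and the `is_allowed()` helper are assumptions for illustration; only the mapping itself comes from the commit message.

```python
import logging

logger = logging.getLogger(__name__)

TAG_READONLY = "readonly"
TAG_MUTATING = "mutating"
TAG_DESTRUCTIVE = "destructive"

# Ordered tiers: a server level admits every tool at or below its own tier.
TIER_LEVELS = {TAG_READONLY: 0, TAG_MUTATING: 1, TAG_DESTRUCTIVE: 2}
VALID_SAFETY_LEVELS = frozenset(TIER_LEVELS)


def resolve_safety_level(value):
    if value is None:
        return TAG_MUTATING                     # default tier
    if value in VALID_SAFETY_LEVELS:
        return value                            # valid: returned verbatim
    logger.warning(
        "invalid safety level, falling back to readonly",
        extra={"safety_level": value},
    )
    return TAG_READONLY                         # fail-safe: least privilege


def is_allowed(tool_tier, level):
    """Fail closed: an untagged or unknown tool tier is always denied."""
    tier = TIER_LEVELS.get(tool_tier)
    if tier is None:
        return False
    return tier <= TIER_LEVELS[level]
```

Falling back to readonly on bad input means a typo in an environment variable can only reduce what the server exposes, never widen it.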
why: fastmcp's stock transform funnels every expected failure through a -32603 "Internal error:" catch-all, and its response limiter drops the tail (terminal scrollback's useful output is at the bottom).
what:
- new mcp/middleware.py with the real fastmcp base-class imports
- ToolErrorResultMiddleware: tool failures -> ToolResult(is_error) with the clean message + typed meta (error_type/expected/suggestion); _log_error demotes ExpectedToolError + schema-validation to WARNING
- TailPreservingResponseLimitingMiddleware: keeps the tail, prefixes a truncation header, re-attaches is_error the base path drops
- the schema-validation + suggestion helpers (no raw input echoed), _RESPONSE_LIMITED_TOOLS (engine-ops scrollback tools)
- dropped libtmux-mcp's global fastmcp-log-filter side effect
- tests: tail-keep, error-result meta/suggestion, schema redaction
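The tail-preserving idea is independent of fastmcp and can be shown as a pure string helper. The function name, the character-based limit, and the header wording below are made up for this sketch; the middleware itself may count bytes or tokens and format its header differently.

```python
def keep_tail(text, max_chars):
    """Truncate from the front so the newest scrollback output survives."""
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    if len(text) <= max_chars:
        return text
    dropped = len(text) - max_chars
    header = f"[... truncated {dropped} leading characters ...]\n"
    return header + text[-max_chars:]


scrollback = "$ make test\n" + "." * 50 + "\nFAILED tests/test_plan.py::test_fold\n"
limited = keep_tail(scrollback, 40)
```

For terminal output the last lines usually carry the error or the prompt, so dropping the head loses less than dropping the tail.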
why: complete the middleware stack -- the runtime safety gate (defense in depth behind the static tag-gate), a structured audit trail, and retries scoped to readonly tools so a transient socket error never double-runs a mutating tool.
what:
- SafetyMiddleware: fail-closed tier gate on list + call (untagged tool denied); raises ExpectedToolError on an over-tier call
- AuditMiddleware: one INFO record per call, restructured to the project logging standard (static message + structured extra: tmux_subcommand/outcome/duration_ms/tmux_args), payload args digested (len+sha256)
- ReadonlyRetryMiddleware: composes fastmcp RetryMiddleware, delegates only for readonly-tagged tools; trigger LibTmuxException
- loggers namespaced libtmux.experimental.mcp.audit/.retry
- single tier source: _TIER_LEVELS/TAG_* imported from _safety
- tests: audit redaction, fail-closed _is_allowed, retry pass-through
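Digesting payload arguments as length plus SHA-256 keeps an audit record correlatable without storing what was typed. A minimal sketch, assuming string payloads and a hypothetical `digest_arg()` helper:

```python
import hashlib


def digest_arg(value):
    """Replace a payload argument with its length and SHA-256 hex digest."""
    encoded = value.encode("utf-8")
    return {"len": len(value), "sha256": hashlib.sha256(encoded).hexdigest()}


record = {
    "tmux_subcommand": "send-keys",
    "outcome": "ok",
    "tmux_args": [digest_arg("export TOKEN=hunter2")],
}
```

Two calls with the same payload produce the same digest, so repeated inputs can still be spotted in the log.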
why: replacing libtmux-mcp needs the safety tier-gate and the middleware stack on the engine-ops servers -- gating destructive tools by LIBTMUX_SAFETY (default mutating) and adding the timing/limit/error/audit/retry/safety chain.
what:
- _apply_safety_gate (Option A, subtractive): disable only the over-tier tiers AFTER register_operations, so the per-op hide is never undone -- destructive op_* stay hidden at every tier (regression-tested)
- _make_middleware builds the outer->inner stack (Safety innermost, fail-closed); passed at FastMCP(middleware=...) construction
- build_server/build_async_server grow safety_level + include_middleware; level resolved in-body (env read deferred -> monkeypatchable)
- main() gains --safety; default_server/main forward it
- tests: static visibility per tier, the per-op re-exposure regression, destructive-call blocked at readonly, plan-tool tier
- existing kill_*/op_kill_* tests opt into safety_level="destructive" (the new default tier hides destructive tools, as intended)
why: libtmux-mcp ships workflow prompts (run-and-wait, diagnose, build-workspace, interrupt) that package operator-discovered best practices; the engine-ops server should offer the same, in its own vocabulary.
what:
- prompts.py: the four recipes rewritten over the engine-ops verbs (send_input/wait_for_output/capture_pane/create_session/split_pane), not libtmux-mcp's run_command/snapshot_pane/send_keys/split_window
- register_prompts(mcp) via Prompt.from_function; pure string builders, identical on the sync and async servers
- both builders gain include_prompts (default True); registered after the caller context
- tests: the four prompts register; rendered bodies name only engine-ops tools (guards prompt tool-name drift)
why: libtmux-mcp exposes the server->session->window->pane tree as MCP
resources (a read interface distinct from the list_* tools); the
engine-ops server should too, built on its own vocabulary.
what:
- resources.py: register_resources(mcp, engine, *, is_async) with six
tmux:// resources (sessions, session detail, session windows, window
detail, pane detail, pane content) over alist_sessions/windows/panes +
acapture_pane; rows filtered by session_name/window_index/pane_id
- single async body set; a sync server's engine is wrapped once
(SyncToAsyncEngine) so there is no sync/async duplication
- drop libtmux-mcp's {?socket_name} query var (one socket per engine)
- both builders gain include_resources (default True)
- tests: offline read returns JSON; live read lists the session + pane
content over a real tmux server
why: fail fast when the engine cannot reach tmux at startup (missing binary, broken connection) instead of surfacing it on the first tool call -- parity with libtmux-mcp's preflight.
what:
- _lifespan.py: make_lifespan(engine) runs list-sessions at startup and raises RuntimeError only on an engine-broken outcome (it raises), never on a tmux-side error (returned as a CommandResult, e.g. no server)
- build_async_server gains lifespan (default True), passed at FastMCP construction; the sync server stays lifespan-less
- tests: broken engine fails the preflight; a tmux-side error is tolerated
note: the paste-buffer GC half of libtmux-mcp's lifespan is deferred -- engine-ops does not namespace MCP-created buffers, so there is no prefix to GC (a follow-up).
why: the declarative workspace tier had no human entry point -- building a workspace meant calling analyze()+build() in Python. Mirror `tmuxp load` so a .tmuxp.yaml launches from the shell.
what:
- workspace/cli.py: `python -m libtmux.experimental.workspace.cli load <file>` resolves a workspace file (path / directory -> .tmuxp.* / bare name under $TMUXP_WORKSPACEDIR), expands ~/$VAR/./ paths relative to the file's dir (the cwd-bound step analyze() deliberately omits), analyzes + builds over a SubprocessEngine, then attaches (switch-client when inside tmux) unless -d; -L/-S socket, -s session-name override
- an already-running session is attached, not rebuilt (FileExistsError -> attach), matching tmuxp's behavior
- tests: file resolution (path/dir/missing), ./-relative path expansion, arg parsing, and a live detached build whose windows/panes match the file
why: real .tmuxp.yaml files use `- blank` / `- pane` / `- ` to mean "an empty pane" (no command) -- the analyzer was sending those as literal commands. And launching a file blind is risky; a dry run lets you see the tmux commands first.
what:
- analyzer: a pane whose sole content is None / "blank" / "pane" / "" (a bare string or a single-element shell_command) is now an empty pane, matching tmuxp's expand_cmd; a blank mixed with real commands is left alone
- cli: `load --dry-run` prints the tmux command lines (resolved against the in-memory ConcreteEngine so ids render) with host steps as comments, executing nothing
- tests: blank/pane/empty shorthands -> empty panes; dry-run prints the commands (blank pane creates a split but sends no keys) and starts no tmux server
why: window 0 reuses the session's implicit window/pane, so its first pane inherited the *session* start_directory (-c on new-session) instead of the window's. A per-project tmuxp config (each window cd'd into its repo) opened window 1's first pane in the session root, not the repo.
what:
- compiler: _creator_start_directory folds the window's (and its first pane's) start_directory into the creator's -c with pane -> window -> session precedence; used for both new-session (window 0) and new-window (windows 2..N). A window without start_directory still falls back to the session's, so existing behavior is unchanged.
- test: window 0's start_directory drives new-session -c; fallback to the session dir; a first pane's own start_directory wins
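The pane -> window -> session precedence amounts to "first value that is set wins". A hypothetical stand-alone version (the real _creator_start_directory takes IR objects rather than three strings):

```python
def creator_start_directory(pane_dir, window_dir, session_dir):
    """Pick the directory for the creator's -c flag, most specific first."""
    for candidate in (pane_dir, window_dir, session_dir):
        if candidate:
            return candidate
    return None


def creator_flags(pane_dir=None, window_dir=None, session_dir=None):
    directory = creator_start_directory(pane_dir, window_dir, session_dir)
    return ("-c", directory) if directory else ()
```

With no window or pane directory declared the result is the session directory, which is why existing configs render exactly as before.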
why: The declarative runner needs to fold tmux dispatches yet still interleave host-side steps (sleeps, pane-ready waits) between them. These additive Core primitives let any driver reuse the plan trampoline for that without putting host I/O in the sans-I/O core.
what:
- Add StepReport + _Host sentinel; _drive yields it after each step binds its results (the sched.delayfunc(0) seam), performing no I/O
- Add an on_step hook to execute/aexecute; extract _adispatch as the async twin of _dispatch so both pumps share one dispatch seam
- Add BoundedPlanner: run an inner planner over the full op list, then split its steps wherever a host-step boundary falls (a marked fold demotes to plain ; chains past the boundary)
- Export BoundedPlanner and StepReport from the ops package
- Test the hook stream, sync/async parity, and bounded splitting
why: A declarative build paid one tmux dispatch per operation because the runner forked its own per-op loop to interleave host steps, bypassing the Core planner. A multi-pane window now renders in a few round-trips instead of dozens, with the same result.
what:
- Drive build_workspace/abuild_workspace through LazyPlan.execute with BoundedPlanner(MarkedPlanner, frozenset(host_after)) and an on_step hook that replays each index's host steps and build events, deleting the hand-rolled per-op loop
- Default the build to folding; add planner= to the runner functions and Workspace.build/abuild so a caller can override (e.g. SequentialPlanner for one legible tmux call per op)
- host_after keys are the fold boundaries, so sleeps, the wait_pane anti-race, and before_script keep a fold from ever crossing a pause; the PlanResult is identical, only the dispatch count drops
- Add folding contract tests (dispatch reduction, planner equivalence, boundary rules, live subprocess) and a CHANGES deliverable
why: The dry run rendered the unfolded sequential plan, but the build
folds by default -- so the preview misrepresented the dispatches that
would actually run (one tmux line per op instead of the ; chains).
what:
- Drive the dry run through the same BoundedPlanner(MarkedPlanner) the
build uses, via a recording engine, so the printed lines are the real
folded dispatches; a standalone ; renders as \; (copy-pasteable) and
the header reports the dispatch count and shape
- Add --no-fold to load (and a fold= param) that controls BOTH the dry
run rendering and the real build planner, keeping them consistent
- Cover the folded/{marked} dry run, --no-fold, and flag parsing
why: The engine-ops spine had 60 operations but none for tmux 3.7's new-pane (floating panes); the workspace builder, facade, and MCP had nothing to lower a floating pane into.
what:
- Add NewPane(Operation[SplitWindowResult]) rendering new-pane with absolute floating geometry (-x/-y size, -X/-Y position; cells or N%), -Z/-d/-E, styles, environment, and -P -F capture
- Reuse SplitWindowResult so SlotRef binding, facade, and MCP keep working unchanged; first op to set min_version='3.7' (whole-command version gate)
- Register + export NewPane; refresh the catalog all-kinds doctest
- Cover render/round-trip/registry/version-gate plus a live floating pane test asserting pane_floating_flag on tmux 3.7+
why: tmux 3.7 NULL-derefs the server on a nameless break-pane (fixed upstream after 3.7) and ignores -n when one is given. The experimental BreakPane op emitted no -n for nameless breaks, crashing the 3.7 server. Mirrors the fix already shipped in Pane.break_pane (#693).
what:
- Inject a placeholder -n on exactly tmux 3.7 when no name is requested
- Gate via _normalize_tmux_version exact match; other builds render bare
- Document the workaround and cover placeholder/bare/named render paths
note: the gate fires only when a tmux version reaches args(); the engine version resolution that activates it for live runs lands next.
why: Operations are version-aware, but execution defaulted to version=None, so version-gating (flag drops, whole-command gates, the break-pane 3.7 workaround) silently did nothing unless a caller threaded the version by hand. This is why test_break_and_swap_live still crashed even with the BreakPane workaround in place.
what:
- Add the optional SupportsTmuxVersion engine capability (base.py) and implement tmux_version() on the subprocess + asyncio engines (memoized `tmux -V`, None when unknown)
- Add resolve_engine_version() and use it in run()/arun() and at the LazyPlan execute()/aexecute() entry points so the live tmux version reaches rendering when the caller passes none
- Explicit version still wins; engines without the capability assume latest, so fakes and the in-memory engine are unaffected
- Cover resolution + gating activation for run/arun and a folded plan; this greens test_break_and_swap_live on tmux 3.7
why: The declarative workspace IR had no way to express tmux 3.7 floating panes; a user could not declare a floating overlay (e.g. a lazygit popup) in a spec at all.
what:
- Add a Float geometry value type (width/height -> -x/-y size, x/y -> -X/-Y position; cells or N%) and FloatingPane (a Pane + Float + attach_to)
- Add Window.floats: Sequence[FloatingPane] overlays, kept as a plain declarative data shape like panes (NOT a live QueryList -- QueryList is the live object-query layer, not the spec)
- Round-trip floats through analyze()/to_dict(); export Float + FloatingPane from the workspace package
- Cover to_dict, defaults, and round-trip
Inert data only; the compiler emit + events/confirm wiring lands next.
why: Declared floating panes (prior commit) were inert -- the compiler had no branch to lower them, so a float-bearing workspace ignored its overlays.
what:
- Factor per-pane command sending into _emit_pane_commands, shared by tiled panes and floats (uniform wait_pane / suppress_history / sleeps)
- Emit each Window.floats overlay as a NewPane after the tiled layout, targeting the window's first pane and kept out of the split chain and select-layout; send the float's own commands and honor its focus
- events: emit PaneCreated for new_pane; confirm: fold floats into the expected pane count (tiled + floats) so confirm() does not flag a spurious mismatch
- Reject cross-window attach_to for now (the symbol table lands next)
- Cover compile order, geometry/command emission, the attach_to guard, an offline in-memory build, and the new_pane event
why: A floating pane could only attach to its host window; the compiler rejected attach_to pointing at another window. Cross-window overlays (e.g. a status float over a different window) need name-based references resolved across the whole spec.
what:
- Add a Symbols registry (Django app-registry style): each declared window publishes its first-pane SlotRef by name, so a float's attach_to resolves to any window declared anywhere (forward or backward)
- Add _topo_order, a graphlib.TopologicalSorter primitive that orders the reference graph (floats after the windows they attach to) and rejects cycles -- the seam for future join-pane / cross-window ops
- Compile floats in a second wire phase after every window exists, so cross-window SlotRefs always resolve; lift the cross-window raise and instead raise only for an undeclared attach_to name
- Cover cross-window attach (forward ref), offline build, unknown attach_to, Symbols.resolve, and _topo_order ordering + cycle detection
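For readers unfamiliar with graphlib, the ordering primitive can be sketched with the standard-library TopologicalSorter. The function name topo_order, the dict-of-sets input shape, and the ValueError on cycles are illustrative choices here; the PR's _topo_order may differ in all three.

```python
from __future__ import annotations

from graphlib import CycleError, TopologicalSorter


def topo_order(depends_on: dict[str, set[str]]) -> list[str]:
    """Order nodes so every node comes after the nodes it depends on.

    ``depends_on`` maps a node name to the names it references, e.g. a
    float to the window it attaches to.
    """
    sorter = TopologicalSorter(depends_on)
    try:
        # static_order() is lazy; list() forces the cycle check here.
        return list(sorter.static_order())
    except CycleError as exc:
        # args[1] holds the detected cycle as a list of nodes.
        raise ValueError(f"reference cycle: {exc.args[1]}") from exc


# A float declared before the window it attaches to (a forward reference).
graph = {
    "status-float": {"logs"},
    "editor": set(),
    "logs": set(),
}
order = topo_order(graph)
```

Whatever order the spec declares things in, "logs" is emitted before "status-float", which is the property the second wire phase relies on.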
why: The spine could list panes, but there was no ergonomic, chainable way to filter/order/project live panes the way QueryList powers server.panes -- the read half of the chainable-prototype DX.
what:
- Add panes() -> PaneQuery: an immutable, chainable query (filter/order_by/limit/all/first/map) over live panes
- Resolve against a source that is either a TmuxEngine (a list-panes read) or a pure Sequence[PaneSnapshot]; filtering reuses QueryList so Django-style lookups (active=True, current_command="vim") work on snapshots
- map() returns a MappedPaneQuery for pure data projections
- Cover filter/order/limit/map/first/immutability, the empty-engine source, and a live engine-backed query scoped by window
This is the live-object query layer (distinct from the declarative workspace IR); the command-building half (PaneRef + commands) is next.
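The "immutable, chainable" property is easiest to see in a toy version. The sketch below is not the PR's PaneQuery: it only supports exact-match lookups over an in-memory sequence (the real one reuses QueryList and can read from an engine), and the PaneSnapshot fields shown are assumed for illustration.

```python
from dataclasses import dataclass, replace
from typing import Any, List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class PaneSnapshot:
    """Illustrative snapshot with a few assumed fields."""

    pane_id: str
    active: bool
    current_command: str


def _matches(pane: PaneSnapshot, lookups: Tuple[Tuple[str, Any], ...]) -> bool:
    return all(getattr(pane, key) == value for key, value in lookups)


@dataclass(frozen=True)
class PaneQuery:
    """Toy query: every chain step returns a new frozen instance."""

    source: Sequence[PaneSnapshot]
    lookups: Tuple[Tuple[str, Any], ...] = ()
    order_key: Optional[str] = None
    max_rows: Optional[int] = None

    def filter(self, **lookups: Any) -> "PaneQuery":
        return replace(self, lookups=self.lookups + tuple(lookups.items()))

    def order_by(self, field: str) -> "PaneQuery":
        return replace(self, order_key=field)

    def limit(self, n: int) -> "PaneQuery":
        return replace(self, max_rows=n)

    def all(self) -> List[PaneSnapshot]:
        rows = [p for p in self.source if _matches(p, self.lookups)]
        if self.order_key is not None:
            key = self.order_key
            rows.sort(key=lambda p: getattr(p, key))
        return rows[: self.max_rows]  # a None limit keeps every row

    def first(self) -> Optional[PaneSnapshot]:
        rows = self.all()
        return rows[0] if rows else None
```

Because each step copies rather than mutates, a base query can be shared and refined in several directions without one caller's filter leaking into another's.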
why: The query read live panes but could not act on them. The chainable prototype's headline DX is "do X to every pane matching Y in one tmux call" -- bulk commands over a filtered set, folded to a single dispatch.
what:
- Add PaneRef (a matched pane + a cmd namespace) and BoundPaneCommands (send_keys/resize/select/respawn/clear_history/kill), each recording a typed op into a shared plan
- Add PaneQuery.commands(mapper) -> CommandPlan; CommandPlan.to_plan builds the ops against a snapshot (pure/inspectable) and CommandPlan.run reads the engine, builds, and dispatches folded (FoldingPlanner) by default
- Layered entirely over LazyPlan/SlotRef/Planner -- no new execution path
- Cover op-per-pane building, each command kind, the empty-match no-op, and a live folded run
The bulk-command layer over the live query (G18); fluent split/forward handles remain a possible follow-up.
why: The typed pane facades exposed split() but not new_pane(), so floating panes were reachable from the ops/workspace tiers but not the eager/lazy/async handles that are the modern facade surface.
what:
- Add new_pane() to EagerPane (live handle), LazyPane (deferred handle over the plan), and AsyncPane (awaited live handle), each returning a handle to the created floating pane
- Share a _new_pane_op builder across the three facades so the floating geometry vocabulary (width/height/x/y/zoom/empty/styles/...) stays in one place
- Cover eager/lazy/async new_pane (live handle, recorded op + render, awaited handle)
why: NewPane auto-projects as op_new_pane, but that surface is hidden
behind the per-op tag; agents reach for the curated, always-visible
vocabulary. Floating panes had no curated tool, so they were effectively
undiscoverable.
what:
- Add anew_pane to the pane vocabulary (async-first) and new_pane =
synced(anew_pane); FastMCP derives the input schema from the signature
and the output schema from PaneResult
- Register ("new_pane", "mutating") in the adapter _TOOLS table; export
anew_pane/new_pane from the vocabulary and new_pane from the mcp facade
- The tool description notes the tmux 3.7+ requirement
- Cover the curated new_pane tool over the in-memory engine
Surfacing whole-op min_version into the auto-projected op_* schema
(G8) remains a small follow-up.
why: The descriptor projected per-flag version gates but not a whole operation's min_version, so the auto-projected op_new_pane advertised no tmux requirement -- an agent on an older tmux would hit a raw VersionUnsupported instead of a documented gate.
what:
- Add ToolDescriptor.min_version, populated from OpSpec.min_version
- Append "Requires tmux >= X.Y." to the projected tool description when a whole-command gate is set
- Cover op_new_pane surfacing min_version 3.7 (and an ungated op not)
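The description suffix is simple enough to show in a few lines. Only min_version and the "Requires tmux >= X.Y." wording come from the commit message; the other ToolDescriptor fields, the projected_description method, and the sample descriptions are invented for this sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ToolDescriptor:
    """Illustrative descriptor; only min_version is taken from the PR."""

    name: str
    description: str
    min_version: Optional[str] = None

    def projected_description(self) -> str:
        # Ungated ops keep their description unchanged.
        if self.min_version is None:
            return self.description
        return f"{self.description} Requires tmux >= {self.min_version}."


gated = ToolDescriptor("op_new_pane", "Create a floating pane.", min_version="3.7")
ungated = ToolDescriptor("op_kill_pane", "Kill a pane.")
```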
why: wait_for_output takes target=, not pane=; a recipe emitting pane= would fail FastMCP schema validation before dispatch.
what:
- Replace pane= with target= in run_and_wait, diagnose_failing_pane and interrupt_gracefully
- Add parametrized regression test asserting target= usage
why: A non-str/non-Mapping shell_command item (int, float, list) was silently dropped, hiding malformed config from the user.
what:
- Raise TypeError on unsupported shell_command items, matching the module's existing "unsupported pane config" error style
- Keep None tolerated (a blank mixed with commands, tmuxp parity)
- Add parametrized tests for rejected and normalized items
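A sketch of the accept/tolerate/reject split described above. The function name normalize_shell_commands, the {"cmd": ...} output shape, and the exact error message are assumptions for illustration; only the rule itself (str and Mapping accepted, None skipped, anything else a TypeError) is taken from the commit message.

```python
from __future__ import annotations

from collections.abc import Mapping
from typing import Any


def normalize_shell_commands(items: list[Any]) -> list[dict[str, Any]]:
    normalized: list[dict[str, Any]] = []
    for item in items:
        if item is None:
            continue  # a blank entry mixed with commands is tolerated
        if isinstance(item, str):
            normalized.append({"cmd": item})
        elif isinstance(item, Mapping):
            normalized.append(dict(item))
        else:
            # Fail loudly instead of silently dropping malformed config.
            raise TypeError(
                f"unsupported shell_command item: {item!r} "
                f"({type(item).__name__})"
            )
    return normalized
```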
why: A split pane with its own environment dropped the window environment entirely, contradicting the documented "inherited by its panes" contract and the first pane's merged creator env.
what:
- Merge window + pane environment for split-window -e (pane wins)
- Correct the creator-env test to assert the merged split env
- Add parametrized tests for window/pane env precedence
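The precedence rule (window environment inherited, pane value wins on conflict) reduces to an ordered dict merge. The helper name split_env_flags is hypothetical; the -e VARIABLE=value form is the one split-window accepts.

```python
from __future__ import annotations

from collections.abc import Mapping


def split_env_flags(
    window_env: Mapping[str, str], pane_env: Mapping[str, str]
) -> list[str]:
    # Later mappings override earlier ones, so the pane wins on conflicts.
    merged = {**window_env, **pane_env}
    flags: list[str] = []
    for key, value in merged.items():
        flags += ["-e", f"{key}={value}"]
    return flags


flags = split_env_flags(
    {"EDITOR": "vim", "TERM": "xterm-256color"},
    {"EDITOR": "nano"},
)
```

Here flags carries EDITOR=nano (the pane's value) alongside the inherited TERM, rather than dropping TERM as the pre-fix behavior did.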
Summary

Implements the typed operations + engines architecture under libtmux.experimental.{ops,engines,models,facade}: an inert, statically-typed operation spine; a family of interchangeable engines (subprocess, concrete, control-mode, async-subprocess, async-control, and the native imsg easter-egg); lazy/async-lazy plans with ;-folding chainability; pure object-graph snapshots; a typed read surface; engine-typed facades; and a docs catalog generated from the registry.

Operationalizes #688 (architecture) per the plan in #689. Touches no existing public API: everything is additive under libtmux.experimental (explicitly outside the versioning policy). Nothing is generated at runtime; everything is statically typed and mypy-strict clean.

What's delivered

The spine: libtmux.experimental.ops (pure, no tmux):
- Operation[ResultT]: frozen, keyword-only, class-vars as the single source of truth (kind/command/scope/result_cls/effects/safety/chainable/version gates). Pure render() with declarative version gating; build_result() adapts raw output to a typed result (version-threaded so read parsing matches the gated render).
- Result hierarchy with opt-in raise_for_status(): AckResult (no-output commands; success/failure only), SplitWindowResult/CreateResult (captured ids), CapturePaneResult (lines), ListPanes/Windows/SessionsResult (snapshot-deriving rows).
- Target sum, fail-closed OperationRegistry, stdlib serialization, and catalog() (registry-derived docs data).
- LazyPlan (record → resolve SlotRef forward refs → execute) with chainability: >>/OpChain composition and execute(fold=True) folding chainable runs into one tmux a ; b dispatch, attributing per-op status (success → all complete; failure → first failed, rest skipped, matching tmux's cmdq_remove_group).
- ListPanes/ListWindows/ListSessions ops render the same -F template neo uses (imported, not copied) and parse into models snapshots: a typed read surface parallel to neo, leaving the ORM untouched.

Engines: libtmux.experimental.engines (all behind TmuxEngine/AsyncTmuxEngine, all returning the same CommandResult):
- SubprocessEngine / AsyncSubprocessEngine
- ConcreteEngine / AsyncConcreteEngine
- ControlModeEngine / AsyncControlModeEngine (event stream via subscribe()), over tmux -C
- ImsgEngine (opt-in easter egg)

Control engines use an I/O-free bytes ControlModeParser with FIFO/skip correlation (startup-ACK consumed up front; unsolicited hook blocks skipped). The imsg engine speaks tmux's binary peer protocol directly (AF_UNIX + SCM_RIGHTS, PROTOCOL_VERSION 8) and has a live parity test against the subprocess engine, which the prototype never had.

Models: libtmux.experimental.models: frozen Pane/Window/Session/ServerSnapshot (typed core + raw field tail); from_pane_rows() builds the whole tree from one list-panes -a query and round-trips to plain dicts. Neo-like but decoupled and serializable.

Facades: libtmux.experimental.facade ("mode lives in the type"): eager Server→Session→Window→Pane navigation, LazyWindow/LazyPane, AsyncWindow/AsyncPane, all over the same ops; control mode is just an engine choice.

Docs: an in-repo tmuxop-catalog Sphinx directive renders catalog() into the operation reference (exercised by the docs gate), so the reference can't drift from the code.

Testing

ruff, ruff format, mypy --strict, pytest (1501 passed, 2 skipped), build-docs. (The occasional test_retry.py timing flake is pre-existing and unrelated; it passes in isolation.)

Design notes
- raise_for_status(). Same result shape across engines.
- attach (which falls back to a local spawn).

Refs #688, #689.
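To make the fold behavior in the summary concrete, here is a rough sketch of joining chainable commands into one argv and attributing per-op status from a single exit code. The function names and status strings are hypothetical, and the failure rule is one reading of "first failed, rest skipped" (the first op in the folded run is marked failed); the PR's actual attribution may be finer-grained.

```python
from __future__ import annotations


def fold(commands: list[list[str]]) -> list[str]:
    """Join several tmux commands into one argv, separated by ';'.

    Passed as its own argv element (no shell involved), ';' is tmux's
    command separator.
    """
    argv = ["tmux"]
    for index, command in enumerate(commands):
        if index:
            argv.append(";")
        argv += command
    return argv


def attribute(op_count: int, returncode: int) -> list[str]:
    """Map one process exit code onto per-op statuses (assumed rule)."""
    if op_count == 0:
        return []
    if returncode == 0:
        return ["complete"] * op_count
    # tmux drops the rest of the command group after a failure.
    return ["failed"] + ["skipped"] * (op_count - 1)


argv = fold([["split-window", "-h"], ["select-layout", "tiled"]])
```

The resulting argv is suitable for a single subprocess call, so two ops cost one tmux round trip.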