perf(sidecar): cut guest fs RPC latency (fs-heavy workloads 5.7–41× faster) #77

Merged
NathanFlurry merged 3 commits into main from perf-sidecar-guest-rpc
Jun 19, 2026

@NathanFlurry NathanFlurry commented Jun 19, 2026

Summary

Three optimizations to the Rust sidecar runtime that speed up guest workloads which
read a mounted host node_modules (and any fs-heavy guest). Each is its own commit.
Benchmarked with an out-of-tree harness (mounted off-the-shelf node_modules;
recursive walk + 1500× stat + 1500× read + module imports), baseline vs. fixed stack
measured back-to-back on an idle host.

Headline: host-filesystem operations 5.7–41× faster (0.3 cold)

| operation | baseline | fixed | speedup |
|---|---|---|---|
| recursive walk (909 readdir) | 32396 ms | 785 ms | 41× |
| stat ×1500 | 7564 ms | 1338 ms | 5.7× |
| read ×1500 small files | 7610 ms | 1221 ms | 6.2× |

Module-import time is compile/eval-bound and was flat within noise — this work
targets cross-thread RPC latency, which dominates fs-heavy paths, not compile.

Commits

  1. stable compile-cache root: default_compile_cache_root was keyed by PID, so
    every fresh sidecar started with an empty V8 compile cache. Use a stable path
    (entries stay namespaced + V8-validated). Neutral on these micro-benchmarks but
    conceptually correct for bootstrap/repeated starts; 1 line, harmless.
  2. typed readdir: fs.readdirSync({withFileTypes:true}) returned names only, so
    the guest issued one cross-thread stat RPC per entry to build each Dirent,
    making a directory walk O(total-entries) RPCs. The handler already openat2s each
    child to validate it stays beneath the mount; we now fstat that fd and return
    {name,isDirectory} (the guest already consumes typed entries). metadata()
    follows symlinks, matching prior statSync semantics (file count unchanged).
    Walk 32.4s → 4.6s on its own.
  3. 250 µs event-pump interval: guest sync fs/module RPCs are serviced by
    pump_process_events, which the stdio select loop only ran on the 5 ms
    EVENT_PUMP_INTERVAL timer, so each blocked guest call waited ~5 ms before the
    host dequeued it. Lowering the interval to 250 µs removes most of that wait.
    The sub-ms tokio timer is honored and idle pumps are cheap no-ops, so the
    higher cadence costs negligible CPU. stat 7.5s→1.3s, read 7.6s→1.2s; combined
    with commit 2 (typed readdir), walk 32.4s→0.79s.
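
To illustrate the typed-readdir idea from commit 2, here is a minimal sketch, not the sidecar's actual handler. It assumes a hypothetical `TypedEntry` struct and `read_dir_typed` function, and it uses path-based `std::fs::metadata` rather than the `openat2` + `fstat` path described above, so it does not perform the beneath-the-mount validation. The point is only that the host resolves each entry's type while listing, so the caller needs one request per directory instead of one extra stat per entry.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical typed entry, mirroring the {name,isDirectory} shape in the PR.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct TypedEntry {
    pub name: String,
    pub is_directory: bool,
}

/// List `dir` and resolve each entry's type on the host side.
pub fn read_dir_typed(dir: &Path) -> io::Result<Vec<TypedEntry>> {
    let mut entries = Vec::new();
    for entry in fs::read_dir(dir)? {
        let entry = entry?;
        // fs::metadata follows symlinks (DirEntry::file_type does not),
        // which lines up with the statSync semantics the PR preserves.
        // Assumption: an entry that cannot be stat'ed (for example a
        // dangling symlink) is reported as a non-directory.
        let is_directory = fs::metadata(entry.path())
            .map(|m| m.is_dir())
            .unwrap_or(false);
        entries.push(TypedEntry {
            name: entry.file_name().to_string_lossy().into_owned(),
            is_directory,
        });
    }
    // Sort by name so the result is deterministic across platforms.
    entries.sort_by(|a, b| a.name.cmp(&b.name));
    Ok(entries)
}
```

A recursive walk built on this would issue one call per directory (909 in the benchmark above) rather than one per entry.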

Root cause

The guest V8 isolate runs inside the sidecar process; a guest fs.statSync is a
cross-thread synchronous RPC to the execution loop (not a socket, not cross-process).
Two things dominated: readdir not returning entry types (→ per-entry stat RPCs), and
the 5 ms pump timer gating when those RPCs were serviced.
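
The pump-timer effect can be sanity-checked with simple arithmetic. The sketch below uses a hypothetical helper, `timer_wait`, and assumes each blocking call waits roughly one full pump interval before being dequeued, which is the behavior described above; it is a back-of-the-envelope model, not a measurement.

```rust
use std::time::Duration;

/// Approximate total time spent waiting on the pump timer when each of
/// `calls` blocking RPCs waits about one full `pump_interval`.
pub fn timer_wait(calls: u32, pump_interval: Duration) -> Duration {
    pump_interval * calls
}
```

Under that assumption, 1500 calls at 5 ms come to 7.5 s, close to the 7564 ms stat baseline, while 1500 calls at 250 µs come to 375 ms. The measured 1338 ms after the fix suggests the remaining time is spent outside the timer wait.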

Tried and reverted / deferred

  • Combined resolveAndLoad (format+source in one RPC) regressed module import
    ~4× — module RPC responses are delivered as raw strings, so the {format,source}
    object wasn't consumed and every module hit a slower readFileSync fallback.
  • Adaptive (fine/coarse) and spin-wait pump variants were unstable (livelock /
    oscillation) under load; the flat 250 µs interval ships instead.
  • Binary readFileSync (skip base64) — fully traced: _fsReadFileBinary → fs.readFileSync(binary), and binary is {__agentOsType:"bytes", base64} because
    the sync-RPC return type is a JSON Value. True raw transfer needs switching
    readFileSync to the status=2 raw-binary bridge-response path — a cross-cutting
    protocol change risking every binary read for a now-modest gain (read already 6.2×
    faster). Deferred as a follow-up rather than rushed.
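
For a rough sense of what the base64 envelope costs in payload size, standard padded base64 emits 4 characters for every 3 input bytes. The helper below (`base64_len`, a name chosen here for illustration) computes that length; it says nothing about encode/decode CPU time or the JSON wrapping, which are separate costs.

```rust
/// Length of standard padded base64 output for `n` raw bytes:
/// every group of up to 3 bytes becomes 4 characters.
pub fn base64_len(n: usize) -> usize {
    (n + 2) / 3 * 4
}
```

So a 3000-byte file travels as 4000 base64 characters, about a third larger than the raw bytes.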

Follow-ups

True event-driven pump (notify channel execution→stdio loop, removes the residual
timer wait), cache the host_dir mount-root fd, and the binary readFileSync path.
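
As a sketch of what an event-driven pump could look like, the example below wakes the service loop when a request arrives instead of on a fixed interval. It is an assumption-laden illustration: it uses `std::sync::mpsc` and threads rather than the sidecar's tokio select loop, and `Request`, `serve`, and `run_round_trips` are hypothetical names, not identifiers from the codebase.

```rust
use std::sync::mpsc::{self, RecvTimeoutError};
use std::thread;
use std::time::Duration;

/// Hypothetical stand-in for a guest sync RPC with a reply channel.
pub struct Request {
    pub id: u32,
    pub reply: mpsc::Sender<u32>,
}

/// Service loop: wakes as soon as a request is queued. The timeout only
/// bounds how long the loop sleeps between idle housekeeping passes.
pub fn serve(rx: mpsc::Receiver<Request>) -> u32 {
    let mut served = 0;
    loop {
        match rx.recv_timeout(Duration::from_millis(50)) {
            Ok(req) => {
                // Placeholder work: echo back twice the request id.
                let _ = req.reply.send(req.id * 2);
                served += 1;
            }
            Err(RecvTimeoutError::Timeout) => continue,
            // All senders dropped: no more requests can arrive.
            Err(RecvTimeoutError::Disconnected) => return served,
        }
    }
}

/// Issue `n` blocking round trips from a "guest" thread and collect replies.
pub fn run_round_trips(n: u32) -> Vec<u32> {
    let (tx, rx) = mpsc::channel::<Request>();
    let server = thread::spawn(move || serve(rx));
    let mut replies = Vec::new();
    for id in 0..n {
        let (reply_tx, reply_rx) = mpsc::channel();
        tx.send(Request { id, reply: reply_tx }).unwrap();
        // Block until the service loop answers, like a sync guest call.
        replies.push(reply_rx.recv().unwrap());
    }
    drop(tx); // disconnect so the service loop exits
    let served = server.join().unwrap();
    assert_eq!(served, n);
    replies
}
```

In this shape the per-call latency is set by the channel wake-up rather than by a timer period, which is the residual wait the follow-up aims to remove.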

NathanFlurry and others added 2 commits June 18, 2026 23:23
…code

default_compile_cache_root was keyed by process id, so every fresh sidecar got
an empty cache and cold module imports never reused compiled bytecode. Use a
stable temp path; entries remain namespaced + V8-validated, so sharing is safe.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
fs.readdirSync({withFileTypes:true}) previously returned names only, so the guest
issued one cross-thread stat RPC per entry to build each Dirent. The readdir
handler already openat2's every child to validate it stays beneath the mount, so
we now fstat that fd in-process and return {name,isDirectory}. The guest's
normalizeReaddirEntries already consumes typed entries. metadata() follows
symlinks, matching prior statSync semantics (file count unchanged). Recursive
walk of node_modules: ~32.4s -> ~4.6s.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@railway-app

railway-app Bot commented Jun 19, 2026

🚅 Environment secure-exec-pr-77 in rivet-frontend has no services deployed.

Guest sync fs/module RPCs are serviced by pump_process_events, which the stdio
select loop only runs on EVENT_PUMP_INTERVAL. At 5ms a blocked guest call waited
~5ms before the host dequeued it (~5ms/stat). 250us (the sub-ms tokio timer is
honored) cuts it dramatically: over the fs benchmark walk 32.4s->0.79s,
stat 7.5s->1.3s, read 7.6s->1.2s. Idle pumps are cheap no-ops so the higher
cadence costs negligible CPU. (A true event-driven wake would remove the residual
timer wait but needs a notify channel from the execution layer; an adaptive
interval was tried but proved unstable.)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@NathanFlurry NathanFlurry force-pushed the perf-sidecar-guest-rpc branch from 397377a to d71dc89 Compare June 19, 2026 08:10
@NathanFlurry NathanFlurry changed the title perf(sidecar): cut guest fs RPC latency (fs-heavy workloads 3.5–24× faster) Jun 19, 2026
@NathanFlurry

Follow-up work split out into issues:

@NathanFlurry NathanFlurry merged commit a621306 into main Jun 19, 2026
1 of 2 checks passed