perf: avoid throw-as-control-flow in SessionManager hot path (#7756) by JohnMcLear · Pull Request #7775 · ether/etherpad

JohnMcLear · 2026-05-16T07:48:57Z

Summary

CPU profile of the SUT at the 100→400 author dive sweep (load-test workflow run 25956384097) attributed ~6% of total process CPU to the throw + catch around getSessionInfo:

~1.82% to new CustomError('sessionID does not exist', 'apierror') (stack-trace capture)
~4.12% downstream, via the catch block's console.debug(...) routed through log4js → sendToListeners → sendLogEventToAppender

Both internal callers (findAuthorID on every CLIENT_READY, listSessionsWithDBKey on session listing) caught apierror and discarded it. The public exports.getSessionInfo still has to throw for the HTTP API (returns code: 1 for missing sessionID), so this PR introduces a private getSessionInfoOrNull helper that returns null and switches the two hot-path callers to it. exports.getSessionInfo becomes a thin wrapper that preserves the throw semantics.
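
The split described above can be sketched roughly as follows. This is a minimal illustration, not the PR's diff: the in-memory `sessions` map stands in for the real database, the `CustomError` class and `SessionInfo` shape are simplified assumptions, and `findAuthorID` is reduced to the single lookup that matters here (the real function takes more inputs and does more validation).

```typescript
// Simplified stand-ins (assumptions, not Etherpad's actual definitions).
class CustomError extends Error {
  constructor(message: string, name: string) {
    super(message);
    this.name = name;
  }
}

interface SessionInfo {
  groupID: string;
  authorID: string;
  validUntil: number;
}

// Hypothetical in-memory store standing in for the database.
const sessions = new Map<string, SessionInfo>([
  ['s.valid', {groupID: 'g.1', authorID: 'a.1', validUntil: 4102444800}],
]);

// Private helper: a missing session is an ordinary outcome, so return null
// instead of constructing an error object (and capturing a stack trace).
const getSessionInfoOrNull = async (sessionID: string): Promise<SessionInfo | null> =>
  sessions.get(sessionID) ?? null;

// Public wrapper: keeps the throwing contract that the HTTP API relies on.
const getSessionInfo = async (sessionID: string): Promise<SessionInfo> => {
  const session = await getSessionInfoOrNull(sessionID);
  if (session == null) throw new CustomError('sessionID does not exist', 'apierror');
  return session;
};

// Hot-path caller (reduced): an unknown session yields undefined, with no
// throw, no catch, and no debug log on the miss path.
const findAuthorID = async (sessionID: string): Promise<string | undefined> => {
  const session = await getSessionInfoOrNull(sessionID);
  return session?.authorID;
};
```

The point of the design is that only callers who actually need an error pay for one; the hot path treats "not found" as a plain value.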

Profile evidence

Inverted callers of CustomError constructor in the profile:

 1.82% exports.getSessionInfo [src/node/db/SessionManager.ts]

Inverted (non-log4js) callers of log4js Logger.<computed>:

 4.12% exports.checkAccess [src/node/db/SecurityManager.ts]

(checkAccess is the entry point that drives findAuthorID → getSessionInfo → console.debug.)

Behavior

No behavior change for the HTTP API. getSessionInfo still throws `apierror` when the session doesn't exist; RestAPI.ts/APIHandler.ts translate that to code: 1.
The two internal callers preserve their existing semantics: findAuthorID returns undefined for unknown sessions, listSessionsWithDBKey continues to log a warning and set sessions[sessionID] = null.
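
To make the preserved semantics concrete, here is a hedged sketch of the two observable behaviours: an API-style handler that maps a thrown `apierror` to `code: 1`, and a listing loop that records `null` for a missing session. `callApi`, `listSessions`, the response shape, the warning text and the in-memory store are illustrative assumptions; the real logic lives in RestAPI.ts/APIHandler.ts and SessionManager.ts and differs in detail.

```typescript
type SessionInfo = {groupID: string; authorID: string; validUntil: number};
type ApiResponse = {code: number; message: string; data: unknown};

// Hypothetical in-memory store standing in for the database.
const store = new Map<string, SessionInfo>([
  ['s.1', {groupID: 'g.1', authorID: 'a.1', validUntil: 4102444800}],
]);

const getSessionInfoOrNull = async (id: string): Promise<SessionInfo | null> =>
  store.get(id) ?? null;

// Throwing wrapper, as used by the HTTP API.
const getSessionInfo = async (id: string): Promise<SessionInfo> => {
  const info = await getSessionInfoOrNull(id);
  if (info == null) {
    const err = new Error('sessionID does not exist');
    err.name = 'apierror';
    throw err;
  }
  return info;
};

// Hypothetical API layer: an 'apierror' becomes code 1, success is code 0,
// anything else is rethrown for a generic error handler.
const callApi = async (fn: () => Promise<unknown>): Promise<ApiResponse> => {
  try {
    return {code: 0, message: 'ok', data: await fn()};
  } catch (err) {
    if (err instanceof Error && err.name === 'apierror') {
      return {code: 1, message: err.message, data: null};
    }
    throw err;
  }
};

// Listing loop (reduced): warn about a missing session and record null for it.
const listSessions = async (ids: string[]): Promise<Record<string, SessionInfo | null>> => {
  const result: Record<string, SessionInfo | null> = {};
  for (const id of ids) {
    const info = await getSessionInfoOrNull(id);
    if (info == null) console.warn(`missing session ${id}`); // placeholder message
    result[id] = info;
  }
  return result;
};
```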

Test plan

`pnpm exec mocha tests/backend/specs/api/sessionsAndGroups.ts` — all 32 cases pass, including "getSessionInfo of deleted session" (still expects code:1).
Re-run the dive sweep with this branch as core_ref and confirm steady-state CPU% drops at the cliff (300-400 authors).

Part of #7756. Profile capture pipeline is in etherpad-load-test#109/110/111.

CPU profile of the SUT at the 100-400 author dive sweep (load-test workflow run 25956384097) attributed about 6% of total process CPU to the throw + catch around getSessionInfo:

- ~1.82% to `new CustomError('sessionID does not exist', 'apierror')` construction (stack trace capture)
- ~4.12% downstream, via the catch block's `console.debug(...)` routed through log4js -> sendToListeners -> sendLogEventToAppender

Both call sites (`findAuthorID` on every CLIENT_READY, and `listSessionsWithDBKey` on session listing) immediately caught `apierror` and discarded it. The public `exports.getSessionInfo` contract still has to throw for the HTTP API (returning code:1 for missing sessionID), so introduce a private `getSessionInfoOrNull` helper that returns null and have the hot-path callers use it directly. `exports.getSessionInfo` is kept as a thin wrapper that preserves the existing throw semantics.

No behaviour change for the HTTP API: the sessionsAndGroups.ts test file (32 cases, including "getSessionInfo of deleted session" expecting code:1) passes unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

qodo-code-review · 2026-05-16T07:49:01Z

Qodo reviews are paused for this user.


qodo-free-for-open-source-projects · 2026-05-16T07:49:10Z

Review Summary by Qodo

Avoid throw-as-control-flow in SessionManager hot path

✨ Enhancement

Walkthroughs

Description

• Eliminate throw-as-control-flow in hot-path callers of getSessionInfo
• Introduce private getSessionInfoOrNull helper returning null
• Refactor findAuthorID and listSessionsWithDBKey to use null checks
• Reduce CPU overhead by ~6% (1.8% CustomError construction + 4% logging)

Diagram

flowchart LR
  A["getSessionInfoOrNull<br/>returns null"] --> B["findAuthorID<br/>null check"]
  A --> C["listSessionsWithDBKey<br/>null check"]
  D["exports.getSessionInfo<br/>wrapper"] --> E["throws apierror<br/>for HTTP API"]
  A --> D

File Changes

1. src/node/db/SessionManager.ts Performance optimization +20/-22

Replace throw-catch with null-check pattern

• Introduce private getSessionInfoOrNull helper that returns null instead of throwing
• Refactor findAuthorID to use null-returning helper with simple || check
• Refactor listSessionsWithDBKey to use null-returning helper with explicit null check
• Keep exports.getSessionInfo as thin wrapper preserving throw semantics for HTTP API


qodo-free-for-open-source-projects · 2026-05-16T07:49:11Z

Code Review by Qodo

🐞 Bugs (0) 📘 Rule violations (0) 📎 Requirement gaps (0)

Great, no issues found!

Qodo reviewed your code and found no material issues that require review

CPU profile of develop at the 100-400 author dive sweep (load-test run 25956384097) identified a ~6% process-CPU win in SessionManager: throw-as-control-flow on every CLIENT_READY session lookup. Add a lever 9 section with the profile evidence, link the open PR (#7775), and add an "Other CPU hotspots surfaced" subsection documenting findings not yet acted on (Changeset internals, appendRevision, ueberdb/dirty backing as test-harness artifact, esbuild __name overhead). Update Recommendation to include #7775 as the highest-priority merge.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Replace the "score pending" placeholder under lever 9 with the actual numbers from runs 25957107195/25957108328/25957109418 (perf branch) vs 25954537767/25954538807/25954540108 (develop), both at authors=100..500:step=50:dwell=8s:warmup=2s.

Result: consistent -1.4% to -5.3% CPU reduction across all 9 steps, matching profile direction at 2-5% (vs 6% profile-attributed upper bound). Latency delta sits inside the noise envelope.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

JohnMcLear · 2026-05-16T08:34:47Z

N=3 measured impact

Three perf-branch runs (25957107195, 25957108328, 25957109418) compared against three develop baselines (25954537767, 25954538807, 25954540108) — same authors=100..500:step=50:dwell=8s:warmup=2s sweep on each.

Medians:

step	dev CPU%	perf CPU%	ΔCPU%
100	4.76	4.67	-1.7%
150	9.09	8.95	-1.5%
200	15.21	14.60	-4.0%
250	21.51	21.32	-0.9%
300	30.46	29.68	-2.6%
350	41.58	39.36	-5.3%
400	56.26	54.23	-3.6%
450	72.33	70.49	-2.5%
500	88.38	87.14	-1.4%

ΔCPU% is negative at every one of the 9 steps (-0.9% to -5.3%). The realised magnitude is below the profile-attributed 6% upper bound because some of the log4js cost the profile attributed to the throw path was unrelated info logging, but the direction is exactly what the profile predicted.
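
As a worked example of the arithmetic, the sketch below assumes ΔCPU% is the relative change of the perf median against the develop median; the helper name `relativeDeltaPct` is illustrative and not part of the load-test tooling. The inputs are the 350-author medians from the table above.

```typescript
// Relative change of `candidate` against `baseline`, in percent.
const relativeDeltaPct = (baseline: number, candidate: number): number =>
  ((candidate - baseline) / baseline) * 100;

// Median CPU% at 350 authors: develop 41.58, perf branch 39.36.
const delta350 = relativeDeltaPct(41.58, 39.36);
console.log(`${delta350.toFixed(1)}%`); // prints "-5.3%"
```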

Latency delta sits inside the noise envelope across all steps (raw p95 triples overlap heavily run-to-run).

Three combined-branch runs (perf/dive-combined = #7776 cherry-picked onto #7775 base; runs 25960003164/25960004223/25960005248) vs the same three develop baselines: -12% to -20% CPU% across all 9 sweep steps, with the p95 cliff effectively moving from ~400 to ~500 authors (at step 400, two of three combined runs land below the cliff at 45ms and 112ms p95 vs develop [1758, 2275, 2463]).

Adds:
- Lever 10 section for #7776 with its own N=3 numbers (-3.6 to -8.9% alone).
- "Stacking" section showing super-additive interaction.
- Local vCPU experiment showing the cliff is single-event-loop-bound, not total-CPU-bound: 4-core and 8-core pinned SUTs hit the same cliff at the same step.
- Updated TL;DR, Recommendation order (merge both #7775+#7776 first), and "Where to take this next" with worker-thread offload as the smallest next architectural step.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

JohnMcLear · 2026-05-16T11:04:39Z

#7775 + #7776 stacked — N=3 combined measurement

I cherry-picked #7775 onto #7776 (branch perf/dive-combined) and ran 3 cliff sweeps (25960003164, 25960004223, 25960005248) against the same 3 develop baselines used for the individual PR scoring. The two fixes stack super-additively:

step	dev CPU%	#7775 alone	#7776 alone	both	Δboth
100	4.76	4.67	4.51	3.99	-16.1%
200	15.21	14.60	14.33	12.48	-17.9%
300	30.46	29.68	28.50	24.39	-19.9%
350	41.58	39.36	37.87	33.04	-20.5%
400	56.26	54.23	53.67	44.78	-20.4%
450	72.33	70.49	68.80	61.18	-15.4%
500	88.38	87.14	85.17	77.70	-12.1%

The stacked impact (-12% to -20% CPU%) is well above the sum of the individual gains. Both fixes remove call sites feeding the same log4js cluster-mode dispatch chain (sendToListeners → sendLogEventToAppender); halving the LogEvent allocation rate appears to relieve queue/GC pressure beyond what either fix accounts for in isolation.
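
The super-additive claim can be checked directly from one row of the table. The sketch below uses the step-400 medians and compares the sum of the two individual relative gains with the combined gain; the variable names are illustrative.

```typescript
// Relative change of `candidate` against `baseline`, in percent.
const pct = (baseline: number, candidate: number): number =>
  ((candidate - baseline) / baseline) * 100;

// Step 400 medians from the table: develop, #7775 alone, #7776 alone, both.
const dev = 56.26;
const only7775 = pct(dev, 54.23); // about -3.6
const only7776 = pct(dev, 53.67); // about -4.6
const both = pct(dev, 44.78);     // about -20.4

// If the fixes were merely additive, the combined gain would be near this sum.
const sumOfParts = only7775 + only7776; // about -8.2
const superAdditive = both < sumOfParts;

console.log({sumOfParts: sumOfParts.toFixed(1), both: both.toFixed(1), superAdditive});
```

At this step the combined reduction is roughly two and a half times the sum of the individual reductions, which is consistent with the interaction described above rather than with independent savings.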

Latency: cliff has effectively moved past step 400. Raw p95 triples:

step	develop p95 [3 runs]	combined p95 [3 runs]
400	[1758, 2275, 2463]	[45, 112, 634]
450	[5415, 6167, 6611]	[3297, 3719, 3897] (-40%)
500	[10655, 11759, 12183]	[8091, 8711, 9127] (-26%)

At step 400, two of three combined runs land below the cliff entirely. Recommend landing these two together to capture the full effect.

Post-#7775/#7776 profile shows applyToAText splits cleanly:
- applyToText (Changeset.ts:404) is pure (cs, text) -> text; trivially offloadable to a worker via worker_threads structured-clone postMessage.
- applyToAttribution (Changeset.ts:684) mutates AttributePool; not trivially offloadable.

Document the obvious first-pass design (run them in parallel via Promise.all inside applyToAText) and the realistic estimate (~6-8% CPU moved off the main event loop). putAttrib is only 0.26% in the post-fix profile, confirming the bulk of applyToAText's cost is in the string-manipulation half.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Add a "Roadmap for future effort" section ahead of Reproducing, ranking the next concrete options by impact-per-time-spent.

Tier 1 (mechanical / <1 day each):
- merge ready perf PRs (#7775+#7776+#7774)
- implement #7780 room-broadcast fan-out
- additional post-fix profile pass

Tier 2 (medium, real cliff moves):
- selective fan-out / viewport-based broadcast (~2 weeks; cliff ~500 → 1000-1500)
- per-pad worker isolation PoC (~1-2 weeks PoC, 1-2 months prod)

Tier 3 (large bets):
- sticky-session cluster mode (~2-4 weeks PoC)
- CRDT migration (months; anti-recommended)

Tier 4 (operational):
- production telemetry hookup (~3-5 days)
- nightly dive in CI (~1 day)

Records the recommended sequence (Tier 1.2 → Tier 2.4) so the next person picking this up doesn't need to re-derive it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

JohnMcLear mentioned this pull request May 16, 2026

perf: don't log settings.loadTest warning per-message (#7756) #7776

Draft

3 tasks

JohnMcLear mentioned this pull request May 16, 2026

Performance: [#7756 follow-up] Room-broadcast NEW_CHANGES fan-out to drop ~5-7% per-recipient packet construction #7780

Open

JohnMcLear marked this pull request as draft May 18, 2026 12:31

JohnMcLear wants to merge 1 commit into develop from perf/session-getinfo-no-throw
