Summary
The Integration Tests CI job (vitest run --config vitest.integration.config.ts) fails intermittently with:
Error: Server at http://localhost:4397/_emdash/api/setup/dev-bypass did not start within 90000ms
Server output (last 5000 chars):
Another astro dev server is already running.
URL: http://localhost:4398
PID: 3512
Run `astro dev stop` to stop it, or use `astro dev --force` to replace it.
This is an infra/harness flake, not a product or test-logic bug. In the failing run, 61 tests passed and only 2 suites (field-widgets.test.ts, oauth-discovery.test.ts) failed — both for the identical "already running" reason.
Seen on PR #1535 — job log.
Root cause
Each integration suite boots a real Astro dev server via createTestServer() in packages/core/tests/integration/server.ts. Two facts combine to cause the race:
- All suites share one fixture directory. The server runs the fixture in place: const workDir = FIXTURE_DIR; ... spawn(astroBin, ["dev", ...], { cwd: workDir }) (server.ts:160, 181). The comment there explains that running in place avoids Astro virtual-module resolution issues with symlinked temp dirs. Only the DB and uploads live in a temp dir.
- Vitest runs test files in parallel by default. vitest.integration.config.ts sets no pool/fileParallelism options, so each *.test.ts runs in its own worker concurrently.
Astro's "another dev server is already running" guard is scoped to the project directory (a lockfile under the shared node_modules/.astro), not to the port. So even though each suite picks a distinct hardcoded port:
| Suite | Port |
| --- | --- |
| comments.test.ts | 4396 |
| field-widgets.test.ts | 4397 |
| cli.test.ts | 4398 |
| client.test.ts | 4399 |
| oauth-discovery.test.ts | 4401 |
…the second server to start from the same fixture dir aborts immediately because another one already holds the lock. The tell: the suite requested port 4397, but the error reports a conflicting server at 4398, which is the port of a different suite (cli.test.ts). The aborted process never listens on 4397, so waitForServer spins for the full 90s and the suite fails.
It's timing-dependent (whoever wins the lock first survives), which is why it only flakes sometimes.
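The mismatch between the requested port and the reported port can be reproduced with a toy model. This is only an illustration of a lock keyed by project directory, not Astro's implementation: startDevServer, the locks map, the placeholder path, and the second PID are all hypothetical, and the registry stands in for the lockfile described above.

```typescript
// Toy model only: NOT Astro's implementation.
type RunningServer = { port: number; pid: number };

// Hypothetical registry keyed by project directory, standing in for the lockfile.
const locks = new Map<string, RunningServer>();

function startDevServer(projectDir: string, port: number, pid: number): RunningServer {
  const holder = locks.get(projectDir);
  if (holder) {
    // The guard never looks at the requested port, only at the directory.
    throw new Error(
      `Another dev server is already running. URL: http://localhost:${holder.port} PID: ${holder.pid}`,
    );
  }
  const server: RunningServer = { port, pid };
  locks.set(projectDir, server);
  return server;
}

const FIXTURE_DIR = "/repo/fixture"; // placeholder path, not the real fixture location

// cli.test.ts happens to win the race and takes the lock.
startDevServer(FIXTURE_DIR, 4398, 3512);

// field-widgets.test.ts asks for a different port but the same directory.
let message = "";
try {
  startDevServer(FIXTURE_DIR, 4397, 3600); // 3600 is a made-up PID
} catch (err) {
  message = (err as Error).message;
}
console.log(message);
```

The loser's error names the winner's port, which matches the shape of the CI log: a suite waiting on 4397 reports a server at 4398.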
Suggested fixes (pick one)
- Run integration suites serially: set fileParallelism: false in vitest.integration.config.ts (or, depending on the Vitest version, a single-worker pool such as pool: "forks" with poolOptions.forks.singleFork: true; singleThread is the equivalent option for the threads pool). Simplest, but it costs wall-clock time, because every suite boots its own server in beforeAll and those boots would no longer overlap.
- Give each suite its own fixture copy: copy FIXTURE_DIR into a per-suite temp dir (with its own node_modules symlink and .astro state) so the Astro lock no longer collides. Preserves parallelism.
- Isolate Astro's state dir per server: point each spawned server at a distinct cache/state location so the lock is per-process.
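For reference, the first option might look roughly like the following. This is a minimal sketch that assumes a Vitest version supporting the test.fileParallelism option. The existing contents of vitest.integration.config.ts are not shown in this issue, so any other settings in that file would need to be preserved.

```typescript
// vitest.integration.config.ts (sketch; merge with the file's existing options)
import { defineConfig } from "vitest/config";

export default defineConfig({
  test: {
    // Run test files one at a time so only one Astro dev server
    // uses the shared fixture directory at any moment.
    fileParallelism: false,
  },
});
```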
Option 2 keeps parallelism and is closest to the harness's existing per-suite temp-dir pattern (it already does this for the DB and uploads).
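A per-suite fixture copy could be sketched as below. createSuiteWorkDir is a hypothetical helper, not existing harness code. It assumes the lock lives under node_modules/.astro as described above, so it builds a real node_modules directory and links each package into it rather than symlinking node_modules as a whole. Whether Astro resolves the project directory through symlinks, and whether the virtual-module issue noted in server.ts reappears with this layout, would need to be verified.

```typescript
import { cpSync, existsSync, mkdirSync, mkdtempSync, readdirSync, symlinkSync } from "node:fs";
import { tmpdir } from "node:os";
import { basename, join, resolve } from "node:path";

// Directories that must not be shared between suites.
const SKIP = new Set(["node_modules", ".astro"]);

// Hypothetical helper: copy the fixture into a fresh temp dir so each suite's
// Astro process runs from its own project directory.
export function createSuiteWorkDir(fixtureDir: string, suiteName: string): string {
  const workDir = mkdtempSync(join(tmpdir(), `fixture-${suiteName}-`));

  // Copy sources, skipping node_modules (linked below) and stale .astro state.
  cpSync(fixtureDir, workDir, {
    recursive: true,
    filter: (src) => !SKIP.has(basename(src)),
  });

  // Link each installed package individually, leaving out node_modules/.astro
  // so the copy gets its own state directory instead of the shared one.
  const srcModules = resolve(fixtureDir, "node_modules");
  if (existsSync(srcModules)) {
    const destModules = join(workDir, "node_modules");
    mkdirSync(destModules);
    for (const entry of readdirSync(srcModules)) {
      if (entry === ".astro") continue;
      symlinkSync(join(srcModules, entry), join(destModules, entry));
    }
  }
  return workDir;
}
```

In createTestServer() the result would replace const workDir = FIXTURE_DIR, and the directory would be removed in the same teardown that already cleans up the temp DB and uploads.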
Acceptance
- Integration Tests job passes reliably across repeated runs.
- Add a reproducing condition (e.g. run the suites with parallelism forced on) so the fix is verifiable per the repo's TDD-for-bugs rule.