Integration tests flaky: concurrent Astro dev servers collide on shared fixture lock ("Another astro dev server is already running") #1604

Description

@MA2153

Summary

The Integration Tests CI job (vitest run --config vitest.integration.config.ts) fails intermittently with:

Error: Server at http://localhost:4397/_emdash/api/setup/dev-bypass did not start within 90000ms

Server output (last 5000 chars):
Another astro dev server is already running.

  URL:  http://localhost:4398
  PID:  3512

Run `astro dev stop` to stop it, or use `astro dev --force` to replace it.

This is an infra/harness flake, not a product or test-logic bug. In the failing run, 61 tests passed and only 2 suites (field-widgets.test.ts, oauth-discovery.test.ts) failed — both for the identical "already running" reason.

Seen on PR #1535 (job log).

Root cause

Each integration suite boots a real Astro dev server via createTestServer() in packages/core/tests/integration/server.ts. Two facts combine to cause the race:

  1. All suites share one fixture directory. The server runs the fixture in place: const workDir = FIXTURE_DIR; ... spawn(astroBin, ["dev", ...], { cwd: workDir }) (server.ts:160, 181). The comment there explains that running in place avoids Astro virtual-module resolution issues with symlinked temp dirs. Only the DB and uploads live in a temp dir.

  2. Vitest runs test files in parallel by default. vitest.integration.config.ts sets no pool/fileParallelism options, so each *.test.ts runs in its own worker concurrently.

Astro's "another dev server is already running" guard is scoped to the project directory (a lockfile under the shared node_modules/.astro), not to the port. So even though each suite picks a distinct hardcoded port:

Suite                     Port
comments.test.ts          4396
field-widgets.test.ts     4397
cli.test.ts               4398
client.test.ts            4399
oauth-discovery.test.ts   4401

…the second server to start from the same fixture dir aborts immediately because another is already holding the lock. The tell: the suite requested port 4397 but the error reports a conflicting server at 4398 (a different suite's port). waitForServer then spins for the full 90s and the suite fails.

It's timing-dependent (whoever wins the lock first survives), which is why it only flakes sometimes.
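The tell described above (requested port versus reported port) can be checked mechanically. The following is a minimal sketch, not code from the harness: parseLockConflict and isLockCollision are hypothetical helpers, and the message format is assumed to match the log excerpt quoted in this issue. A check like this could also let waitForServer fail fast instead of waiting out the full timeout.

```typescript
// Hypothetical helpers (not part of server.ts). They assume the
// "already running" message looks like the log excerpt in this issue.
interface LockConflict {
  url: string;
  pid: number;
  port: number;
}

export function parseLockConflict(output: string): LockConflict | null {
  if (!output.includes("Another astro dev server is already running")) {
    return null;
  }
  const url = /URL:\s+(\S+)/.exec(output)?.[1];
  const pid = /PID:\s+(\d+)/.exec(output)?.[1];
  if (!url || !pid) return null;
  // Pull the port out of the reported URL, e.g. http://localhost:4398.
  const port = /:(\d+)(?:\/|$)/.exec(url)?.[1];
  if (!port) return null;
  return { url, pid: Number(pid), port: Number(port) };
}

// True when the server that holds the lock is not the one this suite asked for.
export function isLockCollision(output: string, requestedPort: number): boolean {
  const conflict = parseLockConflict(output);
  return conflict !== null && conflict.port !== requestedPort;
}
```

For the failing run quoted above, the suite requested port 4397 and the output names a server on 4398, so isLockCollision(output, 4397) would flag it as a collision with another suite.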

Suggested fixes (pick one)

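  1. Run integration suites serially: set fileParallelism: false in vitest.integration.config.ts (or force a single worker, e.g. pool: "forks" with poolOptions.forks.singleFork in Vitest versions that have poolOptions; singleThread is the equivalent for the threads pool). Simplest, but costs wall-clock time because each suite boots its own server in beforeAll.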
  2. Give each suite its own fixture copy — copy FIXTURE_DIR into a per-suite temp dir (with its own node_modules symlink and .astro state) so the Astro lock no longer collides. Preserves parallelism.
  3. Isolate Astro's state dir per server — point each spawned server at a distinct cache/state location so the lock is per-process.
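For reference, option 1 might look like the following. This is a sketch only: it assumes a Vitest version that supports fileParallelism, and it omits whatever other options the existing vitest.integration.config.ts already sets.

```typescript
// vitest.integration.config.ts (sketch; existing options omitted)
import { defineConfig } from "vitest/config";

export default defineConfig({
  test: {
    // Run test files one at a time, so only one Astro dev server
    // is started from the shared fixture directory at any moment.
    fileParallelism: false,
  },
});
```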

Option 2 keeps parallelism and is closest to the harness's existing per-suite temp-dir pattern (it already does this for the DB and uploads).
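A rough sketch of option 2 follows. createSuiteFixture is a hypothetical helper, not existing harness code. Because the lock is described above as living under node_modules/.astro, the sketch assumes a whole-directory node_modules symlink would still share that lock, so it links each node_modules entry individually and leaves .astro out. This has not been tested against Astro, and the server.ts comment about virtual-module resolution in symlinked temp dirs suggests it would need verifying.

```typescript
import { cpSync, existsSync, mkdirSync, mkdtempSync, readdirSync, symlinkSync } from "node:fs";
import { tmpdir } from "node:os";
import { join, relative, sep } from "node:path";

// Top-level entries that are not copied: dependencies are linked instead,
// and Astro state must start empty for each suite.
const EXCLUDED = new Set(["node_modules", ".astro"]);

// Hypothetical helper: returns a private working copy of the fixture.
export function createSuiteFixture(fixtureDir: string, suiteName: string): string {
  const workDir = mkdtempSync(join(tmpdir(), `fixture-${suiteName}-`));

  // Copy the fixture sources, skipping the excluded top-level entries.
  cpSync(fixtureDir, workDir, {
    recursive: true,
    filter: (src) => !EXCLUDED.has(relative(fixtureDir, src).split(sep)[0] ?? ""),
  });

  // Give the copy a real node_modules directory whose entries link back to
  // the shared install, so node_modules/.astro is private to this copy.
  const sharedModules = join(fixtureDir, "node_modules");
  const ownModules = join(workDir, "node_modules");
  mkdirSync(ownModules);
  if (existsSync(sharedModules)) {
    for (const entry of readdirSync(sharedModules)) {
      if (entry === ".astro") continue;
      symlinkSync(join(sharedModules, entry), join(ownModules, entry));
    }
  }
  return workDir;
}
```

Under these assumptions, createTestServer() would call something like createSuiteFixture(FIXTURE_DIR, "field-widgets") and pass the result as cwd instead of FIXTURE_DIR, then remove the directory during teardown.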

Acceptance

  • Integration Tests job passes reliably across repeated runs.
  • Add a reproducing condition (e.g. run the suites with parallelism forced on) so the fix is verifiable per the repo's TDD-for-bugs rule.
