feat(janitor): add scoped snapshot cleanup for invalidated environments #5856

Open
mday-io wants to merge 9 commits into SQLMesh:main from mday-io:feat/invalidate-cleanup-snapshots

@mday-io mday-io commented Jun 21, 2026

Collaborator

Summary

Closes #5844

Adds scoped snapshot cleanup for invalidated environments.

sqlmesh invalidate --cleanup-snapshots now invalidates the target environment and then runs a scoped janitor cleanup. The cleanup removes only snapshots that the environment referenced and that no other environment still references. This avoids the previous workaround of running sqlmesh janitor --ignore-ttl, which performs a project-wide cleanup of all unreferenced snapshots.

The change also makes the scoped cleanup behavior available directly through:

sqlmesh janitor --environment dev_my_pr_env --ignore-ttl

This is particularly useful for CI/CD workflows where PR environments are frequently created and invalidated, but cleanup should not remove unrelated unreferenced snapshots from other invalidated environments.

How it works

  1. Captures the expired/invalidated environment's snapshot IDs before deleting the environment state.
  2. Uses get_expired_snapshots(..., target_snapshot_ids=...) to restrict candidate snapshots to that captured set.
  3. Still excludes any snapshot referenced by another active/finalized environment, such as prod.
  4. Emits a warning before scoped physical snapshot table cleanup.
  5. Drops physical tables through SnapshotEvaluator.cleanup().
  6. Deletes snapshot state through the normal janitor/state-sync expired snapshot path, so interval cleanup still runs with the same semantics as regular janitor cleanup.
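
The scoping rule in steps 1 to 3 can be illustrated with plain set logic. The sketch below is not the SQLMesh implementation: `SnapshotId` and `scoped_cleanup_candidates` are hypothetical names introduced for illustration, and TTL handling (what `--ignore-ttl` bypasses) is left out entirely.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Set


@dataclass(frozen=True)
class SnapshotId:
    """Hypothetical stand-in for a snapshot identifier (name plus version hash)."""

    name: str
    identifier: str


def scoped_cleanup_candidates(
    target_snapshot_ids: Iterable[SnapshotId],
    other_env_snapshot_ids: Dict[str, Set[SnapshotId]],
) -> Set[SnapshotId]:
    """Return the captured IDs that no other environment still references.

    Assumption: the caller captured target_snapshot_ids *before* the
    invalidated environment's state was deleted (step 1).
    """
    still_referenced: Set[SnapshotId] = set()
    for ids in other_env_snapshot_ids.values():
        still_referenced.update(ids)
    # Step 2 restricts candidates to the captured set; step 3 removes
    # anything another environment (for example prod) still points at.
    return set(target_snapshot_ids) - still_referenced


shared = SnapshotId("db.orders", "abc")
pr_only = SnapshotId("db.orders_new", "def")
unrelated = SnapshotId("db.other_pr_model", "ghi")

captured = {shared, pr_only}
others = {"prod": {shared}}

candidates = scoped_cleanup_candidates(captured, others)
print(candidates)
```

Only `pr_only` is selected here: `shared` is protected because prod references it, and `unrelated` is never considered because it was not in the captured set, even though nothing references it.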

Examples

sqlmesh invalidate dev_my_pr_env --cleanup-snapshots
sqlmesh janitor --environment dev_my_pr_env --ignore-ttl

Checklist

  • I have run make style and fixed any issues
  • I have added tests for my changes
  • All existing tests pass (make fast-test)
  • My commits are signed off (git commit -s) per the DCO

Additional validation

  • pytest tests/core/state_sync/test_state_sync.py -q: 92 passed
  • pytest tests/core/integration/test_aux_commands.py -q: 13 passed
mday-io added 2 commits June 19, 2026 10:53
In dbt 1.6/1.7 CI environments, snowflake-connector-python resolves to
versions 3.0-3.7.x which require pyOpenSSL<24.0.0. When combined with
cryptography>=42.0 (which removed the lib.GEN_EMAIL constant), importing
the Snowflake connector raises AttributeError: module 'lib' has no
attribute 'GEN_EMAIL', failing these tests:

  - test_snowflake_config (via SnowflakeConnectionConfig._validate_authenticator)
  - test_api_class_loading[snowflake] (via SnowflakeConfig.relation_class)

pyOpenSSL>=24.0.0 forces pip/uv to resolve snowflake-connector-python to
3.8.0+ (which allows pyOpenSSL<25.0.0, thus including 24.x). pyOpenSSL
24.0.0 fixed the GEN_EMAIL incompatibility with cryptography>=42.0.

dbt 1.3-1.5 and 1.8-1.10 are unaffected: earlier versions resolve
connector packages that avoid this code path; later versions of the
connector already widened cryptography's upper bound.

Also collapses a multi-line docstring in classproperty to a single line.

Signed-off-by: mday-io <mdaytn@gmail.com>
@mday-io mday-io marked this pull request as draft June 21, 2026 04:11
mday-io added 2 commits June 21, 2026 00:12
…eanup

Adds a `--cleanup-snapshots` flag to `sqlmesh invalidate` that immediately
deletes physical snapshot tables exclusively owned by the target environment,
without affecting snapshots shared with other environments (e.g. prod).

Previously, users had to run `sqlmesh janitor --ignore-ttl` separately after
invalidating, which performed a global cleanup across all environments. The
new flag provides a scoped alternative that:

1. Captures the environment's snapshot IDs before invalidation
2. Filters to only those not referenced by any other active environment
3. Drops the physical tables and removes the state records for those snapshots

Changes:
- cli/main.py: add --cleanup-snapshots flag to the invalidate command
- core/context.py: pass cleanup_snapshots through to invalidate_environment
- core/janitor.py: add delete_snapshots_for_environment() helper function
- core/state_sync/base.py: add target_snapshot_ids param to get/delete_expired_snapshots
- core/state_sync/db/facade.py: thread target_snapshot_ids through facade
- core/state_sync/db/snapshot.py: filter expired query by target_snapshot_ids when provided
- core/state_sync/cache.py: add target_snapshot_ids param to CachingStateSync

Closes SQLMesh#5844

Signed-off-by: mday-io <mdaytn@gmail.com>
- C1/M4: eliminate TOCTOU race in delete_snapshots_for_environment by
  calling state_sync.delete_snapshots(batch.expired_snapshot_ids) directly
  instead of re-querying via delete_expired_snapshots, so physical drops
  and state removal operate on the same snapshot ID set
- M1: remove always-truthy `if target_conditions:` guard in
  get_expired_snapshots (snapshot_id_filter always yields >= 1 condition)
- M2: when cleanup_snapshots=True and the environment does not exist, log
  a warning and return early instead of printing a misleading success message
- m1: unconditionally initialize target_snapshot_ids before the
  cleanup_snapshots block to prevent potential UnboundLocalError
- n1: enforce `sync = sync or cleanup_snapshots` explicitly so the
  implication is in code, not just docs; update docstring and CLI help
  to say "cleanup runs synchronously" instead of "Implies --sync"

Signed-off-by: mday-io <mdaytn@gmail.com>
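
The C1/M4 item in the commit above can be illustrated with a small, self-contained sketch. It is not the code from this PR: `FakeStateSync`, `expired_ids`, and `cleanup_once` are hypothetical names, and only `delete_snapshots` mirrors a method named in the commit message. The point is that the ID set is computed once and reused for both the physical drop and the state removal, so a concurrent state change between the two steps cannot make them diverge.

```python
from typing import Callable, FrozenSet, Iterable, List, Set


class FakeStateSync:
    """Hypothetical in-memory stand-in for the state store."""

    def __init__(self, snapshot_ids: Iterable[str]) -> None:
        self.snapshots: Set[str] = set(snapshot_ids)

    def expired_ids(self, targets: Iterable[str]) -> Set[str]:
        # Assumption: every stored snapshot in the target set counts as expired.
        return self.snapshots & set(targets)

    def delete_snapshots(self, ids: Iterable[str]) -> None:
        self.snapshots -= set(ids)


def cleanup_once(
    state: FakeStateSync,
    targets: Iterable[str],
    drop_tables: Callable[[FrozenSet[str]], None],
) -> FrozenSet[str]:
    batch = frozenset(state.expired_ids(targets))  # queried exactly once
    drop_tables(batch)                             # physical cleanup
    state.delete_snapshots(batch)                  # same IDs, no re-query
    return batch


state = FakeStateSync({"a", "b"})
dropped: List[str] = []


def drop_tables(batch: FrozenSet[str]) -> None:
    dropped.extend(sorted(batch))
    # Simulate a concurrent writer adding a matching snapshot mid-cleanup.
    state.snapshots.add("c")


batch = cleanup_once(state, {"a", "b", "c"}, drop_tables)
print(dropped, state.snapshots)
```

Had the state removal re-queried after the drop, it would also have matched `c` and deleted its state record without its table ever being dropped. With a single captured batch, `c` stays in state for a later cleanup pass.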
@mday-io mday-io force-pushed the feat/invalidate-cleanup-snapshots branch from ae27f17 to 5c9d308 Compare June 21, 2026 04:13
@mday-io mday-io marked this pull request as ready for review July 1, 2026 18:08
@mday-io mday-io changed the title Feat/invalidate cleanup snapshots Jul 1, 2026