Skip to content

fix(jax): structural defense against cached_property pytree/dict leaks#1302

Merged
Jammy2211 merged 1 commit into
mainfrom
feature/cached-property-pytree-guard
May 29, 2026
Merged

fix(jax): structural defense against cached_property pytree/dict leaks#1302
Jammy2211 merged 1 commit into
mainfrom
feature/cached-property-pytree-guard

Conversation

@Jammy2211

Copy link
Copy Markdown
Collaborator

Summary

Follow-up to #1300. The previous PR fixed a specific leak; this PR fixes the class of bug: every __dict__ walker on the model side uses an opt-out filter (underscore prefix + hardcoded blacklist), so any future @cached_property on a model class silently reproduces the same 38-script JAX failure mode.

Six sites in autofit/mapper/ are extended with a frozenset union of the class's cached_property names (sourced from PyAutoLabs/PyAutoConf#111's new cached_property_names helper):

File:line Site
autofit/mapper/model.py:86 AbstractModel.__getstate__
autofit/mapper/model.py:467 ModelInstance.dict
autofit/mapper/model_object.py:332 ModelObject._dict (feeds Collection.items)
autofit/mapper/prior_model/abstract.py:1280 AbstractModel.items
autofit/mapper/prior_model/collection.py:289 Collection._instance_for_arguments
autofit/mapper/prior_model/prior_model.py:497 Model._instance_for_arguments

A new classmethod AbstractModel._cached_property_names() delegates to the autoconf helper so the filter sites all read from a single source of truth.

Identifier-hash stability — verified

Critical for release: the unique_identifier hash (search output directory name, fit resume key, DB lookup) must not shift. Verified:

  1. The identifier (autofit/mapper/identifier.py:66-159) walks __dict__ independently — it does NOT call any of the 6 sites modified here.
  2. The identifier's own walker already filters _-prefix keys at line 137.
  3. Post-fix(jax): keep parameterization cache off ModelInstance + auto-register pytrees #1300, zero live @cached_property exists on any AbstractModel descendant in PyAutoFit/PyAutoArray/PyAutoGalaxy/PyAutoLens. The defense is purely forward-compat.
  4. Stress-test on three representative model shapes confirms byte-identical hashes pre/post defense:
    • Collection(gaussian=Model(Gaussian))f7f19073...
    • Nested Collection(a=Model, b=Collection(nested=Model))04e1328c...
    • Collection(Model(Gaussian, centre=(0.0, 0.0)))36084d2c...

Test plan

  • pytest test_autofit/mapper/test_parameterization.py -v — 13/13 pass (2 new tests + 11 existing).
  • pytest test_autofit -q — 1415/1415 pass (1413 prior + 2 new), 1 skipped (no regressions).
  • Identifier stability stress test on 3 model shapes — byte-identical strings pre/post.

Dependency

Depends on PyAutoLabs/PyAutoConf#111 (must merge first — provides the cached_property_names MRO walker).

A paired PyAutoArray PR ships the same defense on the pytree flatten side: PyAutoLabs/PyAutoArray#TBD.

🤖 Generated with Claude Code

PR #1300 fixed a specific leak where AbstractPriorModel.parameterization
(a `@functools.cached_property` added in commit 4564ae9) leaked its
cached string into every ModelInstance via
Collection._instance_for_arguments. That broke 38 JAX jit(fit_from)
calls and the autofit_workspace overview_1 smoke (clusters C1+C4).

The minimal fix renamed the cache key to `_parameterization_cache` so
the existing `_`-prefix filter at each `__dict__` iterator skipped it.
The structural problem remained: every walker uses an opt-out filter
(blacklist + underscore prefix), so any future cached_property declared
on a model class silently reproduces the same class of bug.

This PR closes the class:

- New classmethod `AbstractModel._cached_property_names(cls)` delegates
  to `autoconf.tools.decorators.cached_property_names` (PyAutoConf #111),
  returning a frozenset of every functools.cached_property and autoconf
  CachedProperty descriptor name in the MRO.
- Extend the filter at every `__dict__` iteration site to union the
  pre-existing exclusion with this frozenset:
    autofit/mapper/model.py            (__getstate__, ModelInstance.dict)
    autofit/mapper/model_object.py     (ModelObject._dict — feeds Collection.items)
    autofit/mapper/prior_model/abstract.py        (AbstractModel.items)
    autofit/mapper/prior_model/collection.py      (Collection._instance_for_arguments)
    autofit/mapper/prior_model/prior_model.py     (Model._instance_for_arguments)

Identifier-hash stability verified: the unique_identifier walker at
`autofit/mapper/identifier.py` does NOT call any of these 6 sites — it
walks `__dict__` independently with its own `_`-prefix filter. Three
representative model shapes (simple Collection, nested Collection,
Model with tuple arg) all produce byte-identical identifier hashes
pre- and post-defense:
  simple:     f7f19073a8fb19b3d11231fb6eef7e3b ✓
  nested:     04e1328c84a1e4c3a81a9d3544dd19f5 ✓
  with_tuple: 36084d2c3fec27e0b7aa504add0bd898 ✓

Tests:
- `test_cached_property_names_classmethod_walks_mro`: confirms the
  classmethod surfaces MRO-declared descriptors and memoises per-class.
- `test_cached_property_excluded_from_all_dict_walks`: ships a synthetic
  GuardedCollection with a cached_property returning a string; asserts
  the value never appears in instance.__dict__, instance.dict,
  model.items(), tree_flatten() leaves, __getstate__, or pickle
  round-trip.
- 1415/1415 PyAutoFit tests pass (1413 prior + 2 new).

Depends on: PyAutoLabs/PyAutoConf#111.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@Jammy2211 Jammy2211 merged commit 76d6aef into main May 29, 2026
5 of 7 checks passed
@Jammy2211 Jammy2211 deleted the feature/cached-property-pytree-guard branch May 29, 2026 07:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

1 participant