fix(workflow_engine): Add a cache for Workflows to reduce DB load #106925
base: master
Conversation
…che if a relationship to the detector or environment changes.
```python
if workflows is None:
    environment_filter = (
        (Q(environment_id=None) | Q(environment_id=environment.id))
```
🤔 -- this line from the original query will likely make our cache a lot larger than it needs to be. We could normalize it by caching the "environment = None" results as one entry and the per-environment results as another.
I'd like to see the impact / size of the cache before diving into changing the functionality of this query, though.
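A rough sketch of that normalization idea, for discussion only (the key format, TTL, and filter fields are assumptions, not the PR's actual helpers):

```python
from django.core.cache import cache

from sentry.workflow_engine.models import Workflow

CACHE_TTL = 60  # assumed TTL, in seconds


def _cached_workflows(detector_id: int, environment_id: int | None) -> set[Workflow]:
    # One cache entry for the environment-agnostic workflows...
    base_key = f"workflow_engine:processing_workflows:{detector_id}:none"
    base = cache.get(base_key)
    if base is None:
        base = set(
            Workflow.objects.filter(
                detectorworkflow__detector_id=detector_id, environment_id=None
            )
        )
        cache.set(base_key, base, timeout=CACHE_TTL)

    if environment_id is None:
        return base

    # ...and a separate entry per environment, so the env=None results are not
    # duplicated into every per-environment cache value.
    env_key = f"workflow_engine:processing_workflows:{detector_id}:{environment_id}"
    env_specific = cache.get(env_key)
    if env_specific is None:
        env_specific = set(
            Workflow.objects.filter(
                detectorworkflow__detector_id=detector_id,
                environment_id=environment_id,
            )
        )
        cache.set(env_key, env_specific, timeout=CACHE_TTL)

    return base | env_specific
```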
… processing on changes
```python
    .distinct()
)

cache.set(cache_key, workflows, timeout=CACHE_TTL)
```
I'm curious what our observability of caching is here.
I know in traces one type of cache (django? is this django cache or only sometimes?) doesn't show up, and that's been a bit of a pain for debugging.
Also, it'd be nice if we could have counters for hit/miss so we can brag about how many queries we're avoiding.
yeah, i kinda purposefully was avoiding obs / counters thus far. 😅
did you have any specific obs in mind? i'm thinking a metric for cache hit / miss / invalidation.
🤔 maybe debug logs for cache miss and when we invalidate? (thinking a stack trace might be handy with signals. could at least see which models are causing invalidations etc)
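Something along these lines could cover hit / miss / invalidation (a sketch only; it assumes the usual `sentry.utils.metrics` helper and a module logger, and the metric and log key names are made up):

```python
import logging

from sentry.utils import metrics

logger = logging.getLogger(__name__)


def _record_cache_result(cache_key: str, hit: bool) -> None:
    # Counter for hit/miss so we can quantify how many queries the cache avoids.
    metrics.incr(
        "workflow_engine.processing_workflows.cache",
        tags={"result": "hit" if hit else "miss"},
    )
    if not hit:
        logger.debug(
            "workflow_engine.processing_workflows.cache_miss",
            extra={"cache_key": cache_key},
        )


def _record_cache_invalidation(cache_key: str, model_name: str) -> None:
    # Tagging by model makes it visible which signal senders drive invalidations.
    metrics.incr(
        "workflow_engine.processing_workflows.cache_invalidation",
        tags={"model": model_name},
    )
    logger.debug(
        "workflow_engine.processing_workflows.cache_invalidated",
        extra={"cache_key": cache_key, "model": model_name},
    )
```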
```python
    This method uses a read-through cache, and returns which workflows to evaluate.
    """
    env_id = environment.id if environment is not None else None
    cache_key = processing_workflow_cache_key(detector.id, env_id)
```
If you like barely justified abstractions, we have a CacheAccess[T] thing.
The idea is that you define a subclass like

```python
class _ProcessingWorkflowCacheAccess(CacheAccess[set[Workflow]]):
    def __init__(self, ..., ttl=DEFAULT_TTL) -> None:
        # verify params, save key

    def key(self) -> str:
        return self._key

...
cache_access = _ProcessingWorkflowCacheAccess(detector, environment)
workflows = cache_access.get()
..
cache_access.set(workflows)
```

Not game-changing, but this came after we used the wrong key in one place and had some wrong type assumptions about cached values, so it seemed appropriate to try an abstraction that ensures consistent key use and type safety.
(it doesn't have delete, but it should).
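For illustration only, a generic base along those lines (including the missing delete) might look something like the following; this is a hypothetical sketch, not Sentry's actual CacheAccess implementation:

```python
from typing import Generic, TypeVar

from django.core.cache import cache

T = TypeVar("T")


class CacheAccess(Generic[T]):
    """Pairs a single cache key with a single value type, so callers can't mix up
    keys or make wrong type assumptions about cached values."""

    def __init__(self, key: str, ttl: int) -> None:
        self._key = key
        self._ttl = ttl

    def key(self) -> str:
        return self._key

    def get(self) -> T | None:
        # A miss is represented as None; a sentinel would be needed if None
        # were ever a legitimate cached value.
        return cache.get(self._key)

    def set(self, value: T) -> None:
        cache.set(self._key, value, timeout=self._ttl)

    def delete(self) -> None:
        cache.delete(self._key)
```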
👍 -- i like it. i was thinking of something similar tbh 🤣 i always fear text-based keys.
```python
from sentry.workflow_engine.models import Detector


@receiver(post_save, sender=Detector)
```
Q: why did we end up going with post_save signals on Detector? Is it because of the lack of SOPA?
```python
from sentry.workflow_engine.models import DetectorWorkflow


@receiver(post_migrate, sender=DetectorWorkflow)
```
Q: why are these signals on post_migrate and pre_save for invalidation?
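For reference (not the PR's code), signal-based invalidation is more commonly hung off post_save / post_delete on the relationship model, roughly as sketched below; the cache-key helper, key format, and the `detector_id` field access are assumptions:

```python
from django.core.cache import cache
from django.db.models.signals import post_delete, post_save
from django.dispatch import receiver

from sentry.workflow_engine.models import DetectorWorkflow


def processing_workflow_cache_key(detector_id: int, environment_id: int | None) -> str:
    # Hypothetical key format; the PR's actual helper may differ.
    return f"workflow_engine:processing_workflows:{detector_id}:{environment_id}"


@receiver(post_save, sender=DetectorWorkflow)
@receiver(post_delete, sender=DetectorWorkflow)
def invalidate_processing_workflow_cache(sender, instance, **kwargs):
    # Any change to the Detector <-> Workflow relationship invalidates the cached,
    # environment-agnostic workflow set for that detector. Per-environment keys
    # would need to be enumerated (or matched via a wildcard scheme, as the PR
    # does) to be cleared as well.
    cache.delete(processing_workflow_cache_key(instance.detector_id, None))
```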
```python
if detector_id is None:
    detector_id = "*"
```
Q: This part is probably still in progress, but I'm confused about why we're putting a wildcard here, since it doesn't look like we ever set one during the initial cache population. Also, we should probably add a comment, or separate out the function, so that both detector_id and env_id being None means "clear everything".
I guess, is it possible for only one of the params to be None, and if so, when would that happen?
Description
We select workflows from the DB very frequently. This has added substantial load to our DB, even though the query itself is fast and efficient.
This PR introduces a caching layer for this high-frequency DB query.
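As a rough illustration of the read-through pattern described above (the function and key names, TTL, and query details are assumptions based on the visible diff, not the PR's exact code):

```python
from django.core.cache import cache
from django.db.models import Q

from sentry.workflow_engine.models import Workflow

CACHE_TTL = 60  # assumed TTL, in seconds


def processing_workflow_cache_key(detector_id: int, environment_id: int | None) -> str:
    return f"workflow_engine:processing_workflows:{detector_id}:{environment_id}"


def get_processing_workflows(detector, environment) -> set[Workflow]:
    """Read-through cache: return the cached workflow set, or query and cache it."""
    env_id = environment.id if environment is not None else None
    cache_key = processing_workflow_cache_key(detector.id, env_id)

    workflows = cache.get(cache_key)
    if workflows is None:
        workflows = set(
            Workflow.objects.filter(
                Q(environment_id=None) | Q(environment_id=env_id),
                detectorworkflow__detector_id=detector.id,
            ).distinct()
        )
        cache.set(cache_key, workflows, timeout=CACHE_TTL)

    return workflows
```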