feat(aci): deduplicate workflows API endpoint #106482
base: master
Conversation
🚨 Warning: This pull request contains Frontend and Backend changes! Making changes to Sentry's Frontend and Backend in a single pull request is discouraged: the two are not atomically deployed. If the changes are interdependent, they must be separated into two pull requests and made forward- or backward-compatible, so that the Backend or Frontend can each be safely deployed independently. Have questions? Please ask in the
    ).delete()

    # Now delete the duplicate workflows
    Workflow.objects.filter(id__in=workflow_ids).delete()
WorkflowFireHistory records lost during workflow deduplication
Medium Severity
The code updates DetectorWorkflow and AlertRuleWorkflow references to point to the canonical workflow before deleting duplicates, preserving those associations. However, WorkflowFireHistory (which tracks when workflows fired) is not updated. When duplicate workflows are deleted, their WorkflowFireHistory entries are cascade-deleted due to the FK's default on_delete=CASCADE, permanently losing historical fire data.
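The fix the bot implies is to re-point the history rows at the canonical workflow before deleting the duplicate, so the cascade has nothing left to drop. A minimal sketch of that ordering, using sqlite3 in place of Django's ORM (the table and column names here are illustrative, not the PR's actual schema):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE workflow (id INTEGER PRIMARY KEY)")
conn.execute(
    """CREATE TABLE fire_history (
        id INTEGER PRIMARY KEY,
        workflow_id INTEGER REFERENCES workflow(id) ON DELETE CASCADE
    )"""
)
conn.executemany("INSERT INTO workflow (id) VALUES (?)", [(1,), (2,)])
conn.execute("INSERT INTO fire_history (workflow_id) VALUES (2)")  # fired by the duplicate

# Re-point history at the canonical workflow (id=1) BEFORE deleting the
# duplicate (id=2); otherwise ON DELETE CASCADE silently drops the row.
conn.execute("UPDATE fire_history SET workflow_id = 1 WHERE workflow_id = 2")
conn.execute("DELETE FROM workflow WHERE id = 2")

remaining = conn.execute("SELECT COUNT(*) FROM fire_history").fetchone()[0]
print(remaining)  # 1 — the history row survived the deletion
```

In Django terms this would be a bulk `.update(workflow_id=...)` on the history queryset, executed before the `Workflow.objects.filter(id__in=workflow_ids).delete()` call.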
saponifi3d
left a comment
i think the biggest thing to investigate / fix would be the transaction failing on a constraint violation.
    raise

    def test_deduplication__multiple_triggers(self) -> None:
        self.run_configuration(idx=0)
😭 i wish parameterize worked.
def deduplicate_workflows(organization: Organization):
    workflows = (
        Workflow.objects.filter(organization=organization)
        .exclude(detectorworkflow__detector__type="error")  # grouping.grouptype ErrorGroupType.slug
🤔 should we exclude error workflows? (i'd expect them to still work the same way, just connected to more detectors)
They were excluded in the original migration, but I don't see any issues with including them!
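For context, deduplication of this kind typically buckets the candidate workflows by a content hash and keeps one canonical row per bucket. A hypothetical sketch of that grouping step (the plain-dict rows and `hash` field are illustrative, not the PR's actual code):

```python
from collections import defaultdict


def group_duplicates(workflows: list[dict]) -> dict[int, list[int]]:
    """Map each canonical workflow id to the ids of its duplicates."""
    buckets: dict[str, list[int]] = defaultdict(list)
    for wf in workflows:
        buckets[wf["hash"]].append(wf["id"])
    # Keep the first id in each bucket as canonical; the rest are duplicates.
    return {ids[0]: ids[1:] for ids in buckets.values() if len(ids) > 1}


dupes = group_duplicates(
    [{"id": 1, "hash": "a"}, {"id": 2, "hash": "a"}, {"id": 3, "hash": "b"}]
)
print(dupes)  # {1: [2]}
```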
src/sentry/workflow_engine/endpoints/organization_deduplicate_workflows.py (outdated comment, resolved)
    Action.objects.filter(
        dataconditiongroupaction__condition_group__workflowdataconditiongroup__workflow_id__in=workflow_ids
a todo for myself from this is to improve the Workflow and Action models [ticket]; it feels like we should be able to select all the actions associated with a workflow, and all the workflows an action is associated with more easily.
@region_silo_endpoint
class OrganizationDeduplicateWorkflowsEndpoint(OrganizationEndpoint):
    permission_classes = (OrganizationWorkflowPermission,)
not sure if we should allow all the same users to do this or just org admins. It's a scarier button than most of the others.
Org admins seems reasonable here.
Cursor Bugbot has reviewed your changes and found 2 potential issues.
Bugbot Autofix is OFF. To automatically fix reported issues with Cloud Agents, enable Autofix in the Cursor dashboard.
    except IntegrityError:
        # the DetectorWorkflow or AlertRuleWorkflow connections that we're attempting to create
        # already exist. We should just continue with the rest of the process for this workflow.
        pass
IntegrityError handling leaves transaction in aborted state
High Severity
The try/except IntegrityError block catches the exception but doesn't properly handle the PostgreSQL transaction state. In PostgreSQL, when an IntegrityError occurs within a transaction, subsequent operations fail with "current transaction is aborted." The delete operations on lines 175-185 that follow will fail. A nested transaction.atomic() block around the update operations is needed to create a savepoint that can be rolled back independently.
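The savepoint pattern the bot describes can be illustrated with sqlite3 (PostgreSQL is stricter: after a constraint error it aborts the whole transaction, which is exactly why Django needs a nested `transaction.atomic()` there; the table below is illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.isolation_level = None  # manual transaction control
conn.execute(
    "CREATE TABLE detector_workflow ("
    " detector_id INT, workflow_id INT, UNIQUE (detector_id, workflow_id))"
)
conn.execute("INSERT INTO detector_workflow VALUES (1, 100)")

conn.execute("BEGIN")
conn.execute("SAVEPOINT reassign")  # like entering a nested transaction.atomic()
try:
    # Re-pointing a connection that already exists on the canonical
    # workflow violates the unique constraint.
    conn.execute("INSERT INTO detector_workflow VALUES (1, 100)")
    conn.execute("RELEASE SAVEPOINT reassign")
except sqlite3.IntegrityError:
    # Roll back only to the savepoint; the outer transaction stays usable.
    conn.execute("ROLLBACK TO SAVEPOINT reassign")

# The follow-up deletes still succeed because the transaction isn't aborted.
conn.execute("DELETE FROM detector_workflow WHERE workflow_id = 100")
conn.execute("COMMIT")
count = conn.execute("SELECT COUNT(*) FROM detector_workflow").fetchone()[0]
print(count)  # 0
```

In the PR's code, wrapping the update block in an inner `with transaction.atomic():` gives the same savepoint-and-rollback behavior automatically.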
        "trigger_conditions": trigger_conditions,
    }

    return json.dumps(workflow_data)
Missing sort_keys causes inconsistent workflow hashing
Medium Severity
The final json.dumps(workflow_data) call lacks sort_keys=True, while the sorting operations (lines 73, 75, 86, 88) use it. Nested dicts like comparison, config, and self.workflow.config retain their original key order when serialized. Two semantically identical workflows with different dict key orderings (e.g., from different DB load orders) will produce different hashes and won't be recognized as duplicates.
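The suggested fix is small: passing `sort_keys=True` canonicalizes nested dicts so key order cannot change the serialized form or the resulting hash. A minimal illustration (the field names are made up, not the PR's actual payload):

```python
import hashlib
import json


def workflow_hash(data: dict) -> str:
    # sort_keys=True sorts keys recursively, so two semantically identical
    # payloads serialize (and hash) identically regardless of insertion order.
    payload = json.dumps(data, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()


a = {"config": {"frequency": 30, "environment": None}, "when": "all"}
b = {"when": "all", "config": {"environment": None, "frequency": 30}}

print(json.dumps(a) == json.dumps(b))        # False: unsorted form is order-sensitive
print(workflow_hash(a) == workflow_hash(b))  # True: sorted hash matches
```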
Add OrganizationDeduplicateWorkflowsEndpoint. This endpoint calls a method that deduplicates identical workflows for non-error detectors.