Match review-context rules on reviewer groups#6251
Open
msujaws wants to merge 4 commits into
Open
Conversation
a4ac18b to
3168d8d
Compare
Adds a cheap pre-screen so the (expensive) review agent only runs on revisions
that score low on risk and complexity, letting teams "ease in" by raising
thresholds as confidence grows.
Library (bugbug/tools/code_review/):
- diff_analysis.py: classify the patch's in-diff test signal (absent / sparse /
adequate / tests_only / no_code_change) and collect changed non-test paths.
- test_coverage.py: probe mozilla-central via the async searchfox client for
existing tests referencing the changed files (covered / partial / uncovered);
degrades gracefully if searchfox is unavailable.
- scoring.py: RiskComplexityScorer, a single structured Haiku call returning
0-10 risk/complexity plus factors, fed the precomputed <test_signals> block.
- agent.py: inject the test-signals block into the review prompt and surface
token usage + tool-call counts in the response details.
- llms.py: DEFAULT_SCORING_MODEL and a shared usage_from_messages() helper.
Service (reviewhelper-api):
- reviewer_groups.py + reviewer_groups.yaml: per-group thresholds (with
defaults) loaded from a checked-in YAML file; matching_groups() resolves a
revision's reviewer groups via the new PhabricatorPatch plumbing.
- review_processor.py: _score_and_gate() scores before reviewing and raises
ReviewSkipped("above_threshold") when either score meets its threshold.
- ReviewStatus.SKIPPED (terminal, no retry) + skipped_reason and
risk/complexity/test-signal columns on review_requests, recorded even when
skipped; alembic migration + status index.
- internal.py records skips as SKIPPED (HTTP 204); request.py messaging.
- pricing.py: per-model token pricing for later cost reporting.
Consume PhabricatorPatch.historical_reviewer_project_phids so a group that was assigned via a rotation (added, then swapped for an individual and removed) still matches. Add an `aliases` field to ReviewerGroup so a rotation project (e.g. home-newtab-reviewers-rotation) counts as the same logical group as home-newtab-reviewers, and resolve every slug (own + aliases) when matching. Falls back to the current reviewer snapshot for patches without history.
Builds on the reviewer-groups config to control rollout per team: - Only revisions requesting review from a group marked `enabled: true` are auto-reviewed (skip reason "not_enabled"), so teams can be turned on one at a time (e.g. ip-protection-reviewers first, home-newtab-reviewers later). - `restrict_to_member_authors` (default true) reviews only patches whose author is a member of the group's Phabricator project (skip "author_not_in_group"); an empty membership lookup over-includes rather than dropping everything. - An optional per-group `opt_out` PHID list excludes individual members (skip "author_opted_out"). These cheap metadata gates run before the paid scoring call. Adds tests for each gate.
Implements the previously-stubbed `review` predicate in the code-review external-content engine so a review-context.toml rule can target a reviewer group (the capability upstream left unimplemented): - review_context.py: implement _review_matches for `review.reviewer`, `review.blocking_reviewer`, and `review.author`. A predicate matches when every facet it specifies matches; reviewer groups are matched by Phabricator project slug (resolved via resolve_project_phid) against the revision's requested reviewers. Thread a ReviewInfo (author + reviewer/blocking group PHIDs) through rule_matches/predicate_matches/collect_actions, built from the patch in load_external_content_for_diff. Fails closed when no review info is available (e.g. a GitHub diff). - phabricator.py: expose blocking_reviewer_project_phids and author_username, refactoring the reviewers fetch into a shared _reviewers cached_property. - docs/code-review-skills.md: document the now-working `review` predicate. This replaces the earlier bespoke per-group skill mechanism: teams attach area guidance by reviewer group through the existing review-context.toml, with no second skill system. Depends on the reviewer-group plumbing from the PhabricatorPatch membership change.
3168d8d to
2dcfe1a
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Depends on #6248, #6249, and #6250.
Implements the previously-stubbed
reviewpredicate in the code-review external-content engine so a review-context.toml rule can target a reviewergroup (a capability upstream left unimplemented). This replaces the earlier bespoke per-group skill mechanism: teams attach area guidance by reviewer group through the existing review-context.toml, with no second skill system.
review.reviewer,review.blocking_reviewer, andreview.author. A predicate matches whenevery facet it specifies matches; reviewer groups match by Phabricator project slug (via resolve_project_phid) against the revision's requested reviewers. Threads a ReviewInfo (author + reviewer/blocking group PHIDs) through rule_matches / predicate_matches / collect_actions, built from the patch in load_external_content_for_diff. Fails closed when no review info is available (e.g. a GitHub diff).
reviewpredicate.