Skip to content

fix: retry sitemap fetching on error and raise when retries are exhausted#1943

Merged
vdusek merged 2 commits into
masterfrom
fix/sitemap-fetch-retries
Jun 4, 2026
Merged

fix: retry sitemap fetching on error and raise when retries are exhausted#1943
vdusek merged 2 commits into
masterfrom
fix/sitemap-fetch-retries

Conversation

@vdusek

@vdusek vdusek commented Jun 3, 2026

Copy link
Copy Markdown
Collaborator

Description

The try/except in _fetch_and_process_sitemap wrapped the entire while retries_left > 0 loop instead of the loop body. Any exception raised while streaming a sitemap therefore exited the loop after a single attempt, logged a warning, and silently ended the async generator — no retry ever happened, sitemap_retries was effectively dead code, and callers (Sitemap.load, parse_sitemap, SitemapRequestLoader) saw empty or partial results with no error.

This PR moves the try/except inside the loop body so each attempt is independently retried, and raises the error once all retries are exhausted instead of swallowing it.

Behavior notes

  • A sitemap that keeps failing after all retries now raises to the caller instead of silently producing empty results. SitemapRequestLoader already logs and propagates this via its own error handling.
  • If a stream fails after yielding some items, the retry re-fetches the sitemap from scratch, so items yielded before the failure may be yielded again. This is pre-existing streaming-generator semantics, now just reachable via the (working) retry path.
…sted

The try/except in _fetch_and_process_sitemap wrapped the whole retry
loop instead of its body, so any fetch error exited the loop after a
single attempt and was silently swallowed, leaving callers with empty
or partial results. Each attempt is now independently retried and the
error is raised once all retries are exhausted.
@vdusek vdusek added t-tooling Issues with this label are in the ownership of the tooling team. adhoc Ad-hoc unplanned task added during the sprint. labels Jun 3, 2026
@vdusek vdusek self-assigned this Jun 3, 2026
@github-actions github-actions Bot added this to the 142nd sprint - Tooling team milestone Jun 3, 2026
@github-actions github-actions Bot added the tested Temporary label used only programatically for some analytics. label Jun 3, 2026
@codecov

codecov Bot commented Jun 3, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 93.03%. Comparing base (f7cde2e) to head (c925b9b).

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1943      +/-   ##
==========================================
+ Coverage   92.92%   93.03%   +0.10%     
==========================================
  Files         167      167              
  Lines       11710    11712       +2     
==========================================
+ Hits        10882    10896      +14     
+ Misses        828      816      -12     
Flag Coverage Δ
unit 93.03% <100.00%> (+0.10%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.
@vdusek vdusek requested a review from Mantisus June 3, 2026 18:04
@vdusek vdusek marked this pull request as ready for review June 3, 2026 18:04

@Mantisus Mantisus left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@vdusek vdusek requested a review from janbuchar June 4, 2026 10:59
…etries

# Conflicts:
#	tests/unit/_utils/test_sitemap.py
@vdusek vdusek merged commit 76927d7 into master Jun 4, 2026
33 checks passed
@vdusek vdusek deleted the fix/sitemap-fetch-retries branch June 4, 2026 13:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

adhoc Ad-hoc unplanned task added during the sprint. t-tooling Issues with this label are in the ownership of the tooling team. tested Temporary label used only programatically for some analytics.

4 participants