Skip to content

fix: retry unprocessed requests in RequestQueue.add_request#1976

Merged
vdusek merged 2 commits into
masterfrom
fix/add-request-retries-unprocessed
Jun 19, 2026
Merged

fix: retry unprocessed requests in RequestQueue.add_request#1976
vdusek merged 2 commits into
masterfrom
fix/add-request-retries-unprocessed

Conversation

@vdusek

@vdusek vdusek commented Jun 18, 2026

Copy link
Copy Markdown
Collaborator

Description

RequestQueue.add_request (singular) made a single best-effort add_batch_of_requests([request]) call and, if the storage client returned the request as unprocessed, just logged a warning and returned None — silently dropping the request. add_requests (plural) already retries unprocessed requests via _process_batch, so the two adds had inconsistent durability against best-effort backends (e.g. the Apify platform's batch_add_requests endpoint, which may legitimately return a request as unprocessed).

This makes a single add as durable as a batched one:

  • _process_batch now returns an AddRequestsResponse aggregating the requests processed across all attempts plus any still unprocessed after the retries are exhausted (it previously returned None).
  • add_request routes its single request through _process_batch and returns processed_requests[0], returning None only after retries are exhausted.

The ProcessedRequest | None return contract and the blocking semantics are preserved (for one request, add_requests already runs the first batch synchronously), and the retry is safe because adds are idempotent by unique_key.

Note: on the failure path add_request is now blocking-with-backoff (worst case a few seconds of retry sleeps before returning None), where it previously returned immediately. That is the intended trade — durability over a fast silent drop.

Issues

Testing

  • Added test_add_request_retries_unprocessed (a request reported unprocessed on the first attempt is retried and survives) and test_add_request_returns_none_after_exhausting_retries (stays unprocessed across all attempts → returns None, 1 initial + 5 retries). Both fail on master and pass with this change.
  • uv run poe lint, uv run poe type-check, and the tests/unit/storages/test_request_queue.py suite all pass.

Checklist

  • CI passed
@github-actions github-actions Bot added this to the 143rd sprint - Tooling team milestone Jun 18, 2026
@github-actions github-actions Bot added t-tooling Issues with this label are in the ownership of the tooling team. tested Temporary label used only programatically for some analytics. labels Jun 18, 2026
@codecov

codecov Bot commented Jun 18, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 93.02%. Comparing base (970f93b) to head (5d551c2).
⚠️ Report is 1 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1976      +/-   ##
==========================================
+ Coverage   92.94%   93.02%   +0.07%     
==========================================
  Files         167      167              
  Lines       11737    11740       +3     
==========================================
+ Hits        10909    10921      +12     
+ Misses        828      819       -9     
Flag Coverage Δ
unit 93.02% <100.00%> (+0.07%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.
@vdusek vdusek requested a review from Pijukatel June 18, 2026 14:47
@vdusek vdusek marked this pull request as ready for review June 18, 2026 14:47
@vdusek vdusek merged commit cfee910 into master Jun 19, 2026
34 checks passed
@vdusek vdusek deleted the fix/add-request-retries-unprocessed branch June 19, 2026 09:58
Mantisus pushed a commit to Mantisus/crawlee-python that referenced this pull request Jun 30, 2026
### Description

`RequestQueue.add_request` (singular) made a single best-effort
`add_batch_of_requests([request])` call and, if the storage client
returned the request as *unprocessed*, just logged a warning and
returned `None` — silently dropping the request. `add_requests` (plural)
already retries unprocessed requests via `_process_batch`, so the two
adds had inconsistent durability against best-effort backends (e.g. the
Apify platform's `batch_add_requests` endpoint, which may legitimately
return a request as unprocessed).

This makes a single add as durable as a batched one:

- `_process_batch` now returns an `AddRequestsResponse` aggregating the
requests processed across all attempts plus any still unprocessed after
the retries are exhausted (it previously returned `None`).
- `add_request` routes its single request through `_process_batch` and
returns `processed_requests[0]`, returning `None` only after retries are
exhausted.

The `ProcessedRequest | None` return contract and the blocking semantics
are preserved (for one request, `add_requests` already runs the first
batch synchronously), and the retry is safe because adds are idempotent
by `unique_key`.

Note: on the failure path `add_request` is now blocking-with-backoff
(worst case a few seconds of retry sleeps before returning `None`),
where it previously returned immediately. That is the intended trade —
durability over a fast silent drop.

### Issues

- Closes: apify#1975
- Surfaced as an intermittent e2e flake in apify/apify-sdk-python#1000

### Testing

- Added `test_add_request_retries_unprocessed` (a request reported
unprocessed on the first attempt is retried and survives) and
`test_add_request_returns_none_after_exhausting_retries` (stays
unprocessed across all attempts → returns `None`, 1 initial + 5
retries). Both fail on `master` and pass with this change.
- `uv run poe lint`, `uv run poe type-check`, and the
`tests/unit/storages/test_request_queue.py` suite all pass.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

t-tooling Issues with this label are in the ownership of the tooling team. tested Temporary label used only programatically for some analytics.

3 participants