feat: use SessionPool in BasicCrawler #128

Merged
janbuchar merged 12 commits into master from use-session-pool-in-basic-crawler on May 2, 2024

@janbuchar janbuchar commented Apr 23, 2024

Collaborator
  • closes Integrate SessionPool into BasicCrawler #110

  • BasicCrawler now uses SessionPool to supply a session to the crawling context

  • there is a separate retry mechanism for session errors (when we get blocked)

  • cookies from HTTP responses are persisted in the respective sessions
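The three points above can be illustrated with a minimal, self-contained sketch. It is not the code from this PR and does not use the crawlee API: `Session`, `SessionPool`, `Response`, `SessionError`, `handle_request`, `fake_fetch`, the `max_session_rotations` parameter, and the set of "blocked" status codes are all hypothetical names and assumptions chosen for illustration. Ordinary (non-session) request retries are left out to keep the example short.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from itertools import count
from typing import Callable


class SessionError(Exception):
    """Raised when every session that was tried appears to be blocked."""


@dataclass
class Session:
    id: str
    cookies: dict[str, str] = field(default_factory=dict)
    usable: bool = True


@dataclass
class Response:
    status: int
    cookies: dict[str, str] = field(default_factory=dict)


class SessionPool:
    """Hands out a usable session, creating a new one when none is left."""

    def __init__(self) -> None:
        self._ids = count(1)
        self._sessions: list[Session] = []

    def get_session(self) -> Session:
        for session in self._sessions:
            if session.usable:
                return session
        session = Session(id=f"session-{next(self._ids)}")
        self._sessions.append(session)
        return session


# Assumption: these status codes are treated as a sign of being blocked.
BLOCKED_STATUS_CODES = frozenset({401, 403, 429})


def handle_request(
    url: str,
    pool: SessionPool,
    fetch: Callable[[str, Session], Response],
    max_session_rotations: int = 3,
) -> tuple[Response, Session]:
    """Fetch `url` with a pooled session, rotating sessions when blocked."""
    for _ in range(max_session_rotations + 1):
        session = pool.get_session()  # the session goes into the "context"
        response = fetch(url, session)
        if response.status in BLOCKED_STATUS_CODES:
            session.usable = False  # retire the session and try another one
            continue
        # Persist cookies from the response in the session that was used.
        session.cookies.update(response.cookies)
        return response, session
    raise SessionError(f"all sessions were blocked for {url}")


def fake_fetch(url: str, session: Session) -> Response:
    """Stand-in for an HTTP client: blocks the first session only."""
    if session.id == "session-1":
        return Response(status=403)
    return Response(status=200, cookies={"token": "abc"})
```

Calling `handle_request("https://example.com", SessionPool(), fake_fetch)` first tries one session, sees the blocked status, retires that session, and succeeds with a fresh one whose cookie jar then holds the cookie from the response. Keeping the rotation budget separate from any ordinary retry counter means that getting blocked does not consume the retries reserved for other failures.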

@janbuchar janbuchar requested a review from vdusek April 23, 2024 14:59
@github-actions github-actions Bot added this to the 88th sprint - Tooling team milestone Apr 23, 2024
@github-actions github-actions Bot added t-tooling Issues with this label are in the ownership of the tooling team. tested Temporary label used only programatically for some analytics. labels Apr 23, 2024
@vdusek vdusek removed their request for review April 23, 2024 16:02
@janbuchar janbuchar force-pushed the use-session-pool-in-basic-crawler branch from 27403d4 to 0229ad8 Compare April 26, 2024 09:20
@janbuchar janbuchar requested a review from vdusek May 2, 2024 08:32
@janbuchar janbuchar marked this pull request as ready for review May 2, 2024 08:33

@vdusek vdusek left a comment

Collaborator

Looks good, just a few questions

Comment thread src/crawlee/basic_crawler/basic_crawler.py
Comment thread src/crawlee/basic_crawler/basic_crawler.py
Comment thread src/crawlee/basic_crawler/basic_crawler.py
Comment thread src/crawlee/sessions/session.py

@vdusek vdusek left a comment

Collaborator

LGTM

@janbuchar janbuchar merged commit 9fc4648 into master May 2, 2024
@janbuchar janbuchar deleted the use-session-pool-in-basic-crawler branch May 2, 2024 11:56

2 participants