🕷️ An adaptive Web Scraping framework that handles everything from a single request to a full-scale crawl!
Updated Jun 29, 2026 - Python
Scrapy, a fast high-level web crawling & scraping framework for Python.
Crawlee: a web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with Parsel, BeautifulSoup, Playwright, and raw HTTP. Supports both headful and headless modes, with proxy rotation.
Web Scraping Framework
📅🇨🇳 Chinese statutory holiday data, automatically scraped daily from State Council announcements
🤖 Scrape data from HTML websites automatically by just providing examples
Scalable Python web scraping scripts for 40+ popular domains
The New (auto rotate) Proxy [Finder | Checker | Server]. HTTP(S) & SOCKS 🎭
HTTP API for Scrapy spiders
Official repository for "Craw4LLM: Efficient Web Crawling for LLM Pretraining"
ISP Data Pollution to Protect Private Browsing History with Obfuscation
Scrapy extension for monitoring spider execution.
🕷 Automatically detect changes made to the official Telegram sites, clients and servers.