# scraping

Here are 378 public repositories matching this topic...

mendableai / firecrawl

🔥 Turn entire websites into LLM-ready markdown or structured data. Scrape, crawl and extract with a single API.

markdown crawler data scraper ai html-to-markdown web-crawler scraping webscraping rag llm ai-scraping

Updated Jul 3, 2025
TypeScript

apify / crawlee

Crawlee—A web scraping and browser automation library for Node.js to build reliable crawlers. In JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with Puppeteer, Playwright, Cheerio, JSDOM, and raw HTTP. Both headful and headless mode. With proxy rotation.

nodejs javascript npm crawler scraper automation typescript web-crawler headless scraping crawling web-scraping web-crawling headless-chrome apify puppeteer playwright

Updated Jul 3, 2025
TypeScript

jaypyles / Scraperr

Self-hosted webscraper.

python docker kubernetes opensource helm scraping webscraper web-scraper self-hosted web-scraping web-scrapers webscraping playwright

Updated Jun 18, 2025
TypeScript

apify / fingerprint-suite

Browser fingerprinting tools for anonymizing your scrapers. Developed by Apify.

typescript scraping fingerprinting puppeteer playwright

Updated Jul 3, 2025
TypeScript

MiddleSchoolStudent / BotBrowser

🤖 Bypasses Cloudflare, Shape, PerimeterX, Datadome, Akamai, Kasada, hCaptcha, FunCaptcha and reCAPTCHA with unmatched reliability - powered by a modified Chromium core

recaptcha scraping cloudflare akamai perimeterx incapsula puppeteer antibots datadome hcaptcha funcaptcha kasada shapesecurity adscore

Updated Jun 26, 2025
TypeScript

ulixee / secret-agent

The web scraper that's nearly impossible to block - now called @ulixee/hero

browser proxy scraping devtools mitm chromium stealth automated mitmproxy puppeteer playwright secretagent

Updated Mar 7, 2023
TypeScript

josephlimtech / linkedin-profile-scraper-api

🕵️‍♂️ LinkedIn profile scraper returning structured profile data in JSON.

Updated Apr 5, 2024
TypeScript

adrianhajdin / pricewise

Dive into web scraping and build a Next.js 13 eCommerce price tracker within a single video that teaches you data scraping, cron jobs, sending emails, deployment, and more.

scraping webscraping

Updated Jul 6, 2024
TypeScript

any4ai / AnyCrawl

AnyCrawl 🚀: A Node.js/TypeScript crawler that turns websites into LLM-ready data and extracts structured SERP results from Google/Bing/Baidu/etc. Native multi-threading for bulk processing.

data html-to-markdown scraping webscraper crawl scrape serp rag aitools ai-scraping

Updated Jul 3, 2025
TypeScript

devflowinc / firecrawl-simple

➖ Stripped down, stable version of firecrawl optimized for self-hosting and ease of contribution. Billing logic and AI features are completely removed. Crawl and convert any website into LLM-ready markdown.

search markdown crawler data scraper ai html-to-markdown web-crawler scraping embeddings webscraping rag llm ai-scraping

Updated May 23, 2025
TypeScript

baptisteArno / tinking

🧶 Extract data from any website without code, just clicks.

scraping scrapping scrapper scraping-websites harvesting puppeteer

Updated Apr 15, 2021
TypeScript

PawanOsman / GoogleBard

GoogleBard - A reverse-engineered API for the Google Bard chatbot, for Node.js

api google ai reverse-engineering scraping prompt assistant assistant-chat-bots chatgpt google-bard

Updated Jan 11, 2024
TypeScript

zyachel / libremdb

A free & open source IMDb front-end.

sass front-end privacy typescript scraping foss imdb alternative-frontends

Updated Jun 28, 2025
TypeScript

drudge / n8n-nodes-puppeteer

n8n node for browser automation using Puppeteer

pdf screenshot screenshots browser script scraping proxy-server chromium scrape puppeteer n8n n8n-nodes stealth-mode

Updated Jun 30, 2025
TypeScript

floriandiud / facebook-group-members-scraper

Facebook Group Members Extractor. Download Facebook group members in CSV.

facebook csv scraping growth growth-hacking facebook-scraper facebook-data-extract facebook-scraping facebook-data-scraper

Updated Sep 13, 2024
TypeScript

shihabmridha / educative.io-downloader

Free Palestine. 📖 This tool downloads courses from educative.io for offline use. It logs in with your credentials and downloads the course.

nodejs pdf typescript scraping hacktoberfest puppeteer educativeio

Updated Apr 13, 2024
TypeScript

bitmakerla / estela

estela, an elastic web scraping cluster 🕸

react python docker kubernetes scraper django scraping crawling requests web-scraping scrapy hacktoberfest python-requests scrapyd scrapy-visualization webscraping-python

Updated May 28, 2025
TypeScript

Anish-Agnihotri / tweetdrop

Generate dispersable airdrops from Twitter threads.

twitter crypto twitter-api ethereum scraping tweet airdrop

Updated Jan 3, 2022
TypeScript

unblocked-web / double-agent

A test suite of common scraper detection techniques. See how detectable your scraper stack is.

scraping crawling scrapy puppeteer secret-agent

Updated Oct 31, 2022
TypeScript

mtwn105 / decipher-research-agent

Turn topics, links, and files into AI-generated research notebooks — summarize, explore, and ask anything.

agent ai mcp scraping ml artificial-intelligence gemini web-scraping openai qdrant llm vector-db crewai notebooklm agentic-ai model-context-protocol

Updated Jun 6, 2025
TypeScript
