kawsarlog/README.md

Hi, I'm Kawsar 👋

Web Scraping & Data Automation Engineer | I build Python pipelines that pull clean data from sites most tools can't touch

🖊️ Love writing code
📝 Website: https://kawsarlog.com/
💬 Ask me anything; I'm happy to help :)


About

I've extracted 15M+ records for clients across real estate, healthcare, and e-commerce, without getting blocked once.

For 9+ years I've built Python scrapers and automation that pull clean, structured data from the platforms most tools choke on: Zillow, Realtor.com, LoopNet, Crexi, Amazon, and more. If it sits behind a login, a CAPTCHA, or a messy private API, I've probably already cracked it.

What I build:

  • Custom scrapers: Selenium, Playwright, curl_cffi, Apify actors
  • API reverse-engineering and GraphQL extraction
  • Async, proxy-rotating pipelines that run in the cloud
  • Verified B2B & real estate lead lists, plus AI-ready datasets

Most people find me after another scraper broke, got blocked, or buckled the moment they tried to scale it. That's the work I like.

The track record: Fiverr Pro (top 1%) · 200+ projects on Upwork · 1,000+ hours logged. Currently founder of bigiByte, part of the TocoLabs studio network.


Stack

Python Selenium Playwright Scrapy Pandas FastAPI Flask PostgreSQL SQLite MongoDB Docker AWS Linux Git

Scraping & automation: Selenium · Playwright · curl_cffi · Scrapy · BeautifulSoup · Apify · aiohttp · rotating proxies

Data & backend: Python · Pandas · PostgreSQL · SQLite · MongoDB · FastAPI · Flask · REST · GraphQL

Infra: Docker · AWS (Lambda, API Gateway) · Linux


📌 What I work on

  • Reverse-engineering platform APIs (HouseSigma, LoopNet, Crexi, Zillow) for clean, scalable data access
  • Apify actors for real estate, e-commerce, and lead-gen
  • Async pipelines processing 100K+ records per run with proxy rotation and resume capability
  • LLM-powered extraction and enrichment on top of raw scraped data
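
To make "proxy rotation and resume capability" concrete, here is a minimal sketch of that pattern using only the standard library. It is an illustration under stated assumptions, not code from any client project: `PROXIES`, `fetch_record`, and `run_pipeline` are hypothetical names, the proxy URLs are placeholders, and the fetch is a stub standing in for a real HTTP call (for example via aiohttp).

```python
import asyncio
import itertools
import json
import tempfile
from pathlib import Path

# Hypothetical proxy pool; real endpoints would come from a proxy provider.
PROXIES = ["http://proxy-a:8000", "http://proxy-b:8000", "http://proxy-c:8000"]


async def fetch_record(record_id: int, proxy: str) -> dict:
    """Stub for a real HTTP request routed through `proxy` (assumption)."""
    await asyncio.sleep(0)  # yield to the event loop, as a network call would
    return {"id": record_id, "proxy": proxy}


def load_done(checkpoint: Path) -> set[int]:
    """Return ids finished in an earlier run, or an empty set on first run."""
    if checkpoint.exists():
        return set(json.loads(checkpoint.read_text()))
    return set()


async def run_pipeline(ids: list[int], checkpoint: Path, batch_size: int = 50) -> list[dict]:
    """Fetch ids not yet checkpointed, rotating proxies round-robin."""
    done = load_done(checkpoint)
    pending = [i for i in ids if i not in done]
    proxy_cycle = itertools.cycle(PROXIES)
    results: list[dict] = []

    for start in range(0, len(pending), batch_size):
        batch = pending[start:start + batch_size]
        # Each batch runs concurrently; batch_size caps the concurrency.
        batch_results = await asyncio.gather(
            *(fetch_record(i, next(proxy_cycle)) for i in batch)
        )
        results.extend(batch_results)
        # Persist progress after every batch so a crash loses at most one batch.
        done.update(r["id"] for r in batch_results)
        checkpoint.write_text(json.dumps(sorted(done)))

    return results


if __name__ == "__main__":
    cp = Path(tempfile.mkdtemp()) / "done.json"
    first = asyncio.run(run_pipeline([1, 2, 3, 4], cp, batch_size=2))
    second = asyncio.run(run_pipeline([1, 2, 3, 4, 5], cp, batch_size=2))
    print(len(first), "fetched on first run;", len(second), "fetched on resume")
```

Checkpointing per batch rather than per record keeps disk writes cheap while still letting a restarted run skip everything already saved. A production version would also need retries, error handling, and per-proxy health checks, none of which this sketch attempts.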

Problems I get hired to solve

| The situation | What I do |
| --- | --- |
| "The site blocks our scraper" | Anti-bot bypass: Cloudflare, CAPTCHA, fingerprinting |
| "We need 100K rows, not 100" | Async, proxy-rotating pipelines that scale and resume |
| "The data's behind a private API" | Reverse-engineer it, pull it clean |
| "We need leads, not raw HTML" | Verified B2B & real estate lists, deduped and structured |
| "Make it run on its own" | Scheduled cloud jobs on AWS Lambda + API Gateway |

Connect

Got a scraping job another tool couldn't handle? Let's talk →

Pinned

  1. books-to-scrape-web-scraper (Public)

    This repository contains a Python web scraper for extracting book data from the Books to Scrape website. The scraper gathers information such as titles, prices, availability, ratings, and thumbnail…

    Python · 7 stars · 2 forks

  2. amazon-reviews-extraction (Public)

    🛍️📊 Effortlessly extract Amazon reviews using Python with the amazon-reviews-extraction script. This script makes use of popular Python modules like requests, pandas, bs4, and lxml to scrape and pa…

    Jupyter Notebook · 26 stars · 9 forks

  3. Excel-Image-Extractor (Public)

    💡 You can easily extract photos 📸 from an Excel 📊 cell using this Python script. But wait, there's more! These photos can also be saved with names created by given cells. To discover how to make th…

    Python · 12 stars