Tutorials and notes on extracting data from the web with Python: requests, BeautifulSoup, Playwright, and the patterns that make scrapers survive.
Recent posts
Apr 14, 2026
Playwright Stealth and Single-Page Apps
Mar 24, 2026
Deploying a Scraper: Cron, Docker, Lambda
Mar 3, 2026
CAPTCHAs: What Works, What's Legal, What's Not
Feb 3, 2026
Async Scraping with httpx and asyncio
Jan 13, 2026
Scrapy Pipelines and Middlewares Explained
Dec 30, 2025
Scrapy Basics: When to Upgrade from requests
Dec 9, 2025
Storing Scraped Data: CSV, SQLite, Postgres
Nov 18, 2025
Rate Limiting and Being a Polite Scraper
Oct 21, 2025
User-Agents and Browser Fingerprinting
Sep 30, 2025
Proxies and Rotating IPs: When You Actually Need Them
Sep 9, 2025
Headers, Cookies, and Sessions in requests
Aug 26, 2025
Find the Hidden JSON API Behind Any Site
Aug 5, 2025
Five Pagination Patterns and How to Scrape Them
Jul 15, 2025
Playwright for JavaScript-Rendered Pages
Jun 17, 2025
Scraping with lxml and XPath
May 27, 2025
BeautifulSoup Selectors: A Practical Deep Dive
May 6, 2025
Robust HTTP: Errors, Retries, and Exponential Backoff
Apr 22, 2025
Getting Started with requests and BeautifulSoup
Apr 1, 2025
Welcome to Web Scraping Python