Scrape websites asynchronously with Python 3.8+, Asyncio, & arsenic (aka Selenium for Async).

codingforentrepreneurs Last update: Nov 16, 2023

Supercharged Web Scraping with Asyncio

Web scraping means automatically opening a website and grabbing the data you find important on it. It's fundamental to search engines, data science, automation, machine learning, and much more.

Opening websites and extracting data are only part of what makes web scraping great. The real value is in parsing that data.
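To make the parsing step concrete, here is a minimal sketch that pulls every link out of a page using only the standard library's `html.parser`. The `LinkCollector` class, the `extract_links` helper, and the `SAMPLE_HTML` string are hypothetical names invented for this illustration; they are not part of this project's code, and the sample string merely stands in for a page you have already downloaded.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html):
    """Return the href values found in an HTML string, in document order."""
    parser = LinkCollector()
    parser.feed(html)
    return parser.links


# Hypothetical page content standing in for a downloaded page.
SAMPLE_HTML = """
<html><body>
  <a href="/courses">Courses</a>
  <a href="/blog">Blog</a>
  <p>No link here</p>
</body></html>
"""

links = extract_links(SAMPLE_HTML)
print(links)
```

A dedicated parsing library may be more convenient for real pages; the standard-library parser is used here only to keep the sketch dependency-free.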

This project will cover:

  • Basic web scraping with Python
  • Web scraping with Selenium
  • Sync vs Async
  • Asynchronous web scraping with Asyncio
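The sync vs async topic in the list above can be sketched without touching the network. In this rough illustration, `time.sleep` and `asyncio.sleep` stand in for the time spent waiting on a web page. The function names (`fetch_sync`, `fetch_async`, `scrape_sync`, `scrape_async`), the `example.com` URLs, and the 0.1 second delay are all assumptions made for the example, not code from this project.

```python
import asyncio
import time


def fetch_sync(url, delay=0.1):
    # Simulated blocking request: nothing else runs while we wait.
    time.sleep(delay)
    return f"content of {url}"


async def fetch_async(url, delay=0.1):
    # Simulated non-blocking request: the event loop can run other tasks.
    await asyncio.sleep(delay)
    return f"content of {url}"


def scrape_sync(urls):
    # Pages are fetched one after another.
    return [fetch_sync(u) for u in urls]


async def scrape_async(urls):
    # All fetches are scheduled together; results keep the input order.
    return await asyncio.gather(*(fetch_async(u) for u in urls))


# Placeholder URLs; no real requests are made.
URLS = [f"https://example.com/page/{i}" for i in range(5)]

start = time.perf_counter()
sync_results = scrape_sync(URLS)
sync_elapsed = time.perf_counter() - start

start = time.perf_counter()
async_results = asyncio.run(scrape_async(URLS))
async_elapsed = time.perf_counter() - start

print(f"sync:  {sync_elapsed:.2f}s")
print(f"async: {async_elapsed:.2f}s")
```

The synchronous version waits for each simulated page in turn, while the asynchronous version overlaps the waits, so it should finish in roughly the time of a single request. Inside a Jupyter notebook an event loop is already running, so `await scrape_async(URLS)` in a cell would likely be needed in place of `asyncio.run(...)`.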

Requirements:

Watch the series

To use this code:

1. Clone

git clone https://github.com/codingforentrepreneurs/Supercharged-Web-Scraping-with-Asyncio supercharged

2. Create Virtual Environment

cd supercharged
python3.8 -m venv .

3. Activate virtual environment and install requirements

Mac/Linux:

source bin/activate

Windows:

.\Scripts\activate

If using pipenv, run pipenv shell && pipenv install

4. Run Jupyter

jupyter notebook

or

python -m jupyter notebook

If using pipenv, run pipenv run jupyter notebook