5 Best Python Web Scraping Libraries in 2023
Struggling to find the best Python web scraping library to use? You aren't alone. It can get pretty frustrating when you settle on a scraping library and it fails, whether because it's slow or because it keeps getting detected by anti-bots.
A good Python library for web scraping should be fast, scalable, and capable of crawling any type of web page. In this article, we'll discuss the five best libraries for crawling in Python, cover their pros and cons, and provide a quick example for each to help you understand how they work.
What are the best Python web scraping libraries?
We ran some background tests to verify which Python web scraping libraries can scrape a web page without problems. The best ones are ZenRows, Selenium, Requests, Scrapy and urllib3. Here are some features worth comparing:
| Library | Ease of use | Performance | Dynamic data |
| --- | --- | --- | --- |
| ZenRows | Easy to use. | Fast for static content, moderate for dynamic content; consumes fewer resources than the other libraries. | Yes |
| Selenium | Harder to use than libraries like Requests and urllib3. | Slow, with high resource consumption. | Yes |
| Requests | One of the easiest web scraping libraries to use, but it has fewer capabilities. | Fast, with low resource consumption. | - |
| Scrapy | The hardest to learn of the five. | Fast, with medium resource consumption. | - |
| urllib3 | Similar to Requests but with a lower-level API. | Fast, with low resource consumption. | - |
Let's go into detail and discuss these libraries with some Python web scraping examples. We'll extract the product details from the Vue Storefront demo site with each of them.

1. ZenRows
ZenRows API is a Python web scraping library capable of bypassing some of the biggest scraping obstacles, like anti-bots and CAPTCHAs. Its features include rotating and premium proxies, a headless browser, geo-targeting, and anti-bot bypass.
- ZenRows is easy to use.
- It can easily bypass CAPTCHAs and antibots.
- Smart rotational proxies.
- It can scrape JavaScript-rendered pages.
- It also works with other libraries.
- It's a paid service but it comes with a free trial.
How to scrape a web page with ZenRows
Step 1: Generate the Python code
To get started, create a free ZenRows account and navigate to the dashboard. From the dashboard, select Python and enter the target website's URL.

Since our target web page is dynamically generated, activate the JavaScript rendering option and, from the options shown, select JavaScript instructions. For this example, you need to include the "fill" key, which is a list containing the ID of the search box ("#search") and the word "laundry".

The "wait_for" key makes the script wait for a specific item to appear, in this case the items with a class of "sf-product-card__title". The "wait" parameter is optional, indicating how many milliseconds to wait before retrieving the information.
Step 2: Parse the response
Since ZenRows has limited support for parsing the generated HTML, we'll be using BeautifulSoup, which provides methods like find and find_all that help get elements with specific IDs or classes from the HTML tree.
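To see the difference between the two methods, here's a minimal self-contained sketch on a made-up HTML snippet (the tag and class names are invented for the example):

```python
from bs4 import BeautifulSoup

html = '<div><span class="title">First</span><span class="title">Second</span></div>'
soup = BeautifulSoup(html, 'html.parser')

# find returns only the first matching element...
print(soup.find('span', {'class': 'title'}).text)  # First

# ...while find_all returns a list of every match.
print([s.text for s in soup.find_all('span', {'class': 'title'})])  # ['First', 'Second']
```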
Go ahead and import the library, then create a new BeautifulSoup object by passing it the data extracted from the URL, plus a second parameter, the parser, which can be 'html.parser', 'xml' or 'lxml'. Make a new file called zenrowsTest.py and paste in the code:
```python
from zenrows import ZenRowsClient
from bs4 import BeautifulSoup
import json

client = ZenRowsClient("YOUR_API_KEY")
url = "https://demo.vuestorefront.io/"

# Wait, type "laundry" into the search box, then wait for the results to render.
js_instructions = [
    {"wait": 500},
    {"fill": ["#search", "laundry"]},
    {"wait_for": ".sf-product-card__title"},
]
params = {
    "js_render": "true",
    "js_instructions": json.dumps(js_instructions),
}

response = client.get(url, params=params)

# Parse the rendered HTML and print each product title.
soup = BeautifulSoup(response.text, "html.parser")
for item in soup.find_all("span", {"class": "sf-product-card__title"}):
    print(item.text)
```
Congratulations! You have successfully scraped a web page using ZenRows. Here's what the output looks like:
[Sample] Canvas Laundry Cart
[Sample] Laundry Detergent
2. Selenium
Selenium is a widely used Python scraping library capable of scraping dynamic web content. With this library, you can simulate actions performed on a website, like clicking a button or filling in a form.
- It can scrape dynamic web pages.
- Selenium can be slow.
- It can't get status codes.
How to scrape a web page with Selenium
Step 1: Find the input tag
To scrape a web page using Selenium, you can make use of a WebDriver and then locate the input element (the search box) with the find_element method. After finding the correct input element, write the desired query and hit Enter.
Step 2: Retrieve the span tags
Once you've submitted the search, you can find the span tags of the returned items. Since the server can take a while to return the results, use WebDriverWait to wait for them to appear. Once the items are available, get them by passing their class name to the find_elements method. Here's everything we've just mentioned:
```python
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.chrome.service import Service as ChromeService
from webdriver_manager.chrome import ChromeDriverManager
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support.ui import WebDriverWait

url = "https://demo.vuestorefront.io/"

with webdriver.Chrome(service=ChromeService(ChromeDriverManager().install())) as driver:
    driver.get(url)

    # Type the query into the search box and press Enter.
    search_input = driver.find_element(By.CSS_SELECTOR, "input[type='search']")
    search_input.send_keys("laundry" + Keys.ENTER)

    # Wait up to 3 seconds for the first result to appear.
    WebDriverWait(driver, timeout=3).until(
        lambda d: d.find_element(By.CLASS_NAME, "sf-product-card__title"))

    # Collect and print every product title.
    items = driver.find_elements(By.CLASS_NAME, "sf-product-card__title")
    for item in items:
        print(item.text)
```
After running the code, you should see the names of the two items printed on the console:
[Sample] Canvas Laundry Cart
[Sample] Laundry Detergent
And there you have it!
3. Requests
Requests is a user-friendly Python web scraping library built on top of urllib3. It can fetch a URL directly, without a PoolManager instance, and once you make a GET request, you can access the contents of the web page through the content attribute of the response object.
- It doesn't require PoolManager.
- It's fast.
- It can't scrape interactive or dynamic sites with JavaScript.
How to scrape a web page using Requests
Let's work with a Vue Storefront page that lists kitchen products. There are five items on the page, and each one has a title in a span tag with the class sf-product-card__title.

Step 1: Get the main contents with the GET method
Use this code:
```python
import requests

r = requests.get('https://demo.vuestorefront.io/c/kitchen')
```
The GET method returns a response object, from which you can obtain the status code through the status_code attribute (in this case, it returns code 200) and the HTML data through the content attribute. The response object is saved in the variable r.
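In practice, you may want to guard the parsing step on that status code. Here's a small self-contained sketch of such a check; the guard itself is our addition, not part of the original walkthrough:

```python
import requests

r = requests.get('https://demo.vuestorefront.io/c/kitchen')

# Only parse the page if the request succeeded.
if r.status_code == 200:
    html = r.content
else:
    print(f"Request failed with status {r.status_code}")
```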
Step 2: Extract the specific information with BeautifulSoup
Extract the span tags with the class sf-product-card__title by using the find_all method on the BeautifulSoup object:
```python
from bs4 import BeautifulSoup

soup = BeautifulSoup(r.content, 'html.parser')
for item in soup.find_all('span', {'class': 'sf-product-card__title'}):
    print(item.text)
```
This will return a list of all the span tags with that class found in the document and, using a simple for loop, you can print the desired information on screen. Let's make a new file called requestsTest.py and write the following code:
```python
import requests
from bs4 import BeautifulSoup

# Fetch the kitchen category page and print each product title.
r = requests.get('https://demo.vuestorefront.io/c/kitchen')
soup = BeautifulSoup(r.content, 'html.parser')
for item in soup.find_all('span', {'class': 'sf-product-card__title'}):
    print(item.text)
```
Congratulations! You've successfully used the Requests Python library for web scraping. Your output should look like this:
[Sample] Tiered Wire Basket
[Sample] Oak Cheese Grater
[Sample] 1 L Le Parfait Jar
[Sample] Chemex Coffeemaker 3 Cup
[Sample] Able Brewing System
4. Scrapy
Scrapy is a high-level framework for scraping data from highly complex websites. With Scrapy, it's possible to bypass CAPTCHAs using predefined functions or external libraries. You can write a simple Scrapy crawler by defining the spider as a Python class, but it's not very user-friendly compared to other Python scraping libraries.
Although the learning curve for this library is steep, you can do a lot with it and it's highly efficient in performing crawling tasks.
- General framework for scraping purposes.
- It doesn't require BeautifulSoup.
- Steep learning curve.
- Scrapy can't scrape dynamic web pages.
How to scrape a web page using Scrapy
Step 1: Create a Spider class
Make a new class named kitchenSpider that inherits from scrapy.Spider. Inside the class, set name to mySpider and start_urls to a list of the URLs to scrape.
```python
import scrapy

class kitchenSpider(scrapy.Spider):
    name = 'mySpider'
    start_urls = ['https://demo.vuestorefront.io/c/kitchen']
```
Step 2: Define the parse method
The parse method takes a response parameter, and you can retrieve each item with the css method on the response object. The css method takes the item's class name as its parameter:

```python
response.css('.sf-product-card__title')
```
To retrieve all the items with that class, make a for loop and print the contents with the xpath method:

```python
for item in response.css('.sf-product-card__title'):
    print(item.xpath('string(.)').get())
```
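As a side note, the same text can be pulled inside parse with Scrapy's ::text CSS pseudo-element instead of XPath; for these spans, whose text sits directly inside the tag, the output is the same:

```python
# Drop-in alternative to the XPath call above, inside parse():
for item in response.css('.sf-product-card__title::text'):
    print(item.get())
```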
Make a new file called scrapyTest.py using the code below:
```python
import scrapy

class kitchenSpider(scrapy.Spider):
    name = 'mySpider'
    start_urls = ['https://demo.vuestorefront.io/c/kitchen']

    def parse(self, response):
        # Print the text content of every product title on the page.
        for item in response.css('.sf-product-card__title'):
            print(item.xpath('string(.)').get())
```
Run the spider by executing the following command in the terminal, and you should see the list of items printed on screen:

```bash
scrapy runspider scrapyTest.py
```
[Sample] Tiered Wire Basket
[Sample] Oak Cheese Grater
[Sample] 1 L Le Parfait Jar
[Sample] Chemex Coffeemaker 3 Cup
[Sample] Able Brewing System
That's it!
5. urllib3
urllib3 is a low-level HTTP client that many other Python web scraping libraries, including Requests, build on. It works with a PoolManager instance, a class that manages connection pooling and thread safety.
- Handles concurrency with PoolManager (see the sketch after this list).
- Complicated syntax compared to other libraries like Requests.
- urllib3 can't extract dynamic data.
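To illustrate that first point, here's a minimal sketch of sharing one PoolManager across threads; the pool size and the repeated URL are arbitrary choices for the example:

```python
import concurrent.futures
import urllib3

# A single PoolManager is thread-safe and reuses connections per host.
http = urllib3.PoolManager(maxsize=4)

urls = ['https://demo.vuestorefront.io/c/kitchen'] * 4

def fetch(url):
    # Return just the status code to keep the example short.
    return http.request('GET', url).status

with concurrent.futures.ThreadPoolExecutor(max_workers=4) as pool:
    print(list(pool.map(fetch, urls)))
```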
How to scrape a web page using urllib3
Step 1: Create a PoolManager instance
Import the urllib3 library, then create a PoolManager instance and save it to a variable called http:

```python
import urllib3

http = urllib3.PoolManager()
```
Once a PoolManager instance is created, you can make an HTTP GET request by using the request() method on it.
Step 2: Make a GET request
Use the request method on the PoolManager instance. You can give it two parameters to make a simple GET request: the first is the string 'GET' and the second is the URL you want to scrape:

```python
r = http.request('GET', 'https://demo.vuestorefront.io/c/kitchen')
```
Step 3: Extract the data from the response object
The request returns an HTTPResponse object, from which you can obtain information such as the status code and the response body. Let's get the body through the data attribute on the response object and parse it with BeautifulSoup:

```python
soup = BeautifulSoup(r.data, 'html.parser')
```
To extract the data, use a for loop with the find_all method and the name of the item's class:

```python
for item in soup.find_all('span', {'class': 'sf-product-card__title'}):
    print(item.text)
```
Create a new file called urllib3Test.py with the following code:
```python
import urllib3
from bs4 import BeautifulSoup

# Create a PoolManager and fetch the kitchen category page.
http = urllib3.PoolManager()
r = http.request('GET', 'https://demo.vuestorefront.io/c/kitchen')

# Parse the HTML and print each product title.
soup = BeautifulSoup(r.data, 'html.parser')
for item in soup.find_all('span', {'class': 'sf-product-card__title'}):
    print(item.text)
```
And that's it! You have successfully scraped the data from the kitchen category on the Vue Storefront using the urllib3 Python web scraping library.
[Sample] Tiered Wire Basket
[Sample] Oak Cheese Grater
[Sample] 1 L Le Parfait Jar
[Sample] Chemex Coffeemaker 3 Cup
[Sample] Able Brewing System
Conclusion
To recap, the five best Python web scraping libraries are:
- ZenRows.
- Selenium.
- Requests.
- Scrapy.
- urllib3.
A common problem with Python web scraping libraries is their inability to avoid bot detection while scraping a web page, which makes scraping difficult and stressful. ZenRows solves this problem with a single API call. Take advantage of the current free trial and get 1,000 API credits for free.
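For reference, that single API call can also be made without the SDK. This sketch assumes the REST endpoint and parameter names (apikey, url) from ZenRows' public documentation, so verify them against the current docs before relying on it:

```python
import requests

# Endpoint and parameter names assumed from ZenRows' docs; verify before use.
API_KEY = 'YOUR_API_KEY'
response = requests.get(
    'https://api.zenrows.com/v1/',
    params={'apikey': API_KEY, 'url': 'https://demo.vuestorefront.io/'},
)
print(response.status_code)
```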
Frequently Asked Questions
Why are Python libraries for web scraping important?
Python is one of the most popular languages for building web scrapers, since its classes and objects are significantly easier to work with than those of most other languages. However, building a custom crawler from scratch in Python is difficult, especially when you have to scrape many custom websites with anti-bot measures in place. Python web crawling libraries cut down that lengthy process and make it easy to scrape a web page.
Which libraries are used for web scraping in Python?
There are many Python web scraping libraries to choose from. The most popular ones are ZenRows, Selenium, Requests, Scrapy and urllib3.
What is the best Python web scraping library?
The best Python web scraping library to use is ZenRows. Of course, other libraries can get the job done, but the time and effort spent on learning these tools and the possibility of getting your scraper blocked are headaches that can be avoided easily with ZenRows.
What is the most popular Python library for web scraping?
The Requests library is one of the most used web scraping libraries since it helps make basic requests for further analysis.