Screenshots are a convenient way to capture website data, especially when web scraping. Whether you want to track e-commerce product prices or visually verify that your script is rendering the right data, Python provides different options for taking screenshots.
In this tutorial, you'll learn how to take three types of screenshots while web scraping with Python:
- A screenshot of the visible part of the page
- A full-page screenshot
- A screenshot of a specific element
We'll also show you how to avoid getting blocked in the process.
Let's go!
How to Take a Screenshot With Python and Selenium?
Scraping modern websites usually means dealing with dynamic content. On many pages, the data you want only appears after the browser executes JavaScript.
That's why you need a browser automation Python library, such as Selenium. Selenium allows you to render web pages just like a regular browser, making it an excellent choice for taking screenshots while web scraping.
With Selenium, you can capture the part of the page immediately visible in your browser window, the full page, and specific elements.
Before you begin, ensure you have the Selenium library installed. You can do so using the following command.
pip3 install selenium
Also, here's a quick Selenium web scraping refresher.
from selenium import webdriver
# set up options to configure Chrome
options = webdriver.ChromeOptions()
# run in headless mode (no GUI)
options.add_argument("--headless=new")
# set window size
options.add_argument("--window-size=1920,1080")
# initialize the WebDriver with the specified options
driver = webdriver.Chrome(options=options)
# navigate to the target website
driver.get("https://www.scrapingcourse.com/ecommerce/product/abominable-hoodie/")
# retrieve the page title
title = driver.title
# print the title to verify the correct page has been loaded
print(title)
# clean up and close the browser
driver.quit()
This code navigates to a target web page and retrieves its page title.
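One detail the refresher leaves implicit: if any step before driver.quit() raises an exception, the browser process stays open. Below is a minimal sketch of one way to guard against that. It assumes two hypothetical helpers, managed_driver and chrome_factory, neither of which is part of Selenium; the first wraps any driver factory in try/finally so quit() always runs.

```python
from contextlib import contextmanager


@contextmanager
def managed_driver(factory):
    """Yield a driver built by `factory` and always call quit() afterwards.

    `managed_driver` is a hypothetical helper, not a Selenium API.
    """
    driver = factory()
    try:
        yield driver
    finally:
        # runs even if the body of the `with` block raises
        driver.quit()


def chrome_factory():
    """Build a headless Chrome configured like the refresher script."""
    from selenium import webdriver  # imported lazily so the helper stays generic

    options = webdriver.ChromeOptions()
    options.add_argument("--headless=new")
    options.add_argument("--window-size=1920,1080")
    return webdriver.Chrome(options=options)


# usage (requires Selenium and Chrome to be installed):
# with managed_driver(chrome_factory) as driver:
#     driver.get("https://www.scrapingcourse.com/ecommerce/product/abominable-hoodie/")
#     print(driver.title)
```

The scripts in the rest of this tutorial keep the plain driver.quit() call for brevity, but the same wrapper would work with any of them.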
For this tutorial, we'll use a demo e-commerce website to illustrate each option of taking screenshots.
Without further ado, let's dive in.
Option 1: Take a Screenshot of the Visible Part of the Screen
Sometimes, you only need to capture the part that's immediately visible (also called the viewport) when you navigate to a web page. For example, the image below shows your current view of our target page.
To get the viewport screenshot, use Selenium's save_screenshot() method.
This method takes a single argument, a string that defines the file name and path where you'll save the screenshot. For example, if you want to save the image file as viewport_screenshot.png in your project directory, you'd pass that file name as the argument, like in the code snippet below.
# take a screenshot and save it to a file
driver.save_screenshot("viewport_screenshot.png")
Update the initial Selenium script with this code, and you'll have the following.
from selenium import webdriver
# set up options to configure Chrome
options = webdriver.ChromeOptions()
# run in headless mode (no GUI)
options.add_argument("--headless=new")
# set window size
options.add_argument("--window-size=1920,1080")
# initialize the WebDriver with the specified options
driver = webdriver.Chrome(options=options)
# navigate to the target website
driver.get("https://www.scrapingcourse.com/ecommerce/product/abominable-hoodie/")
print("Taking screenshot...")
# take a screenshot and save it to a file
driver.save_screenshot("viewport_screenshot.png")
print("Screenshot taken successfully.")
# clean up and close the browser
driver.quit()
Run it, and you'll get the same image above.
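The script above prints a success message regardless of whether the file was actually written. save_screenshot() returns True on success and False when the file could not be written, so it's worth checking. Here's a minimal sketch of such a check. It assumes a hypothetical helper named save_viewport (not part of Selenium) that also creates the output folder if it's missing.

```python
from pathlib import Path


def save_viewport(driver, path):
    """Save a viewport screenshot and fail loudly if the file was not written.

    `save_viewport` is a hypothetical helper; `driver` is any Selenium WebDriver.
    """
    target = Path(path)
    if target.suffix.lower() != ".png":
        raise ValueError("Selenium writes PNG data, so use a .png file name")
    # create the output folder if it does not exist yet
    target.parent.mkdir(parents=True, exist_ok=True)
    # save_screenshot() returns False when the file could not be written
    if not driver.save_screenshot(str(target)):
        raise OSError(f"could not write screenshot to {target}")
    return target
```

You could call it as save_viewport(driver, "screenshots/viewport_screenshot.png") in place of the bare driver.save_screenshot() call.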
Option 2: Grab a Full-Page Screenshot
A full-page screenshot allows you to capture the content on the entire page, not just the visible portion. This can be useful when dealing with long-scrolling pages.
Here's the full-page screenshot of the target page.
Selenium's save_screenshot() method is limited by the viewport. Thus, grabbing a full-page screenshot requires additional steps or browser-specific features.
Some browsers, such as Firefox, natively support full-page screenshots using the get_full_page_screenshot_as_file() method. Others, such as Chrome, don't offer this method. However, you can still capture the full page by resizing the window to match the page's actual dimensions before taking the screenshot.
To achieve this, execute JavaScript to get the full page dimensions. Then, set the window size to those dimensions. You must run Chrome in headless mode for this to work.
# get the full page dimensions
width = driver.execute_script("return document.body.scrollWidth")
height = driver.execute_script("return document.body.scrollHeight")
# set window size to page's dimension
driver.set_window_size(width, height)
Update the initial script with the snippet above to get the complete code.
from selenium import webdriver
# set up options to configure Chrome
options = webdriver.ChromeOptions()
# run in headless mode (no GUI)
options.add_argument("--headless=new")
# set window size
options.add_argument("--window-size=1920,1080")
# initialize the WebDriver with the specified options
driver = webdriver.Chrome(options=options)
# navigate to the target website
driver.get("https://www.scrapingcourse.com/ecommerce/product/abominable-hoodie/")
# get the full page dimensions
width = driver.execute_script("return document.body.scrollWidth")
height = driver.execute_script("return document.body.scrollHeight")
# set window size to page's dimension
driver.set_window_size(width, height)
print("Taking screenshot...")
# take a screenshot and save it to a file
driver.save_screenshot("full_page_screenshot.png")
print("Screenshot successfully taken.")
# clean up and close the browser
driver.quit()
Well done, you've just captured a full-page screenshot in Chrome!
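The two approaches described in this section (Firefox's native method and the window-resize workaround) can be combined into one function. The sketch below assumes a hypothetical helper named save_full_page, which is not part of Selenium. It uses the native method when the driver exposes it and falls back to resizing otherwise. As a precaution, it measures both document.body and document.documentElement and takes the larger value, since on some layouts the body alone can report less than the full page.

```python
# JavaScript that returns [width, height] of the whole document
PAGE_SIZE_JS = """
return [
  Math.max(document.body.scrollWidth, document.documentElement.scrollWidth),
  Math.max(document.body.scrollHeight, document.documentElement.scrollHeight)
];
"""


def save_full_page(driver, path):
    """Save a full-page screenshot with whichever approach the driver supports.

    `save_full_page` is a hypothetical helper, not part of Selenium.
    """
    # Firefox drivers expose a native full-page method
    native = getattr(driver, "get_full_page_screenshot_as_file", None)
    if native is not None:
        return native(path)

    # otherwise (e.g. headless Chrome) grow the window to the page size
    width, height = driver.execute_script(PAGE_SIZE_JS)
    driver.set_window_size(width, height)
    return driver.save_screenshot(path)
```

Checking for the attribute rather than the browser name keeps the function independent of how the driver was created.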
Option 3: Capture a Screenshot of a Specific Element
Selenium also allows you to screenshot a specific element rather than the entire page. This is useful when focusing on a particular page item, such as charts, images, or forms.
Suppose you're interested in the product image above. You can grab its screenshot using the following steps.
Identify the product image selector. You may need to inspect the web page in a browser. You'll find that it's contained in a div tag of class woocommerce-product-gallery__image.
Using this information, select the product image. Then, grab the screenshot using the screenshot() method, which works similarly to save_screenshot().
# select the specific element
element = driver.find_element(By.CLASS_NAME, "woocommerce-product-gallery__image")
# capture the screenshot
element.screenshot("specific_element_screenshot.png")
Update the previous script with this snippet to get the final code for capturing the specific element screenshot.
from selenium import webdriver
from selenium.webdriver.common.by import By
# set up options to configure Chrome
options = webdriver.ChromeOptions()
# run in headless mode (no GUI)
options.add_argument("--headless=new")
# set window size
options.add_argument("--window-size=1920,1080")
# initialize the WebDriver with the specified options
driver = webdriver.Chrome(options=options)
# navigate to the target website
driver.get("https://www.scrapingcourse.com/ecommerce/product/abominable-hoodie/")
# select the specific element
element = driver.find_element(By.CLASS_NAME, "woocommerce-product-gallery__image")
print("Taking screenshot...")
# capture the screenshot
element.screenshot("specific_element_screenshot.png")
print("Screenshot successfully taken.")
# clean up and close the browser
driver.quit()
Your result will be the same as the screenshot above.
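If you'd rather handle the image in memory (for example, to validate it before saving), a WebElement also exposes the screenshot as bytes through its screenshot_as_png property. The following is a minimal sketch, assuming a hypothetical helper named save_element_png that checks the PNG signature before writing the file.

```python
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # first 8 bytes of every PNG file


def save_element_png(element, path):
    """Write an element screenshot to `path` after checking it is PNG data.

    `save_element_png` is a hypothetical helper; `element` is a Selenium WebElement.
    """
    # screenshot_as_png returns the image as bytes instead of writing a file
    data = element.screenshot_as_png
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("element screenshot is not valid PNG data")
    with open(path, "wb") as f:
        f.write(data)
    # return the number of bytes written
    return len(data)
```

In the script above, you'd call save_element_png(element, "specific_element_screenshot.png") instead of element.screenshot().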
Congratulations! You've covered all screenshot options.
However, even if your scraper is technically capable of taking screenshots, it's still at risk of getting blocked. Let's see how to deal with this hurdle.
Avoid Blocks and Bans While Taking Screenshots With Python
Modern websites increasingly employ advanced anti-bot systems that can flag your script and deny you access to website content. To scrape without getting blocked, you need to fortify your web scraper.
For instance, the screenshot scripts won't work for a G2 Reviews page. Here's what the result would look like.
That's because the web page uses advanced anti-bot protection, which blocks your request.
The best way to deal with such blocks and bans is using a web scraping API, such as ZenRows. ZenRows provides an effective solution for taking screenshots and scraping all website content uninterrupted. It automatically rotates premium proxies, optimizes your request headers, solves CAPTCHAs, and equips you with everything you need to avoid detection. What's more, it can successfully replace Selenium in your scraper.
To help you get started with ZenRows, below is a step-by-step guide on how to take a full-page screenshot of the G2 Reviews page that blocks the Selenium script.
Sign up to access the Request Builder page.
Input the target URL (https://www.g2.com/products/azure-sql-database/reviews) and activate Premium Proxies and the JS Rendering mode.
Choose Python as a language option and select the API mode. ZenRows will generate your request code.
Copy the generated code to your preferred code editor or IDE. You also need to install a Python HTTP client like Requests.
ZenRows offers a screenshot_fullpage parameter that allows you to grab a full-page screenshot with a single API call. Set this parameter to true and save the resulting image file.
You must also set screenshot=true to use the screenshot_fullpage parameter.
Here's the complete code.
# pip install requests
import requests
url = "https://www.g2.com/products/azure-sql-database/reviews"
apikey = "<YOUR_ZENROWS_API_KEY>"
params = {
    "url": url,
    "apikey": apikey,
    "js_render": "true",
    "premium_proxy": "true",
    "screenshot": "true",  # set screenshot parameter to true
    "screenshot_fullpage": "true",  # set screenshot_fullpage parameter to true
}
response = requests.get("https://api.zenrows.com/v1/", params=params)
# write the returned image bytes to a file
with open("g2_full_page_screenshot.png", "wb") as f:
    f.write(response.content)
print("Screenshot taken successfully")
Run it, and you'll get the full-page screenshot.
The image above is cropped, but you'll see a full-page screenshot in your folder.
Congratulations, you've successfully taken a screenshot of a heavily protected web page!
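The script above writes whatever the API returns to disk, so a failed request would leave you with an error message saved under a .png name. Below is a minimal sketch of a more defensive version. It relies on two hypothetical helpers, build_screenshot_params and save_image_response, and on the assumption that a successful screenshot call responds with HTTP 200 and an image/* Content-Type header; verify that against the actual responses you get.

```python
def build_screenshot_params(url, apikey, full_page=True):
    """Build the query parameters used in the request above (hypothetical helper)."""
    params = {
        "url": url,
        "apikey": apikey,
        "js_render": "true",
        "premium_proxy": "true",
        "screenshot": "true",
    }
    if full_page:
        # screenshot_fullpage only works together with screenshot=true
        params["screenshot_fullpage"] = "true"
    return params


def save_image_response(response, path):
    """Write the response body to `path` only if it looks like an image.

    Hypothetical helper. Assumes a successful call answers with HTTP 200
    and an image/* Content-Type.
    """
    content_type = response.headers.get("Content-Type", "")
    if response.status_code != 200 or not content_type.startswith("image/"):
        raise RuntimeError(
            f"no screenshot returned: HTTP {response.status_code}, {content_type!r}"
        )
    with open(path, "wb") as f:
        f.write(response.content)
    return path


# usage with Requests (needs a valid API key):
# import requests
# params = build_screenshot_params(url, apikey)
# response = requests.get("https://api.zenrows.com/v1/", params=params)
# save_image_response(response, "g2_full_page_screenshot.png")
```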
Conclusion
In this article, you've learned to take three types of screenshots while web scraping using Python and Selenium.
However, you must remember that websites increasingly employ advanced restrictions that can block your request. To scrape any website and take screenshots without getting blocked, try ZenRows for free today.