Are you looking for the best way to capture screenshots while testing a website or web scraping with Watir?
You're in the right place! In this article, you'll learn three ways of taking screenshots with Watir and Ruby:
- Generating a screenshot for the visible part of the screen.
- Capturing a full-page screenshot.
- Creating a screenshot of a specific element.
Let's go!
How to Take a Screenshot With Watir?
Watir (Web Application Testing in Ruby) is an open-source Ruby library for automating web browsers. It provides several functionalities for interacting with web elements, including capturing screenshots. Using Watir, you can screenshot the visible part of the page, the entire page, or specific elements.
First, set up a base script that opens a website with Watir driving Chrome in headless mode. We'll modify it later to take screenshots. For this tutorial, we'll use ScrapingCourse as a demo target website.
Run the following command in the terminal to install the Watir gem:
gem install watir
Once you've installed Watir, import it into your script and initialize a Chrome browser in headless mode. Then, navigate to the ScrapingCourse e-commerce page, grab its HTML content, print it, and close the browser.
require 'watir'
# initialize the browser
browser = Watir::Browser.new :chrome, headless: true
# navigate to the URL
url = 'https://www.scrapingcourse.com/ecommerce/'
browser.goto(url)
# get the page content
page_content = browser.html
puts page_content
# close the browser
browser.close
This code prints the full HTML of the target page. We've formatted and abridged the output to make it more readable:
<html lang="en-US">
<head>
<!--- ... --->
<title>Ecommerce Test Site to Learn Web Scraping - ScrapingCourse.com</title>
<!--- ... --->
</head>
<body class="home archive ...">
<p class="woocommerce-result-count">Showing 1-16 of 188 results</p>
<ul class="products columns-4">
<!--- ... --->
<li>
<h2 class="woocommerce-loop-product__title">Abominable Hoodie</h2>
<span class="price">
<span class="woocommerce-Price-amount amount">
<bdi>
<span class="woocommerce-Price-currencySymbol">$</span>69.00
</bdi>
</span>
</span>
<a aria-describedby="This product has multiple variants. The options may ...">Select options</a>
</li>
<!--- ... other products omitted for brevity --->
</ul>
</body>
</html>
Awesome! Your Watir script is now ready to integrate the screenshot-capturing functionalities.
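One optional refinement to the base script: if an error occurs between opening the browser and calling browser.close, the Chrome process can be left running. The sketch below wraps the open/close steps in a helper that uses Ruby's ensure clause. The with_browser helper and the FakeBrowser stand-in are hypothetical names introduced for illustration only; in a real script you would pass -> { Watir::Browser.new(:chrome, headless: true) } as the factory.

```ruby
# Runs the given block with a browser and always closes it afterwards.
# `factory` is any callable returning an object that responds to #close,
# e.g. -> { Watir::Browser.new(:chrome, headless: true) }.
def with_browser(factory)
  browser = factory.call
  yield browser
ensure
  # `ensure` runs even when the block raises; `&.` covers a failed factory call.
  browser&.close
end

# Hypothetical stand-in so the sketch runs without Chrome installed.
class FakeBrowser
  attr_reader :closed

  def initialize
    @closed = false
  end

  def html
    '<html></html>'
  end

  def close
    @closed = true
  end
end

fake = FakeBrowser.new
content = with_browser(-> { fake }) { |b| b.html }
puts content
puts "closed: #{fake.closed}"
```

The helper returns whatever the block returns, so the page content is still available after the browser has been closed.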
Option 1: Generate a Screenshot for the Visible Part of the Screen
The part of a web page you can see after the page loads and before you scroll down is called the viewport, or simply the visible part of the screen.
We'll capture a viewport screenshot of a product page from the demo website.
You can capture viewport screenshots using Watir's built-in screenshot method. Modify the previous base script and include this method to save the screenshot to an image file.
Here's what your modified script should look like:
require 'watir'
# initialize the browser
browser = Watir::Browser.new :chrome, headless: true
# navigate to the URL
url = 'https://www.scrapingcourse.com/ecommerce/product/abominable-hoodie/'
browser.goto(url)
# save screenshot to file
browser.screenshot.save 'viewport_screenshot.png'
# close the browser
browser.close
Good job! You just took a screenshot of the viewport in Ruby using the Watir library.
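If you capture several pages, a fixed name like viewport_screenshot.png gets overwritten on every run. As a minimal sketch, the hypothetical helper below derives a file name from the last path segment of the URL plus a timestamp. The naming scheme is an assumption made for illustration, not something Watir prescribes.

```ruby
require 'uri'

# Builds a file name such as "abominable-hoodie_20240102-030405.png" from the
# last path segment of the URL plus a timestamp (naming scheme is an assumption).
def screenshot_filename(url, time = Time.now)
  segments = URI.parse(url).path.split('/').reject(&:empty?)
  slug = segments.last || 'index' # fall back when the URL has no path segments
  "#{slug}_#{time.strftime('%Y%m%d-%H%M%S')}.png"
end

name = screenshot_filename(
  'https://www.scrapingcourse.com/ecommerce/product/abominable-hoodie/',
  Time.new(2024, 1, 2, 3, 4, 5)
)
puts name
```

You could then pass the result to the save call from the script above, for example browser.screenshot.save(screenshot_filename(url)).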
Now, let's grab a full-page screenshot!
Option 2: Capture a Full-Page Screenshot
A full-page screenshot captures the entire web page in a single image, including both the visible portion (viewport) and the part that you need to scroll to view.
We'll grab a full-page screenshot of the same product page.
Watir doesn't provide any built-in method for capturing full-page screenshots. You need to use the watir-screenshot-stitch extension for this.
Install the extension by running the following command in the terminal:
gem install watir-screenshot-stitch
This library uses the html2canvas JavaScript library to render the entire page onto a canvas element.
In the script below, the base64_canvas method returns that rendering as a Base64-encoded string. The script then opens a file in binary write mode and writes the decoded data to it, resulting in a saved image of the full web page.
Here's how your modified base script should look after integrating this library:
require 'watir'
require 'watir-screenshot-stitch'
require 'base64' # provides Base64.decode64 used below
# initialize the browser
browser = Watir::Browser.new :chrome, headless: true
# navigate to the URL
url = 'https://www.scrapingcourse.com/ecommerce/product/abominable-hoodie/'
browser.goto(url)
# save screenshot to file
png = browser.screenshot.base64_canvas
path = "full_page_screenshot.png"
File.open(path, 'wb') { |f| f.write(Base64.decode64(png)) }
# close the browser
browser.close
Good job! You extended Watir's capabilities to take a full-page screenshot of the target website.
However, it's important to note that watir-screenshot-stitch has some limitations. As per the official documentation, it can't display certain types of elements due to the inherited limitations of html2canvas.
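The decode-and-write step in the script above can be isolated and checked without a browser. The sketch below is an illustration under stated assumptions: write_base64_png is a hypothetical helper, and the sample data is only the 8-byte PNG signature followed by placeholder bytes, not a viewable image. It decodes a Base64 string, verifies that the result starts with the PNG signature, and writes it in binary mode.

```ruby
require 'base64'
require 'tmpdir'

# The 8 bytes every PNG file starts with.
PNG_SIGNATURE = "\x89PNG\r\n\x1A\n".b

# Decodes a Base64 string and writes it to `path`, raising if the decoded
# bytes do not start with the PNG signature. Returns the number of bytes written.
def write_base64_png(encoded, path)
  bytes = Base64.decode64(encoded)
  raise ArgumentError, 'decoded data is not a PNG' unless bytes.start_with?(PNG_SIGNATURE)

  File.binwrite(path, bytes)
end

# Placeholder payload: a PNG signature plus filler, standing in for real image data.
sample = Base64.strict_encode64(PNG_SIGNATURE + 'placeholder image bytes')
path = File.join(Dir.tmpdir, 'base64_demo.png')
written = write_base64_png(sample, path)
puts "wrote #{written} bytes to #{path}"
```

A signature check like this can catch the case where the string you received is not image data at all, such as an error message, before it ends up in a .png file.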
Option 3: Create a Screenshot of a Specific Element
To generate a screenshot of a specific element of the web page, you need to point your scraper to the target element. Let's capture the product summary element of a product page.
Watir doesn't have built-in functionality for capturing screenshots of specific elements. To integrate this functionality, you'll have to use the watir-extensions-element-screenshot library.
Run the following command to install it:
gem install watir-extensions-element-screenshot
Disable your scraper's headless mode and maximize the browser window to use this library without errors.
The target product summary element is enclosed within a div tag with the entry-summary class. We'll use this class attribute to locate the target in our scraper.
Here's your modified base code after integrating the watir-extensions-element-screenshot library:
require 'watir'
require 'watir/extensions/element/screenshot'
# initialize the browser
browser = Watir::Browser.new :chrome
# maximize the browser window
screen_width = browser.execute_script('return screen.width;')
screen_height = browser.execute_script('return screen.height;')
browser.driver.manage.window.resize_to(screen_width, screen_height)
browser.driver.manage.window.move_to(0, 0)
# navigate to the URL
url = 'https://www.scrapingcourse.com/ecommerce/product/abominable-hoodie/'
browser.goto(url)
# save screenshot to file
browser.div(:class => 'entry-summary').screenshot('specific_element_screenshot.png')
# close the browser
browser.close
Running this code saves a screenshot, but the output differs from the element we tried to capture. This happens because the library isn't actively maintained and fails to crop the image to the element.
Unfortunately, such limitations are common with external libraries, making them unreliable for large-scale web scraping.
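To make the missing cropping step concrete, here's a minimal sketch of the arithmetic involved: given an element's position and size and the dimensions of a viewport screenshot, it computes the rectangle to cut out, clamped to the image bounds. This is not the extension's implementation. The crop_box helper, the CropBox struct, and the sample numbers are all hypothetical, and the actual cropping would still need an image library of your choice.

```ruby
# Rectangle to crop out of a viewport screenshot, in image pixels.
CropBox = Struct.new(:x, :y, :width, :height)

# Computes the crop rectangle for an element. `pixel_ratio` scales CSS pixels to
# image pixels (for example 2.0 on high-density displays; assumed 1.0 by default).
def crop_box(element_x, element_y, element_w, element_h, image_w, image_h, pixel_ratio = 1.0)
  left   = (element_x * pixel_ratio).floor.clamp(0, image_w)
  top    = (element_y * pixel_ratio).floor.clamp(0, image_h)
  right  = ((element_x + element_w) * pixel_ratio).ceil.clamp(0, image_w)
  bottom = ((element_y + element_h) * pixel_ratio).ceil.clamp(0, image_h)
  CropBox.new(left, top, right - left, bottom - top)
end

# Made-up numbers: an element that extends below a 1280x720 viewport.
box = crop_box(700, 300, 500, 600, 1280, 720)
puts box.to_a.inspect
```

In this example the element is taller than the remaining viewport, so the box is cut off at the bottom edge of the image. That clipping is one reason element screenshots taken from a viewport capture can come out incomplete.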
Let's see an effective method to capture screenshots in the next section.
Avoid Blocks and Bans While Taking Screenshots With Watir
Websites with anti-bot protection systems are one of the biggest web scraping hurdles. They can prevent you from taking screenshots or even put you at risk of permanent bans. You need to bypass these systems to scrape without getting blocked.
For instance, the above methods for taking screenshots will not work for heavily protected websites like G2 Reviews.
See for yourself. Take the viewport screenshot script and replace the target URL with a G2 Reviews page.
require 'watir'
# initialize the browser
browser = Watir::Browser.new :chrome, headless: true
# navigate to the URL
url = 'https://www.g2.com/products/asana/reviews'
browser.goto(url)
# save screenshot to file
browser.screenshot.save 'g2_blocked_screenshot.png'
# close the browser
browser.close
Instead of the G2 Reviews page, the saved screenshot shows that the request was blocked.
The best solution to avoid this block is to use a web scraping API like ZenRows. It acts as a headless browser and automatically integrates auto-rotating premium proxies, optimized headers, anti-CAPTCHAs, and more technologies that help avoid blocks and bans.
ZenRows provides a complete solution for capturing any type of screenshots without the risk of being blocked. Whether you need to take a viewport screenshot, capture the entire page, or generate a screenshot of a specific element, ZenRows' screenshot feature will handle it for you.
Let's capture a viewport screenshot of the same G2 Reviews page that blocked us previously.
Sign up to open the Request Builder. Paste the target URL in the link box, activate Premium Proxies, and toggle on JS Rendering. Select Ruby as your preferred language and click the API request mode. Then, copy and paste the generated code into your script.
The generated code uses Ruby's Faraday library to make HTTP requests. Install it using the following command:
gem install faraday
Modify the generated code to include the &screenshot=true parameter in the API endpoint.
Here's the final script to capture a viewport screenshot of an anti-bot protected page using ZenRows:
# gem install faraday
require 'faraday'
url = URI.parse('https://api.zenrows.com/v1/?apikey=<YOUR_ZENROWS_API_KEY>&url=https%3A%2F%2Fwww.g2.com%2Fproducts%2Fasana%2Freviews&js_render=true&premium_proxy=true&screenshot=true')
# initialize Faraday connection with a 180-second timeout
conn = Faraday.new()
conn.options.timeout = 180
# make GET request and store response
res = conn.get(url, nil, nil)
# write response body to a file
File.open('g2_screenshot.png', 'wb') { |file| file.write(res.body) }
You'll get the following output on running this code:
Congratulations! You bypassed an anti-bot protected website and successfully took a screenshot.
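The API endpoint in the script above contains a hand-encoded target URL, which is easy to get wrong when you change the target. As a minimal sketch, the hypothetical zenrows_screenshot_url helper below builds the same query string with Ruby's standard URI.encode_www_form. The parameter names are the ones used in the script above, and the API key is a placeholder.

```ruby
require 'uri'

# Builds the request URL from its parts and lets the standard library
# percent-encode the target URL.
def zenrows_screenshot_url(api_key, target_url)
  uri = URI.parse('https://api.zenrows.com/v1/')
  uri.query = URI.encode_www_form(
    apikey: api_key,
    url: target_url,
    js_render: 'true',
    premium_proxy: 'true',
    screenshot: 'true'
  )
  uri.to_s
end

request_url = zenrows_screenshot_url(
  'YOUR_ZENROWS_API_KEY', # placeholder, replace with your own key
  'https://www.g2.com/products/asana/reviews'
)
puts request_url
```

The resulting string can be passed to conn.get in place of the hard-coded URL.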
Conclusion
In this tutorial, you've learned three methods of taking screenshots with Watir:
- Capturing an above-the-fold screenshot of the web page.
- Getting a full-page screenshot, including the parts beyond the viewport.
- Screenshotting a specific web element.
You've also seen Watir's screenshot limitations, including its vulnerability to anti-bot measures. To avoid getting blocked, use ZenRows, an all-in-one tool for web scraping and screenshot capturing. Try ZenRows for free!