Do you want to bypass DataDome's security measures while web scraping with Selenium? You're in the right place.
In this article, you'll learn how DataDome works and explore the five best methods to bypass it with Selenium. After reading, you'll know how to:
- Use ChromeDriver for stealth mode.
- Add premium proxies.
- Integrate a web scraping API into your workflow.
- Optimize your User Agent.
- Use Selenium Stealth.
Let's go!
What Is DataDome and How Does It Work?
DataDome is a web security solution that protects websites against digital threats, including account takeover, DDoS attacks, ad and payment fraud, and more. It also detects web scraping activities and can block your scraper from extracting your target data.
DataDome detects bot-like activities by monitoring IP addresses, inspecting the request headers, and analyzing usage behavior such as mouse movement, navigation, or click patterns. Since it uses advanced techniques, like TLS fingerprinting or machine learning, bypassing DataDome is very challenging.
Is Base Selenium Enough to Bypass DataDome?
Selenium's ability to execute JavaScript with a headless browser makes it a great tool for web scraping. However, Selenium can't handle DataDome's advanced machine learning and fingerprinting measures, so scraping protected websites requires a few extra steps.
See for yourself how Selenium gets blocked while accessing a DataDome-protected website. Let's take Best Western as an example:
Try accessing and screenshotting the page with the following Python code to see DataDome's response:
# import the required libraries
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
# set up Chrome in headless mode
chrome_options = Options()
chrome_options.add_argument("--headless")
# start a WebDriver instance
driver = webdriver.Chrome(options=chrome_options)
# visit the target web page
driver.get("https://www.bestwestern.com/")
# take a screenshot of the web page
driver.save_screenshot("best-western-screenshot.png")

# close the browser
driver.quit()
The code returns the following screenshot, indicating that Selenium got blocked while accessing the DataDome-protected web page:
Luckily, there are a few handy ways to deal with this problem. You'll learn them all in the next section.
5 Methods to Bypass DataDome With Selenium
Selenium easily gets blocked when trying to access DataDome-protected websites, but there are ways to make it work. Let's go through five methods of enhancing Selenium's bypassing abilities.
Method #1: Use Undetected ChromeDriver for Stealth Mode
The Undetected ChromeDriver is an optimized driver for bypassing anti-bot detection in Selenium. It removes anti-bot properties from Selenium, increasing your chance of bypassing fingerprinting tests during web scraping.
When used locally, the Undetected ChromeDriver handles all the top anti-bots, including DataDome, Cloudflare, Imperva, and Botprotect. However, it's generally ineffective against these anti-bots in a production environment.
Check out our article on Undetected ChromeDriver for a detailed tutorial.
Method #2: Use Premium Proxies
Proxies change your IP address, making the server think the request is coming from a different location. This technique can help you bypass DataDome's anti-bot measures, like IP bans caused by rate limiting.
Choose your proxies wisely. Websites like Free Proxy List offer free proxies, but they're unreliable for larger web scraping tasks due to their short life span.
The best option is to use premium web scraping proxies. Premium proxies generally require authentication with credentials like usernames and passwords, adding an extra layer of security and ensuring that only authorized users have access to the service.
Most premium proxy services also offer proxy rotation. This means your IP will be switched at specific intervals, enhancing your scraper's bypassing ability.
See our detailed tutorial on using a proxy with Selenium to learn more about applying this method.
It's advisable to complement this method with User Agent configuration to enhance your scraper's ability to evade blocks.
Method #3: Integrate a Web Scraping API Into Your Workflow
A web scraping API is a great solution for evading anti-bot detection during content extraction.
An example of such a tool is ZenRows. With ZenRows, you don't need to bother configuring proxies, optimizing your request headers, or rotating the User Agent manually because it deals with all of these automatically. It also successfully bypasses all CAPTCHAs and other anti-bot systems, including DataDome, Akamai, and Cloudflare, among many others.
What's more, ZenRows features JavaScript instructions, so it can act as a headless browser for scraping content from dynamic pages, e.g., using infinite scrolling. All these features make ZenRows a complete web scraping toolkit.
Let's use ZenRows to scrape the DataDome-protected website that blocked you previously to see how it works.
Sign up to open the ZenRows Request Builder. Paste the target URL in the link box, toggle on the Boost mode to JS Rendering, and activate Premium Proxies. Choose Python as your programming language and select the API connection mode. Copy and paste the generated code into your Python file.
Here's a slightly modified version of the generated code. It accesses the protected website and extracts its full-page HTML:
# pip install requests
import requests
# specify the query parameters
params = {
"url": "https://www.bestwestern.com/",
"apikey": "<YOUR_ZENROWS_API_KEY>",
"js_render": "true",
"premium_proxy": "true",
}
# send the request
response = requests.get("https://api.zenrows.com/v1/", params=params)
# print the response data
print(response.text)
The output below shows the page title with omitted content:
<html lang="en-us">
<head>
<title>Best Western Hotels - Book Online For The Lowest Rate</title>
</head>
<body class="bestWesternContent bwhr-brand">
<header>
<!-- ... -->
</header>
<!-- ... -->
</body>
</html>
Congratulations! You've just bypassed a DataDome-protected website with ZenRows.
Method #4: Optimize Your User Agent
The User Agent describes the requesting browser or HTTP client to the server. Your scraper sends it with the request headers, and it details the client type, software vendor, and operating system version.
The User Agent is essential to web scraping because it helps the server differentiate between bots and real users. Selenium's default User Agent header contains bot-like strings, such as "HeadlessChrome" in headless mode, making it more vulnerable to anti-bot detection.
Changing Selenium's User Agent lets you mimic a legitimate browser and appear as a real user, increasing your chance of bypassing DataDome's protection.
Check our detailed guide on customizing the Selenium User Agent to learn more.
Method #5: Go With Selenium Stealth
Selenium Stealth is a plugin for avoiding anti-bot detection during content extraction. It patches missing fingerprints, such as the User Agent, graphics renderer (WebGL), platform, and vendor, and replaces the WebDriver with a legitimate Chrome browser.
Pairing Selenium with the Stealth plugin can help you bypass DataDome's complex anti-bot measures like browser fingerprinting.
However, the plugin has a few limitations. First, it only supports Chrome. Another potential challenge is that it struggles with DataDome's machine learning-based protection, resulting in occasional blocks.
If Stealth alone isn't enough, you can enhance its bypassing ability by pairing it with the Undetected ChromeDriver. Check the tutorial on using Selenium Stealth for web scraping to learn more.
Conclusion
In this article, you've explored five methods of bypassing DataDome's protection with Selenium. Tools like the Undetected ChromeDriver and Selenium Stealth help remove some bot-like signals from Selenium. Proxies change your apparent location, while User Agent configuration lets you mimic a real user. These methods are more effective when combined.
However, in more advanced cases where these methods won't work, we recommend integrating ZenRows as the ultimate solution to bypass DataDome and other anti-bot systems at scale, regardless of their complexity. Try ZenRows for free!