Cloudflare JS Challenge: How It Works and How to Solve It

Idowu Omisola
October 24, 2024 · 9 min read

Is your scraper getting stuck on Cloudflare’s “Just a moment...” page? That usually means it’s failing the JavaScript challenge, a common defense used to block bots.

We’ll break down how the challenge works and show you two effective ways to bypass it using Python.

What Is Cloudflare JS Challenge?

scrapingcourse cloudflare blocked screenshot

The Cloudflare JavaScript challenge runs in the background when you visit a protected site. It shows a brief loading screen while checking for bots through headers, IP reputation, and browser behavior. Scrapers often get stuck here if they use outdated tools or don’t mimic real browsers well.

How Is Cloudflare JS Challenge Different Than Other Challenges?

Unlike visible CAPTCHAs that ask users to solve puzzles, Cloudflare’s JavaScript challenge runs silently in the background. It doesn’t require any interaction but still verifies that a real browser is being used.

Cloudflare also offers a more adaptive protection called a Managed Challenge. Instead of serving a pre-determined challenge, it chooses the type and complexity of the challenge based on the characteristics of the incoming request. The result may be a non-interactive check or an interactive one that requires clicking a checkbox.

Breaking Down the Cloudflare JS Challenge

This section will detail the moving parts of Cloudflare's JS challenge so you can understand what your web scraper is up against.

JavaScript Execution

When you visit a Cloudflare-protected website, the security measure injects a JavaScript file into your browser and prompts it to execute that script within a specified timeframe. You're now on an interstitial page waiting for the browser to complete this action.

Once the script completes, you gain access to the target website and can scrape the data you want. Otherwise, Cloudflare's bot management system keeps you on the interstitial page until you pass the challenge.

Environment and Fingerprinting

While your browser runs the challenge script, Cloudflare scans various parts of your browser environment to create a unique fingerprint, similar to how PerimeterX and Imperva (formerly Incapsula) detect automated traffic. Initially, it checks critical request header properties like the User Agent for missing or incomplete information.

For instance, a mismatch between your User Agent platform and the client hint platform can cause your scraper to fail Cloudflare's JavaScript challenge.
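To make the mismatch concrete, here is a minimal sketch of a self-check you could run on your scraper's headers before sending them. The function names and the marker list are illustrative assumptions, not Cloudflare's actual logic; the sketch only compares the operating system claimed by the User-Agent with the value of the Sec-CH-UA-Platform client hint.

```python
def platform_from_user_agent(user_agent):
    """Guess the OS platform that a User-Agent string claims (hypothetical helper)."""
    # order matters: Android User-Agents also contain "Linux"
    markers = [
        ("Android", "Android"),
        ("Windows NT", "Windows"),
        ("Macintosh", "macOS"),
        ("Linux", "Linux"),
    ]
    for marker, platform in markers:
        if marker in user_agent:
            return platform
    return None


def platform_hints_match(headers):
    """Return True if User-Agent and Sec-CH-UA-Platform name the same OS."""
    lowered = {key.lower(): value for key, value in headers.items()}
    hint = lowered.get("sec-ch-ua-platform")
    if hint is None:
        # no client hint sent, so there is nothing to contradict
        return True
    ua_platform = platform_from_user_agent(lowered.get("user-agent", ""))
    # the client hint value is sent wrapped in double quotes
    return ua_platform == hint.strip('"')
```

For example, a Windows User-Agent paired with a Sec-CH-UA-Platform value of "Linux" returns False, which is the kind of inconsistency worth fixing before the request goes out.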

The challenge also probes the browser for other fingerprints, including navigator.webdriver, navigator.pdfViewerEnabled, navigator.plugins, WebGL support, Canvas rendering capability, and more. Any missing information can give you away as a bot and prevent your scraper from getting past the interstitial page.

Timing and Behavioral Analysis

Cloudflare's JS challenge goes beyond fingerprinting. Cloudflare can also analyze behavioral patterns, such as cursor movement, clicking and scrolling behavior, to determine if the user is a bot. 

Your scraper can fail the JavaScript challenge if it exhibits unusual behavior, such as a stationary cursor or filling out a form faster than usual. Behavioral checks are common when the user has to pass a Cloudflare challenge before taking specific actions like submitting a form.

See an example below:

Form Challenge Example

Cookies

Cloudflare typically uses cookies to track session activities during the JavaScript challenge. When your request passes the JS challenge, Cloudflare's bot management system sets a cookie, such as cf_clearance, to mark your browser as verified. It then checks for that cookie in subsequent requests to determine the client's or browser's legitimacy.

We can demonstrate how a browser handles Cloudflare's cookie using a simple request from Python's Requests library. 

You first create a session with a cookie and access a website with that session. Then, you persist that session for subsequent requests to the same website. 

Let's see how that works using the same session cookie across three requests to https://httpbin.io/cookies, a test website that returns your current cookie:

Example
# pip3 install requests
import requests

# create a session
session = requests.Session()

# specify the target website
url = "https://httpbin.io/cookies"

# set the desired cookie
session.cookies.set("test_cookie", "cookie_value")

# send the request with the cookie in the header
response = session.get(url)
print("Response after the first request:")
print(response.json())

# send the second request and check if it uses the previous cookie
response = session.get(url)
print("\nResponse after the second request:")
print(response.json())

# send the third request and check if it persists the cookie
response = session.get(url)
print("\nResponse after the third request:")
print(response.json())

The code outputs the same cookie for all three requests, confirming that the session persists the cookie across requests:

Output
Response after the first request:
{'test_cookie': 'cookie_value'}

Response after the second request:
{'test_cookie': 'cookie_value'}

Response after the third request:
{'test_cookie': 'cookie_value'}

Tracking the session cookie allows Cloudflare to skip future challenges for verified users. However, if the cookie hasn't expired but is missing in subsequent requests, Cloudflare may assume the request is suspicious and block it. 

Generally, Cloudflare's cookies aim to reduce friction for legitimate users who may revisit the website within the cookie's validity period. 
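If a real browser has already passed the challenge, one way to benefit from the clearance cookie is to carry the browser's cookies over to later HTTP requests. The sketch below is an illustration under assumptions: build_cookie_header is a hypothetical helper, and the input is a list of dictionaries shaped like the output of Selenium's driver.get_cookies() (with name, value, and an optional expiry timestamp). It skips expired cookies and joins the rest into a Cookie header value.

```python
import time


def build_cookie_header(cookies, now=None):
    """Join unexpired browser cookies into a single Cookie header value."""
    now = time.time() if now is None else now
    pairs = []
    for cookie in cookies:
        # Selenium reports expiry as a Unix timestamp; session cookies have none
        expiry = cookie.get("expiry")
        if expiry is not None and expiry <= now:
            continue
        pairs.append(f"{cookie['name']}={cookie['value']}")
    return "; ".join(pairs)
```

You could then pass the result as the Cookie header of a Requests session. Keep the same User-Agent and IP address as the browser that solved the challenge, since the clearance cookie is commonly reported to stop working when either one changes.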

IP Address

Some websites explicitly block specific IPs due to regional preference, low IP reputation, or as part of other security measures. Depending on the website's rules, Cloudflare sometimes challenges the traffic coming from a client to determine whether its IP address is on the allowed or disallowed list.

An IP that broke previous rate-limiting rules may also enter the list of disallowed IPs. Bots typically send multiple requests from the same IP, making it easy for Cloudflare to detect them via the JavaScript challenge.

Even after the JS challenge runs, Cloudflare may block traffic from your machine if your IP address has a low reputation or comes from a blocked region.

Redirection

Redirection is one of the ways Cloudflare tests the patience of automated requests. After successfully solving the JavaScript challenge, Cloudflare redirects the client to the target website. This redirection acts as an additional integrity check, increasing the overall time required to complete the challenge process.

So, even if a scraper or bot successfully solves the JavaScript challenge, it may terminate an ongoing request before the target website fully loads. This premature termination can lead to failed scraping operations or the retrieval of incorrect data.
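One way to avoid cutting the request short is to poll until the interstitial page is gone before reading any data. The following is a minimal sketch, not a guaranteed detection method: wait_for_challenge_to_clear is a hypothetical helper, and it assumes the interstitial can be recognized by the "Just a moment" page title mentioned at the start of this article.

```python
import time

# title text of the interstitial page (an assumption; it may vary by site or locale)
CHALLENGE_TITLE = "Just a moment"


def wait_for_challenge_to_clear(get_title, timeout=30.0, interval=0.5,
                                sleep=time.sleep, clock=time.monotonic):
    """Poll the page title until it no longer looks like the interstitial.

    Returns True once the title changes, or False if the timeout expires first.
    """
    deadline = clock() + timeout
    while True:
        title = get_title()
        if CHALLENGE_TITLE.lower() not in title.lower():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

With a Selenium-style driver, you could call it as wait_for_challenge_to_clear(lambda: driver.title) and only start extracting data when it returns True.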

Now that you understand how Cloudflare's JavaScript challenge works, how can you ensure your scraper solves it efficiently to avoid getting blocked?

How to Solve the Cloudflare JS Challenge

You'll now learn two proven ways to solve the Cloudflare JavaScript challenge in Python. 

We'll demonstrate the effectiveness of each method using the Cloudflare challenge page.

Use SeleniumBase With Python

You can only pass the JavaScript challenge with JavaScript enabled. So, regular HTTP clients, such as Python's Requests or Node.js' Axios, won't work. 
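Before switching tools, it helps to confirm that a plain HTTP client really received the challenge rather than some other error. Here is a rough sketch of such a check. The helper name is hypothetical, and the fallback heuristic (a 403 or 503 status from a Cloudflare server with "Just a moment" in the body) is an assumption; the cf-mitigated: challenge response header is the signal Cloudflare documents for detecting challenge pages.

```python
def looks_like_cloudflare_challenge(status_code, headers, body):
    """Heuristically decide whether an HTTP response is a Cloudflare challenge page."""
    lowered = {key.lower(): value.lower() for key, value in headers.items()}

    # header Cloudflare documents for identifying challenge responses
    if lowered.get("cf-mitigated") == "challenge":
        return True

    # fallback heuristic (assumption): blocked status + Cloudflare server + interstitial text
    served_by_cloudflare = "cloudflare" in lowered.get("server", "")
    blocked_status = status_code in (403, 503)
    return served_by_cloudflare and blocked_status and "just a moment" in body.lower()
```

With Requests, you could call it as looks_like_cloudflare_challenge(response.status_code, response.headers, response.text).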

You need a scraping tool that can execute JavaScript and bypass the Cloudflare JavaScript challenge. That's where browser automation libraries come in handy. 

However, plain browser automation tools like Selenium and Playwright won't work on their own because they leak bot-like properties, such as the presence of an automated WebDriver, a HeadlessChrome User Agent in headless mode, missing renderers, and more.

SeleniumBase is one of the best open-source options for bypassing Cloudflare with Selenium. It's a browser automation library that allows you to plug in the Undetected ChromeDriver to run Selenium in stealth mode and appear as a human.

To start, install SeleniumBase using pip:

Terminal
pip3 install seleniumbase

Let's use it to access the Cloudflare challenge page, an example of a Cloudflare-protected site that triggers the JavaScript challenge.

Create a driver instance that runs with the Undetected ChromeDriver in non-headless (GUI) mode. Then, visit the target site and take a screenshot. 

The code uses the SeleniumBase built-in uc_gui_click_captcha() method to check for a forced Cloudflare CAPTCHA and click it (CAPTCHA clicking only works in GUI mode). Here's a script to do that:

Example
# import the Driver class from seleniumbase
from seleniumbase import Driver

# initialize driver with UC mode enabled in GUI mode
driver = Driver(uc=True, headless=False)

# set target URL
url = "https://www.scrapingcourse.com/cloudflare-challenge"

# open URL using UC mode with 6 second reconnect time to bypass initial detection
driver.uc_open_with_reconnect(url, reconnect_time=6)

# pause for 10 seconds to give the challenge page time to load
driver.sleep(10)

# attempt to bypass CAPTCHA if present using UC mode's built-in method
driver.uc_gui_click_captcha()

# take a screenshot of the current page and save it
driver.save_screenshot("cloudflare-challenge.png")

# close the browser and end the session
driver.quit()

The script saves a screenshot of the page, confirming that SeleniumBase passed the JavaScript challenge step:

cloudflare-challenge-passed

That works! However, this approach depends on non-headless mode because SeleniumBase can only simulate the CAPTCHA click in GUI mode. It doesn't work in headless mode, especially against advanced Cloudflare implementations that force you to click the CAPTCHA box.

Besides, SeleniumBase is unsuitable for scraping multiple websites because each running browser instance adds significant memory overhead. As an open-source project, it may also lag behind Cloudflare's frequent security updates.

Fortunately, there's a way out of all these limitations. Find out below!

Use Scraper API to Solve Cloudflare JS Challenge

A scraper API is a tool that helps you handle web scraping tasks, including bypassing anti-bot measures behind the scenes. 

One of the top solutions is the ZenRows scraper API, an all-in-one scraping toolkit featuring anti-bot auto-bypass with advanced JavaScript rendering support to pass Cloudflare's JavaScript challenge. 

ZenRows also helps you manage other tasks, such as premium proxy rotation, geolocation, advanced fingerprint spoofing, and more. With these features, you can focus on core scraping logic while ZenRows handles the Cloudflare challenge under the hood. All it takes is a single API call. 

Let's see how the ZenRows scraper API works by scraping the Cloudflare challenge page.

Sign up to open the ZenRows Request Builder. Paste the target URL in the link box, and activate Premium Proxies and JS Rendering. Choose Python as your programming language and select the API connection mode.

Copy and paste the generated code into your Python file.

building a scraper with zenrows

The generated Python code should look like this:

Example
# pip install requests
import requests

url = "https://www.scrapingcourse.com/cloudflare-challenge"
apikey = "<YOUR_ZENROWS_API_KEY>"
params = {
    "url": url,
    "apikey": apikey,
    "js_render": "true",
    "premium_proxy": "true",
}
response = requests.get("https://api.zenrows.com/v1/", params=params)
print(response.text)

Run the above code, and you'll get the protected website's full-page HTML:

Output
<html lang="en">
<head>
    <!-- ... -->
    <title>Cloudflare Challenge - ScrapingCourse.com</title>
    <!-- ... -->
</head>
<body>
    <!-- ... -->
    <h2>
        You bypassed the Cloudflare challenge! :D
    </h2>
    <!-- other content omitted for brevity -->
</body>
</html>

Easy-peasy🎉! You just bypassed Cloudflare's JS challenge using the ZenRows scraper API. Beyond Cloudflare, the scraper API has powerful evasion capabilities to bypass any web application firewall at scale.
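Once you have the HTML, you can confirm the bypass programmatically instead of reading the output by eye. This is a small sketch using only Python's standard library; HeadingExtractor and extract_h2_headings are hypothetical names, and the check assumes the success message sits in an h2 element, as in the output above.

```python
from html.parser import HTMLParser


class HeadingExtractor(HTMLParser):
    """Collect the text content of every <h2> element."""

    def __init__(self):
        super().__init__()
        self._in_h2 = False
        self.headings = []

    def handle_starttag(self, tag, attrs):
        if tag == "h2":
            self._in_h2 = True
            self.headings.append("")

    def handle_endtag(self, tag):
        if tag == "h2":
            self._in_h2 = False

    def handle_data(self, data):
        # text may arrive in several chunks, so append to the current heading
        if self._in_h2:
            self.headings[-1] += data


def extract_h2_headings(html):
    """Return the stripped text of all <h2> headings in an HTML string."""
    parser = HeadingExtractor()
    parser.feed(html)
    return [heading.strip() for heading in parser.headings]
```

Calling extract_h2_headings(response.text) on the response above should return a list containing the success message, which you can assert on in an automated run.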

Conclusion

You've learned how Cloudflare's JS challenge works, along with two solid methods to bypass it. While an open-source solution like SeleniumBase is cheaper and can work for simple protections, it can't bypass advanced Cloudflare security. Plus, its high memory demand makes it hard to scale.

We recommend using ZenRows to avoid these limitations. It has all the anti-bot bypass features needed to get through anti-bot challenges reliably and efficiently at scale. ZenRows is also fast and requires minimal resources, making it highly scalable.

Try ZenRows for free now without a credit card!
