The cf_clearance cookie isn't just any cookie: it's the access key to Cloudflare-protected websites. Obtain that cookie, include it in your scraping request, and you can bypass Cloudflare. But how do you go about it?
We'll show you how to extract the cf_clearance cookie from a Cloudflare-protected site with the CF-Clearance-Scraper. You'll then learn how to use this cookie to bypass Cloudflare in your scraper.
Understanding Cloudflare Protection and cf_clearance Cookies
While Cloudflare usually blocks bots with a CAPTCHA box, it analyzes several other request parameters under the hood to detect them. These include a client's fingerprints, JavaScript-solving ability, behavioral patterns, IP address, network traffic logs, and many other data points that reveal whether or not a request is bot-like.
The rule is that a request must pass these challenges before it can access the site behind the Cloudflare firewall. Once passed, it gets a clearance cookie called cf_clearance, which serves as a pass key to the target site. The request must provide this cf_clearance cookie within the same browsing session. Otherwise, it gets blocked.
Since cf_clearance is bound to a session, a client must maintain the same User Agent and IP address used to obtain the cookie throughout the session. If you use an IP address or User Agent different from the one used to obtain the cookie, Cloudflare will block your scraper instantly.
Standard HTTP clients like Python's Requests library get blocked because they can't solve Cloudflare's initial challenges. That means they won't get the cf_clearance cookie required to access the target site during scraping.
However, you can manually obtain the cf_clearance cookie and pass it to a request within a session to bypass Cloudflare. Let's see how that works in the next section.
How to Scrape and Use cf_clearance Cookies
You'll learn how to scrape the cf_clearance cookie with CF-Clearance-Scraper, a command-line tool for retrieving the cf_clearance cookie from a Cloudflare-protected website.
You'll then use the scraped cookie to bypass Cloudflare protection with the Requests library.
Let's start with the installation step.
Step 1: Requirements and Installation
The CF-Clearance-Scraper package only supports Python 3.10+. It also runs a headless Chrome instance under the hood. So, ensure you install the latest Python and Chrome versions on your machine if you've not done so already.
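You can confirm both from the terminal. The Chrome check below assumes a Linux install; on macOS or Windows, check the browser's About page instead:

python3 --version
google-chrome --version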
To install CF-Clearance-Scraper, clone its repository with git:
git clone https://github.com/Xewdy444/CF-Clearance-Scraper
Move into the CF-Clearance-Scraper folder and install the requirements using pip:
cd CF-Clearance-Scraper
pip3 install -r requirements.txt
Let's now see how the tool works.
Step 2: Understanding CF-Clearance-Scraper Parameters
Running CF-Clearance-Scraper requires executing the main.py file with a compulsory URL parameter and other optional ones.
We've explained the main parameters below:
| Parameter | Role |
| --- | --- |
| -f | The output JSON file name to write the scraped cookies to |
| -t | The request timeout (in seconds) for retrieving the cookies |
| -p | The proxy URL |
| -ua | The User Agent header used in the cf_clearance scraping request |
| --disable-http2 | Disables the HTTP/2 protocol |
| --disable-http3 | Disables the HTTP/3 protocol |
| -ac | Saves all cookies in addition to cf_clearance |
| URL | The URL of the Cloudflare-protected site (compulsory) |
CF-Clearance-Scraper works better with a User Agent and a proxy. So, in this tutorial, we'll use it with a Chrome User Agent and a proxy.
Here's what the CF-Clearance-Scraper command structure looks like:
python main.py -p <PROXY_ADDRESS> -t <TIMEOUT_IN_SECONDS> -ua "<USER_AGENT_STRING>" -f <COOKIE_FILE.json> <TARGET_URL>
Let's see this command in action in the next step.
Step 3: Scraping the cf_clearance Cookie
You'll scrape the cf_clearance cookie from the Cloudflare Challenge page using the User Agent, proxy, and timeout (60 seconds) parameters. You'll then write the scraped cookies to a cookies.json file. To achieve this, run the main.py file with these parameters as shown:
python main.py -p http://190.58.248.86:80 -t 60 -ua "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/136.0.0.0 Safari/537.36" -f cookies.json https://www.scrapingcourse.com/cloudflare-challenge
We've used a free proxy from Free Proxy List, which may no longer work by the time you read this article. Grab a fresh one from the site, or better still, use a premium web scraping proxy for the best results.
The above command returns the cf_clearance cookie and writes all scraped cookies to a cookies.json file within the CF-Clearance-Scraper project folder:
# ...
[12:40:42] [INFO] Cookie: cf_clearance=KkssR4xQ9xEJwlNtUXQEKkoQl...lgI5
Since you'll use this cookie in a scraping request, it's best to run the above command via a Python subprocess. You can then retrieve the cf_clearance string from the terminal logs using regex.
Create and open a new scraper.py file inside the CF-Clearance-Scraper folder. Import the subprocess and re libraries and create a cf_clearance_scraper function that accepts the URL, proxy, and User Agent parameters. Build the command and its parameters as a list, run it with subprocess, and extract the cf_clearance value from the logs using regex:
import subprocess
import re


def cf_clearance_scraper(url, proxy, user_agent):
    # specify the command and pass the parameters
    command = [
        "python",
        "main.py",
        "-p",
        proxy,
        "-t",
        "60",
        "-ua",
        user_agent,
        "-f",
        "cookies.json",
        url,
    ]

    try:
        # run the command and capture output
        process = subprocess.run(
            command,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            text=True,
        )
        output = process.stdout

        # extract the cf_clearance value from the logs with regex
        match = re.search(r"cf_clearance=([^\s]+)", output)
        if match:
            return match.group(1)
        return None
    except Exception as e:
        print(f"Error running the command: {e}")
        return None
Specify the target URL, User Agent, and proxy address. Then, pass these values to the cf_clearance_scraper function to execute it:
# ...
# define the target URL, User Agent, and proxy address
target_url = "https://www.scrapingcourse.com/cloudflare-challenge"
user_agent = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/136.0.0.0 Safari/537.36"
proxy = "http://190.58.248.86:80"
# get the cf_clearance cookie
cf_clearance = cf_clearance_scraper(target_url, proxy, user_agent)
print(cf_clearance)
Combine both snippets at this point, and you'll get the following full code:
import subprocess
import re


def cf_clearance_scraper(url, proxy, user_agent):
    # specify the command and pass the parameters
    command = [
        "python",
        "main.py",
        "-p",
        proxy,
        "-t",
        "60",
        "-ua",
        user_agent,
        "-f",
        "cookies.json",
        url,
    ]

    try:
        # run the command and capture output
        process = subprocess.run(
            command,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            text=True,
        )
        output = process.stdout

        # extract the cf_clearance value from the logs with regex
        match = re.search(r"cf_clearance=([^\s]+)", output)
        if match:
            return match.group(1)
        return None
    except Exception as e:
        print(f"Error running the command: {e}")
        return None


# define the target URL, User Agent, and proxy address
target_url = "https://www.scrapingcourse.com/cloudflare-challenge"
user_agent = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/136.0.0.0 Safari/537.36"
proxy = "http://190.58.248.86:80"

# get the cf_clearance cookie
cf_clearance = cf_clearance_scraper(target_url, proxy, user_agent)
print(cf_clearance)
The above code writes the cookies to a cookies.json file and outputs the cf_clearance cookie:
MKybX880PCu.GfWLhonkBnG64WBs4ASAXeZ...Tux0eDI
Let's use that cf_clearance cookie to bypass Cloudflare.
Step 4: Using the cf_clearance Cookie in Your Scraper
Now that your scraper extracts the cf_clearance cookie, you'll use it to bypass Cloudflare by persisting it throughout a single scraping session. Let's extend the previous code to achieve this.
Add the Requests library to your imports. You don't need to install it separately because it was installed with the CF-Clearance-Scraper requirements.
Create a scraper function that accepts the URL, cf_clearance, proxy, and User Agent parameters. Instantiate a request session and configure it with the cf_clearance cookie, User Agent, and proxy. Then, request the target URL with the session object:
# ...
import requests

# ...


def scraper(url, cf_clearance, proxy, user_agent):
    # create a persistent session
    session = requests.Session()

    # set the cf_clearance cookie in the session
    session.cookies.set("cf_clearance", cf_clearance)

    # set the same User-Agent header
    session.headers.update({"User-Agent": user_agent})

    # set the proxy for the session
    session.proxies.update(
        {
            "http": proxy,
            "https": proxy,
        }
    )

    # request the target URL
    try:
        response = session.get(url)
        response.raise_for_status()
        return response.text
    except requests.exceptions.RequestException as e:
        return f"Error making request: {e}"
Next, execute the scraper function only if cf_clearance exists. Add this below the cf_clearance_scraper function execution:
# ...

if cf_clearance:
    # use the cf_clearance cookie in a request
    response = scraper(target_url, cf_clearance, proxy, user_agent)
    print(response)
else:
    print("Failed to retrieve cf_clearance cookie. Exiting.")
Combine the snippets in this section with the previous ones. Here's the complete code:
import subprocess
import re

import requests


def cf_clearance_scraper(url, proxy, user_agent):
    # specify the command and pass the parameters
    command = [
        "python",
        "main.py",
        "-p",
        proxy,
        "-t",
        "60",
        "-ua",
        user_agent,
        "-f",
        "cookies.json",
        url,
    ]

    try:
        # run the command and capture output
        process = subprocess.run(
            command,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            text=True,
        )
        output = process.stdout

        # regex to extract the cf_clearance value
        match = re.search(r"cf_clearance=([^\s]+)", output)
        if match:
            return match.group(1)
        return None
    except Exception as e:
        print(f"Error running the command: {e}")
        return None


def scraper(url, cf_clearance, proxy, user_agent):
    # create a persistent session
    session = requests.Session()

    # set the cf_clearance cookie in the session
    session.cookies.set("cf_clearance", cf_clearance)

    # set the same User-Agent header
    session.headers.update({"User-Agent": user_agent})

    # set the proxy for the session
    session.proxies.update(
        {
            "http": proxy,
            "https": proxy,
        }
    )

    # request the target URL
    try:
        response = session.get(url)
        response.raise_for_status()
        return response.text
    except requests.exceptions.RequestException as e:
        return f"Error making request: {e}"


# define the target URL, User Agent, and proxy address
target_url = "https://www.scrapingcourse.com/cloudflare-challenge"
user_agent = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/136.0.0.0 Safari/537.36"
proxy = "http://190.58.248.86:80"

# get the cf_clearance cookie
cf_clearance = cf_clearance_scraper(target_url, proxy, user_agent)

if cf_clearance:
    # use the cf_clearance cookie in a request
    response = scraper(target_url, cf_clearance, proxy, user_agent)
    print(response)
else:
    print("Failed to retrieve cf_clearance cookie. Exiting.")
The above scraper only bypasses Cloudflare occasionally; you might need to run it several times to get a successful result. For a higher success rate, we recommend a rotating proxy service that supports sticky sessions, which we'll discuss in more detail below.
A successful execution outputs the protected site's full-page HTML:
<html lang="en">
<head>
    <!-- ... -->
    <title>Cloudflare Challenge - ScrapingCourse.com</title>
    <!-- ... -->
</head>
<body>
    <!-- ... -->
    <h2>You bypassed the Cloudflare challenge! :D</h2>
    <!-- other content omitted for brevity -->
</body>
</html>
To maintain the session's integrity and increase the chance of success, ensure you retain the User Agent and proxy parameters you used to obtain the cf_clearance cookie.
If using a rotating proxy, opt for a service that supports sticky sessions, like the ZenRows Residential Proxies. This feature allows you to pause proxy rotation and maintain a single IP for a specified period. Set the sticky period within a reasonable limit to persist the IP throughout the request session.
For instance, with a ZenRows sticky session (ttl) of 1 minute, your proxy URL becomes:
proxy = "http://<ZENROWS_PROXY_USERNAME>:<ZENROWS_PROXY_PASSWORD>_ttl-1m_session-YPMlnbAybl5q@superproxy.zenrows.com:1337"
For more information, check ZenRows' sticky session documentation.
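Plugging the sticky-session proxy into the earlier script is then a one-line swap. This sketch assumes you keep the same User Agent and reuses the cf_clearance_scraper and scraper functions defined above (the credentials are placeholders):

# ...
# swap the free proxy for a sticky-session proxy (placeholder credentials)
proxy = "http://<ZENROWS_PROXY_USERNAME>:<ZENROWS_PROXY_PASSWORD>_ttl-1m_session-YPMlnbAybl5q@superproxy.zenrows.com:1337"

# the same exit IP now persists for the 1-minute ttl window
cf_clearance = cf_clearance_scraper(target_url, proxy, user_agent)
if cf_clearance:
    print(scraper(target_url, cf_clearance, proxy, user_agent))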
While CF-Clearance-Scraper can help you bypass Cloudflare, its low success rate makes it unreliable. Let's look at a solution that works every time.
Avoid Getting Blocked
Despite being a useful tool for bypassing Cloudflare, CF-Clearance-Scraper has some limitations that make it unreliable. First, the cf_clearance cookie is IP-bound, so it becomes invalid if the IP used to solve the challenge changes.
The tool also relies on a Chrome instance under the hood, which consumes a lot of memory and makes it unsuitable for large-scale scraping.
Another limitation of CF-Clearance-Scraper is that the cf_clearance cookie can expire mid-scrape, resulting in broken executions, incomplete data, or compromised data pipelines. Besides, Cloudflare consistently updates its security measures to clamp down on open-source solutions like CF-Clearance-Scraper, which can render the tool obsolete in the long run.
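As a stopgap for mid-scrape expiry, you can retry with a fresh cookie whenever a request fails. Here's a minimal sketch that reuses the cf_clearance_scraper and scraper functions from the full script above; the failure check relies on scraper returning a string that starts with "Error making request" when a request fails, and the scrape_with_retries helper is ours, not part of the tool:

# stopgap sketch: refresh the cookie and retry when a request fails
def scrape_with_retries(url, proxy, user_agent, max_attempts=3):
    for attempt in range(1, max_attempts + 1):
        # obtain a fresh cf_clearance cookie for this attempt
        cookie = cf_clearance_scraper(url, proxy, user_agent)
        if cookie is None:
            print(f"Attempt {attempt}: failed to obtain cf_clearance")
            continue
        result = scraper(url, cookie, proxy, user_agent)
        # scraper returns an error string when the request fails
        if not result.startswith("Error making request"):
            return result
        print(f"Attempt {attempt}: request failed, refreshing the cookie")
    return None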
The best way to scrape any website reliably without getting blocked is to use a web scraping solution like the ZenRows Universal Scraper API. ZenRows has a 99.93% success rate, ensuring you get all the data you need at scale without limitations.
With ZenRows, you no longer need to manage the cf_clearance cookie or worry about expiring sessions. It consistently adapts to evolving Cloudflare security measures with zero effort from you. That way, you can focus on your core business logic rather than wasting time and resources fixing missing or broken data pipelines.
Let's see how ZenRows' Universal Scraper API works by scraping the previous Cloudflare Challenge page.
Sign up and go to the Request Builder. Then, paste the target URL in the link box and activate Premium Proxies and JS Rendering.

Select Python as your programming language and choose the API connection mode. Copy the generated code and paste it into your scraper.
The generated Python code should look like this:
# pip3 install requests
import requests

url = "https://www.scrapingcourse.com/cloudflare-challenge"
apikey = "<YOUR_ZENROWS_API_KEY>"
params = {
    "url": url,
    "apikey": apikey,
    "js_render": "true",
    "premium_proxy": "true",
}
response = requests.get("https://api.zenrows.com/v1/", params=params)
print(response.text)
The above code bypasses the anti-bot challenge and outputs the protected site's full-page HTML, as shown:
<html lang="en">
<head>
    <!-- ... -->
    <title>Cloudflare Challenge - ScrapingCourse.com</title>
    <!-- ... -->
</head>
<body>
    <!-- ... -->
    <h2>You bypassed the Cloudflare challenge! :D</h2>
    <!-- other content omitted for brevity -->
</body>
</html>
Congratulations! 🎉 Your scraper now bypasses Cloudflare consistently. No more intermittent failures and missing data.
Conclusion
You've learned to extract and use the cf_clearance cookie to bypass Cloudflare during scraping. You used the CF-Clearance-Scraper to scrape the cookie from a Cloudflare-protected site and deployed it as an access key on the same site.
That said, CF-Clearance-Scraper still fails frequently due to its limitations. To avoid unpredictable scraping outcomes and extract data confidently, we recommend ZenRows, an all-in-one scraping solution for projects of any scale.