How to Bypass Cloudflare with cURL in 2024

February 20, 2023 · 10 min read

One in five websites uses some form of Cloudflare protection, meaning there's a high chance you'll be blocked when attempting to scrape a website. But is there any solution? In this tutorial, you'll learn how to bypass Cloudflare with cURL. We'll discuss how pure cURL scrapers work and the tweaks you can make to get the data you want.

Ready? Let's dive in!

What Is Cloudflare?

Cloudflare is a company that provides some of the most popular web performance and security services. For us, the problem is its Web Application Firewall (WAF), which detects and blocks bots by default to mitigate malicious attacks.

Google and other search engines are on an allowlist, but cURL web scrapers aren't. So, regardless of your intent, a Cloudflare-protected website will likely flag your scraper as a malicious bot and block you.

How Do I Bypass Cloudflare in cURL?

A cURL connection has distinctive properties that differ from a real browser. So, when you send such a request, the system easily identifies you and denies you access.

Here's what you can try:

Ideally, all you'd need to do is mimic a legitimate browser: randomizing your static HTTP headers or copying those of a real browser should grant you access. However, more than that is often needed in practice.

You'll also need to imitate natural user behavior, and here's where it gets tricky: defining that behavior from a request-based tool like cURL is challenging. Still, cURL-impersonate provides special cURL builds that imitate a real browser's TLS and HTTP/2 handshakes (TLS and HTTP/2 fingerprinting are two passive bot detection techniques in Cloudflare's arsenal).

Let's get down to code.

Method #1: Base cURL

Let's run through a quick scraping example. We'll use cURL to target CoinTracker, a cryptocurrency tracking platform under Cloudflare's anti-bot protection.


We start by sending a request to our target website.

Example
curl https://www.cointracker.io/

As expected, that didn't work.

Our cURL-based scraper returned raw HTML content containing the Cloudflare waiting room message: "Checking if the site connection is secure."

Output
<!DOCTYPE html>
<html lang="en-US">
<head>
    <title>Just a moment...</title>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
    <meta http-equiv="X-UA-Compatible" content="IE=Edge">
    <meta name="robots" content="noindex,nofollow">
    <meta name="viewport" content="width=device-width,initial-scale=1">
    <link href="/cdn-cgi/styles/challenges.css" rel="stylesheet">
 
 
</head>
<body class="no-js">
    <div class="main-wrapper" role="main">
    <div class="main-content">
        <h1 class="zone-name-title h1">
            <img class="heading-favicon" src="/favicon.ico"
                 onerror="this.onerror=null;this.parentNode.removeChild(this)">
            www.cointracker.io
        </h1>
        <h2 class="h2" id="challenge-running">
            Checking if the site connection is secure
        </h2>
        <noscript>
            <div id="challenge-error-title">
                <div class="h2">
                    <span class="icon-wrapper">
                        <div class="heading-icon warning-icon"></div>
                    </span>
                    <span id="challenge-error-text">
                        Enable JavaScript and cookies to continue
                    </span>
                </div>
            </div>
        </noscript>
        <div id="trk_jschal_js" style="display:none;background-image:url('/cdn-cgi/images/trace/managed/nojs/transparent.gif?ray=78a3c8ab2bbb0eac')"></div>
        <div id="challenge-body-text" class="core-msg spacer">
            www.cointracker.io needs to review the security of your connection before proceeding.
        </div>

The system detected us as a bot. You can learn more about the waiting room in our Cloudflare bypass guide.
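
When scripting requests like the one above, it helps to detect the block automatically instead of reading the HTML by eye. Here's a minimal sketch, assuming the challenge page keeps the markers visible in the output above (the title, the `challenge-running` id, and the challenge stylesheet path). The function name `is_cloudflare_challenge` is our own, and the markers are only a heuristic that may change whenever Cloudflare updates its page.

```shell
# Return 0 (true) if the HTML read from stdin looks like a Cloudflare
# challenge page. The markers come from the blocked response shown above;
# they are a heuristic, not an official Cloudflare interface.
is_cloudflare_challenge() {
  grep -qF \
    -e '<title>Just a moment...</title>' \
    -e 'id="challenge-running"' \
    -e '/cdn-cgi/styles/challenges.css'
}

# Usage (requires network access):
#   if curl -s https://www.cointracker.io/ | is_cloudflare_challenge; then
#     echo "Blocked by Cloudflare"
#   else
#     echo "Got the real page"
#   fi
```

Matching fixed strings with `grep -F` keeps the check simple and avoids regex surprises, such as the dots in "Just a moment...".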

Let's try using HTTP headers next.

Method #2: HTTP Headers

Cloudflare identified our scraper as a bot partly through cURL's default headers. However, this could also work in our favor: by replacing them with the headers a real browser sends, we can get closer to natural user behavior.

We'll start by viewing the current ones. We can do so by sending a request to httpbin, a service that echoes back the headers of the request it receives.

Example
curl http://httpbin.org/headers

Here's the result:

Output
{
  "headers": {
    "Accept": "*/*",
    "Host": "httpbin.org",
    "User-Agent": "curl/7.83.1",
    "X-Amzn-Trace-Id": "Root=1-63c80ee4-6374a4877ed23d752c571880"
  }
}

Now, look at what we get when we open the httpbin site in our browser:

Output
{
  "headers": {
	"Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9",
	"Accept-Encoding": "gzip, deflate",
	"Accept-Language": "en-US,en;q=0.9",
	"Host": "httpbin.org",
	"Upgrade-Insecure-Requests": "1",
	"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.0.0 Safari/537.36",
	"X-Amzn-Trace-Id": "Root=1-63c80f39-673b7c0273142fde3e5eaa06"
  }
}

As you can see, the cURL headers are entirely different from the browser ones. Thus, it's easy for Cloudflare to identify and block us.
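
To make that comparison concrete, here's a small sketch that lists which header names the browser sends but cURL doesn't. It assumes both httpbin responses have been saved into shell variables, with one header per line as in the outputs above. The function names `header_names` and `missing_headers` are hypothetical helpers for this illustration.

```shell
# Print the header names found in an httpbin-style JSON response (stdin),
# one per line, sorted. Assumes one  "Name": "value"  pair per line.
header_names() {
  sed -n 's/^[[:space:]]*"\([^"]*\)": ".*/\1/p' | sort -u
}

# Print header names present in the second response ($2)
# but missing from the first one ($1).
missing_headers() (
  first=$(printf '%s\n' "$1" | header_names)
  # -x matches whole lines only, -F treats the names as fixed strings,
  # -v keeps the names that are NOT in the first response.
  printf '%s\n' "$2" | header_names | grep -vxF -e "$first"
)

# Usage (requires network access for the first line):
#   curl_json=$(curl -s http://httpbin.org/headers)
#   browser_json=$(cat browser-headers.json)   # saved from the browser
#   missing_headers "$curl_json" "$browser_json"
```

Run against the two outputs above, this should report `Accept-Encoding`, `Accept-Language`, and `Upgrade-Insecure-Requests` as the headers cURL leaves out by default.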

Let's try using our browser headers in our cURL request:

Example
# --compressed makes cURL decompress the gzip/deflate response we ask for.
curl 'https://www.cointracker.io/' \
  --compressed \
  -H "Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9" \
  -H "Accept-Encoding: gzip, deflate" \
  -H "Accept-Language: en-US,en;q=0.9" \
  -H "User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.0.0 Safari/537.36"

And here's the outcome:

Output
<body class="no-js">
    <div class="main-wrapper" role="main">
    <div class="main-content">
        <h1 class="zone-name-title h1">
            <img class="heading-favicon" src="/favicon.ico"
                 onerror="this.onerror=null;this.parentNode.removeChild(this)">
            www.cointracker.io
        </h1>
        <h2 class="h2" id="challenge-running">
            Checking if the site connection is secure
        </h2>
        <!-- ... -->

Ouch! 😢 Still not working.

This shows that copying browser headers alone isn't enough to bypass Cloudflare in cURL.

However, there's something else we can do.

Method #3: Cookies

When a browser interacts with a website, the server sends cookies back to it. In a later request, the browser includes those in its headers to send back to the server. This way, they recognize each other.
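
The cookie round trip described above can be sketched with cURL's own cookie jar options: `-c` saves the cookies a server sets, and `-b` sends them back. This is only an illustration, with `fetch_with_cookies` and the default file name `cookies.txt` chosen for this example. It replays cookies set in plain HTTP responses and doesn't run any JavaScript, so on its own it's unlikely to pass a challenge.

```shell
# Fetch a URL twice with a shared cookie jar: the first request stores any
# cookies the server sets, the second one sends them back.
# $1 = URL, $2 = cookie jar file (defaults to cookies.txt, an example name).
fetch_with_cookies() {
  url=$1
  jar=${2:-cookies.txt}
  # -c writes received cookies to the jar; -b reads cookies from it.
  curl -s -o /dev/null -c "$jar" "$url" &&
    curl -s -b "$jar" -c "$jar" "$url"
}

# Usage (requires network access):
#   fetch_with_cookies "https://www.cointracker.io/" cookies.txt
```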

In a new scraping attempt, we'll include the target website's cookies in our cURL request. Remember, the idea is to mimic natural user behavior.

Visit CoinTracker in an actual browser, open the DevTools' network tab, and refresh the page.


There, we can see the request responsible for the page we're trying to get. Clicking on the URL opens the headers tab, where we can find the request headers section.


Then, we right-click the request and copy it as a cURL command (the "Copy as cURL" option), so our request looks like this:

Example
curl 'https://www.cointracker.io/' \
  -H 'Referer: https://www.google.com/' \
  -H 'Upgrade-Insecure-Requests: 1' \
  -H 'User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.0.0 Safari/537.36' \
  -H 'sec-ch-ua: "Not_A Brand";v="99", "Google Chrome";v="109", "Chromium";v="109"' \
  -H 'sec-ch-ua-mobile: ?0' \
  -H 'sec-ch-ua-platform: "Windows"'

What do you think? Will we be successful this time?

Output
<!DOCTYPE html>
<html lang="en-US">
<head>
    <title>Just a moment...</title>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
    <meta http-equiv="X-UA-Compatible" content="IE=Edge">
    <meta name="robots" content="noindex,nofollow">
    <meta name="viewport" content="width=device-width,initial-scale=1">
    <link href="/cdn-cgi/styles/challenges.css" rel="stylesheet">
</head>
<body class="no-js">
    <div class="main-wrapper" role="main">
    <div class="main-content">
        <h1 class="zone-name-title h1">
            <img class="heading-favicon" src="/favicon.ico"
                 onerror="this.onerror=null;this.parentNode.removeChild(this)">
            www.cointracker.io
        </h1>
        <h2 class="h2" id="challenge-running">
            Checking if the site connection is secure
        </h2>
        <!-- ... -->

Once again, we didn't succeed in bypassing Cloudflare with cURL and were led to the waiting room. 😢

This happens because the firewall uses multiple anti-bot techniques, and relying just on request cookies isn't enough in most cases.

Method #4: cURL-impersonate to Simulate a Browser

We'll use cURL-impersonate to imitate a real browser this time. Start by downloading and installing it.

Now, let's try to access our target website while impersonating Chrome 104. In your terminal, go to the folder containing the package. Then, send the following request:

Example
curl_chrome104 --url https://www.cointracker.io/

Our result? See below!

Output
<p class="lead text-secondary">
CoinTracker generates your crypto tax forms in minutes with industry-leading accuracy.
</p>
</div>
<div class="d-xl-block d-none">
<p class="lead text-secondary text-left mb-4 w-100 col-10 px-0">
CoinTracker generates your crypto tax forms in minutes with industry-leading accuracy.
</p>
<p class="lead text-secondary text-xl-left">
<img src="https://s3-us-west-1.amazonaws.com/coin-tracker-public/static/images/sprites/check.svg" loading="lazy">
Connect 500+ wallets and exchanges instantly<br>
</p>
<p class="lead text-secondary text-xl-left">
<img src="https://s3-us-west-1.amazonaws.com/coin-tracker-public/static/images/sprites/check.svg" loading="lazy">
Best-in-class security<br>
</p>
<p class="lead text-secondary text-xl-left">
<img src="https://s3-us-west-1.amazonaws.com/coin-tracker-public/static/images/sprites/check.svg" loading="lazy">
One-click sharing with your accountant<br>
</p>
<p class="lead text-secondary text-xl-left">
<img src="https://s3-us-west-1.amazonaws.com/coin-tracker-public/static/images/sprites/check.svg" loading="lazy">
Trusted by 1M+ users<br>
</p>
<!-- ... -->

Boom! Congratulations, you've done your first Cloudflare bypass in cURL!

However, not to spoil the celebration, this won't work on the many websites that use Cloudflare's most advanced security level, so cURL-impersonate isn't a reliable solution on its own.

Let's prove that with an example. We'll try to access a G2 product page using the same steps.


You can probably guess what happened: we got blocked!

Output
<body class="no-js">
    <div class="main-wrapper" role="main">
    <div class="main-content">
        <h1 class="zone-name-title h1">
            <img class="heading-favicon" src="/favicon.ico"
                 onerror="this.onerror=null;this.parentNode.removeChild(this)">
            www.g2.com
        </h1>
        <h2 class="h2" id="challenge-running">
            Checking if the site connection is secure
        </h2>
        <noscript>
            <div id="challenge-error-title">
                <div class="h2">
                    <span class="icon-wrapper">
                        <div class="heading-icon warning-icon"></div>
                    </span>
                    <span id="challenge-error-text">
                        Enable JavaScript and cookies to continue
                    </span>
                </div>
            </div>
        </noscript>
        <div id="trk_jschal_js" style="display:none;background-image:url('/cdn-cgi/images/trace/managed/nojs/transparent.gif?ray=78beef326be4d424')"></div>
        <div id="challenge-body-text" class="core-msg spacer">
            www.g2.com needs to review the security of your connection before proceeding.
        </div>

We know this is disappointing, but we have great news: there's light at the end of the tunnel!

When nothing works, it's probably a good idea to get some help from professionals. Next, we'll see a solution that makes bypassing any level of Cloudflare protection while cURL scraping a piece of cake.

Method #5: ZenRows to Bypass Cloudflare in cURL

ZenRows is a new-generation web scraping API that helps you retrieve data from essentially any website (yes, Cloudflare-protected ones included). Let's see it in action against the heavily protected G2 page.

First, create an account on ZenRows to get your free API key and access the Request Builder page. There, paste your target URL: <https://www.g2.com/products/asana/reviews>.


You'll see this command line:

Terminal
curl -k "https://www.g2.com/products/asana/reviews" \
	-L -x "http://YOUR_API_KEY:@proxy.zenrows.com:8001" 

To bypass Cloudflare in cURL, simply check the Premium Proxy and Antibot boxes. This adds the antibot=true, premium_proxy=true, and proxy_country parameters to your request.

Additionally, we'll add --output g2page.html to save the result in a file.

Example
curl -k "https://www.g2.com/products/asana/reviews" \
    -L -x "http://YOUR_API_KEY:antibot=true&premium_proxy=true&proxy_country=YOUR_COUNTRY_CODE@proxy.zenrows.com:8001" \
    --output g2page.html

That's it! Here comes the ZenRows magic: the scraper returns the web page content in no time.


Finally! Executing Cloudflare bypass in cURL has never been easier.
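
If you call the proxy from a script, the proxy URL can be assembled by a small helper rather than edited by hand. This is a sketch that only mirrors the `http://API_KEY:PARAMS@proxy.zenrows.com:8001` shape of the commands above. The function name `zenrows_proxy_url` and the `ZENROWS_API_KEY` environment variable are assumptions made for this example, not part of ZenRows itself.

```shell
# Build a proxy URL in the same shape as the commands above:
#   http://API_KEY:PARAMS@proxy.zenrows.com:8001
# $1 = API key, $2 = parameter string (may be empty).
zenrows_proxy_url() {
  api_key=$1
  params=$2   # e.g. "antibot=true&premium_proxy=true"
  printf 'http://%s:%s@proxy.zenrows.com:8001\n' "$api_key" "$params"
}

# Usage (requires network access and a real API key in ZENROWS_API_KEY):
#   curl -k -L \
#     -x "$(zenrows_proxy_url "$ZENROWS_API_KEY" "antibot=true&premium_proxy=true")" \
#     "https://www.g2.com/products/asana/reviews" --output g2page.html
```

Keeping the key in an environment variable also keeps it out of your shell history and scripts.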

Conclusion

Bypassing Cloudflare has become a key part of many data extraction projects, including those using cURL. But pure cURL isn't enough, tweaks like custom HTTP headers and cookies don't get past the challenge, and even cURL-impersonate falls short.

The most reliable solution proved to be ZenRows, which helps you get around any anti-bot protection. Get your free API key now and try it yourself.
