Are you getting blocked by Cloudflare while scraping with C#? Cloudflare protection is a common blocker during web scraping. But no worries. At least one of the methods in this article will help you bypass it.
This article shows you two tested ways to bypass Cloudflare in C#. We'll run each one against the Cloudflare Challenge page to show how it performs.
Ready to bypass Cloudflare? Let's go!
Can Cloudflare Detect C# Scrapers?
Yes, Cloudflare can detect and block C# scrapers. Here's why.
Cloudflare is a content delivery and web security service and one of the internet's most popular web application firewalls (WAFs). Its data center network acts as a reverse proxy for websites. So, when you visit a protected website, your request routes through Cloudflare's network before reaching the origin server.
This routing system allows Cloudflare to vet requests for bot-like activities, including web scraping. Unfortunately, Cloudflare frequently updates its security measures, making them difficult for C# scraping tools to bypass.
For example, a protected site like the Cloudflare Challenge page will block a basic C# HTTP client. Try it with the code below using the RestSharp package:
// dotnet add package RestSharp
using System;
using RestSharp;

namespace SimpleWebScraper
{
    class Scraper
    {
        static void Main(string[] args)
        {
            // create a new RestClient instance with the given URL
            var client = new RestClient("https://www.scrapingcourse.com/cloudflare-challenge");

            // create a new RestRequest instance
            var request = new RestRequest();

            // send the request and get the response
            var response = client.Get(request);

            // print the response content to the console
            Console.WriteLine(response.Content);
        }
    }
}
The scraper gets blocked with a Cloudflare 403 Forbidden error:
Request failed with status code Forbidden
You'll often encounter the above error when accessing Cloudflare-protected sites with a C# HTTP client. Cloudflare's bot detection combines active and passive techniques, on both the server and the client side, to make a website inaccessible to automated scripts.
Its common detection measures include browser fingerprinting, behavioral analysis, request header analysis, IP reputation, and machine learning.
A few Bot Management errors you might run into while trying to scrape a Cloudflare-protected web page in C# include:
- Error 1010.
- Error 1012.
- Error 1020.
These errors often result in a 403 Forbidden status code during scraping. However, you can bypass Cloudflare in C# using the right approach. Let's look at two ways to do it.
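Before picking a bypass method, it helps to recognize a Cloudflare block in code. The following is a minimal sketch, not an official API: the `CloudflareBlockDetector` class and its `LooksBlocked` method are hypothetical names. It assumes that Cloudflare challenge responses carry a `cf-mitigated: challenge` header and that Cloudflare-served responses report `Server: cloudflare`. It doesn't parse the specific error codes listed above.

```csharp
using System;
using System.Collections.Generic;
using System.Net;

public static class CloudflareBlockDetector
{
    // Hypothetical helper: returns true when a response looks like a Cloudflare block.
    // Assumption: the caller passes the response headers as simple name/value pairs.
    public static bool LooksBlocked(HttpStatusCode status, IDictionary<string, string> headers)
    {
        // copy the headers into a case-insensitive dictionary, since header names vary in casing
        var lookup = new Dictionary<string, string>(headers, StringComparer.OrdinalIgnoreCase);

        // assumption: a challenge page response is marked with "cf-mitigated: challenge"
        if (lookup.TryGetValue("cf-mitigated", out var mitigated) &&
            string.Equals(mitigated, "challenge", StringComparison.OrdinalIgnoreCase))
        {
            return true;
        }

        // heuristic fallback: a 403 Forbidden served by Cloudflare
        bool servedByCloudflare =
            lookup.TryGetValue("server", out var server) &&
            string.Equals(server, "cloudflare", StringComparison.OrdinalIgnoreCase);

        return status == HttpStatusCode.Forbidden && servedByCloudflare;
    }
}
```

You could call `CloudflareBlockDetector.LooksBlocked(...)` with the status code and headers from any HTTP client response and log or retry when it returns true. A 403 from a non-Cloudflare origin is deliberately not treated as a block.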
1. ZenRows
The easiest and most effective way to evade Cloudflare in C# is to use ZenRows, a web scraping API that bypasses all major anti-bots at scale. With a single API call, ZenRows helps you handle fingerprinting, proxy management, JavaScript rendering, anti-bot and CAPTCHA auto-bypass, and more.
It also has headless browser features for interacting with dynamic web pages and is compatible with any programming language.
👍 Pros
- Easy to use.
- Compatible with any programming language.
- Bypasses anti-bot measures at scale.
- Headless browsing feature.
- Screenshot support.
- It requires only a single API call.
- It's the best solution to scrape data at scale with little effort.
- It offers integration with scraping libraries and workflow tools like Make.
- An intuitive request dashboard to monitor requests and usage statistics.
👎 Cons
- It's a paid solution but offers a free trial and only charges for successful requests.
How to Bypass Cloudflare in C# Using ZenRows
To use ZenRows with C#, we'll bypass the Cloudflare protection on the previous website that blocked us (Cloudflare Challenge page).
To start, sign up to open the ZenRows Request Builder. Paste the target URL in the link box and activate Premium Proxies and JS Rendering. Select C# as your programming language and choose the API connection mode. Copy and paste the generated code into your C# scraper.
The generated code should look like this:
// dotnet add package RestSharp
using RestSharp;

namespace TestApplication
{
    class Test
    {
        static void Main(string[] args)
        {
            var client = new RestClient("https://api.zenrows.com/v1/?apikey=<YOUR_ZENROWS_API_KEY>&url=https%3A%2F%2Fwww.scrapingcourse.com%2Fcloudflare-challenge&js_render=true&premium_proxy=true");
            var request = new RestRequest();
            var response = client.Get(request);
            Console.WriteLine(response.Content);
        }
    }
}
The above code returns the protected website's full-page HTML, proving that our scraper got past the Cloudflare firewall:
<html lang="en">
<head>
    <!-- ... -->
    <title>Cloudflare Challenge - ScrapingCourse.com</title>
    <!-- ... -->
</head>
<body>
    <!-- ... -->
    <h2>
        You bypassed the Cloudflare challenge! :D
    </h2>
    <!-- other content omitted for brevity -->
</body>
</html>
Congratulations 🎉! Your C# scraper now bypasses Cloudflare using ZenRows.
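If you'd rather assemble the request URL in code than paste it from the Request Builder, the sketch below shows one way to do it. The `ZenRowsUrlBuilder` class and its `Build` method are hypothetical helpers, not part of any ZenRows SDK. They only use the endpoint and the `apikey`, `url`, `js_render`, and `premium_proxy` parameters that appear in the generated code above, and they rely on `Uri.EscapeDataString` to percent-encode the target URL.

```csharp
using System;

public static class ZenRowsUrlBuilder
{
    // endpoint taken from the generated code above
    private const string Endpoint = "https://api.zenrows.com/v1/";

    // Hypothetical helper: builds the API request URL for a given target page.
    public static string Build(string apiKey, string targetUrl, bool jsRender = true, bool premiumProxy = true)
    {
        // percent-encode the key and the target URL so they are safe inside a query string
        var query = "apikey=" + Uri.EscapeDataString(apiKey) +
                    "&url=" + Uri.EscapeDataString(targetUrl);

        // append the optional features only when they are enabled
        if (jsRender) query += "&js_render=true";
        if (premiumProxy) query += "&premium_proxy=true";

        return Endpoint + "?" + query;
    }
}
```

Passing the result of `ZenRowsUrlBuilder.Build("<YOUR_ZENROWS_API_KEY>", "https://www.scrapingcourse.com/cloudflare-challenge")` to `new RestClient(...)` should produce a request equivalent to the generated one. Note that the angle brackets of the placeholder key get percent-encoded too, so replace it with your real key.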
2. Puppeteer-Sharp
Puppeteer-Sharp is another tool used in C# to bypass Cloudflare. It's a .NET port of the Puppeteer headless Chrome API that targets .NET Standard 2.0, so the minimum platform versions are .NET Framework 4.6.1 and .NET Core 2.0.
As a browser automation tool, Puppeteer-Sharp lets you mimic real users' behavior and can increase your chance of bypassing simpler Cloudflare protections. However, Puppeteer-Sharp leaks bot-like attributes, such as the WebDriver property (navigator.webdriver), which expose your scraper as a bot. So, it often fails against Cloudflare-protected sites.
👍 Pros
- Headless browsing functionality.
- Suitable for bypassing low protections.
- Screenshot support.
- Suitable for scraping dynamic pages.
👎 Cons
- It easily gets blocked by Cloudflare.
- It's memory inefficient.
- Unsuitable for large-scale web scraping.
How to Bypass Cloudflare in C# Using Puppeteer-Sharp
To see how Puppeteer-Sharp works, we'll try it with the previous protected page (the Cloudflare Challenge page) and take a screenshot.
First, install the package using the dotnet CLI:
dotnet add package PuppeteerSharp --version 20.0.0
Now, create a new PuppeteerScraper class in your project folder and write the following code in the class file. The code launches a browser instance in GUI (non-headless) mode using the local Chrome executable. It then opens the target web page and saves a screenshot.
// dotnet add package PuppeteerSharp --version 20.0.0
using System.Threading.Tasks;
using PuppeteerSharp;

namespace SimpleWebScraper
{
    public class PuppeteerScraper
    {
        // the main entry point of the program
        public static async Task Main(string[] args)
        {
            // run the scraping routine and wait for its completion
            await CheckingHeadLessChrome();
        }

        // open the target page in Chrome and save a screenshot of it
        public static async Task CheckingHeadLessChrome()
        {
            // set the output file name for the screenshot
            string outputFile = "screenshot.png";

            // fetch and download the Chrome browser if not already available
            var browserFetcher = new BrowserFetcher();
            await browserFetcher.DownloadAsync();

            // configure the launch options, such as the executable path and GUI mode
            var options = new LaunchOptions()
            {
                // path to the local Chrome executable (adjust for your machine)
                ExecutablePath = @"C:\Program Files\Google\Chrome\Application\chrome.exe",
                // run Chrome with a visible window; set to true for headless mode
                Headless = false,
                // slow down the automation by 10ms to observe the actions
                SlowMo = 10
            };

            // open a new instance of Chrome browser with the specified options
            await using var browser = await Puppeteer.LaunchAsync(options);

            // open a new page (tab) in the browser
            await using var page = await browser.NewPageAsync();

            // navigate to the specified url
            await page.GoToAsync("https://www.scrapingcourse.com/cloudflare-challenge");

            // get the full html content of the page
            var allContent = await page.GetContentAsync();

            // take a screenshot of the page and save it to the specified output file
            await page.ScreenshotAsync(outputFile);
        }
    }
}
Unfortunately, the scraper gets blocked by Cloudflare, even after several attempts.
Despite running the browser in GUI mode to increase the success rate, our C# scraper still got blocked. The result is the same in headless mode.
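Instead of inspecting the screenshot by eye, you could check the page HTML that the scraper already stores in `allContent`. The sketch below assumes the success page contains the heading text "You bypassed the Cloudflare challenge!" shown in the earlier sample output. The `ChallengeCheck` class and its `PassedChallenge` method are hypothetical names introduced for illustration.

```csharp
using System;

public static class ChallengeCheck
{
    // heading text shown on the target page after a successful bypass (taken from the sample output)
    private const string SuccessMarker = "You bypassed the Cloudflare challenge!";

    // Hypothetical helper: returns true only when the HTML contains the success heading.
    public static bool PassedChallenge(string html)
    {
        // a missing or empty page can't contain the marker
        if (string.IsNullOrEmpty(html))
        {
            return false;
        }

        // case-insensitive search for the success heading
        return html.IndexOf(SuccessMarker, StringComparison.OrdinalIgnoreCase) >= 0;
    }
}
```

Calling `ChallengeCheck.PassedChallenge(allContent)` after `GetContentAsync()` gives you a boolean you can log or use to decide whether to retry. The check is tied to this demo page, so a different target site would need its own marker.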
Open-source tools such as Puppeteer-Sharp appear as bots, and Cloudflare easily blocks them. Even if they manage to bypass a low-protection website by chance, the success won't last because Cloudflare will detect and block them after a few more requests.
The only reliable solution to bypass Cloudflare at scale is via an all-in-one web scraping API like ZenRows.
Conclusion
In this article, you've learned to bypass Cloudflare while scraping with C#. You've seen a paid and an open-source solution. The open-source option isn't reliable because it presents bot-like attributes that Cloudflare easily detects.
We recommend using ZenRows to bypass Cloudflare while scraping with C#. It's one of the best solutions for bypassing anti-bots and scraping any website without limitations.
Try ZenRows for free now without a credit card!