The main challenge you'll encounter when web scraping in C# with HttpClient is getting blocked by websites. That often happens because HttpClient's default request headers flag you as a bot, and the User-Agent string is the most critical of them.
In this tutorial, you'll learn how to avoid detection by setting a custom C# HttpClient User Agent.
What Is the HttpClient User Agent?
HTTP request headers are vital pieces of information sent along with every HTTP request. They convey various details about the client, and one of the key ones is the User-Agent (UA) string.
The UA serves as an identifier for the client making the request, informing the web server of its application, version, device, and even operating system. Here's a sample:
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/111.0.0.0 Safari/537.36
It reveals that the client is using the Chrome browser version 111 on a 64-bit Windows 10 operating system, among other information.
In your scraper, however, the C# HttpClient User Agent tells the web server that you're not requesting from an actual browser: HttpClient doesn't set a User-Agent by default, so the header shows up as an empty string.
{
"user-agent": ""
}
Sadly, that makes it easy for websites to detect and block your scraper. The good news is you can avoid detection by customizing the User-Agent header in your HttpClient requests to mimic real user behavior. Let's see how.
How to Set a Custom User Agent in HttpClient
Here's the step-by-step process to set a C# HttpClient User-Agent.
1. Getting Started
Let's start with a basic script that makes an HTTP request to a target website and retrieves its content.
The code below creates an HttpClient instance, uses it to make a GET request to httpbin (an API that echoes the web client's User-Agent string), and then reads and prints the response.
using System;
using System.Net.Http;
using System.Threading.Tasks;

class Program
{
    static async Task Main()
    {
        // Create an instance of HttpClient
        HttpClient httpClient = new HttpClient();

        // Make a GET request to httpbin.io/user-agent
        HttpResponseMessage response = await httpClient.GetAsync("https://httpbin.io/user-agent");

        // Read and print the content
        string content = await response.Content.ReadAsStringAsync();
        Console.WriteLine(content);
    }
}
The result should be your current User Agent, which is empty.
{
"user-agent": ""
}
If you'd like to review your web scraping fundamentals, check out our C# web scraping guide.
2. Customize UA
The DefaultRequestHeaders property in HttpClient allows you to set a custom User Agent using the `Add()` method, as in the code snippet below. We'll use the real UA you saw earlier.
string customUserAgent = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/111.0.0.0 Safari/537.36";
httpClient.DefaultRequestHeaders.Add("User-Agent", customUserAgent);
Now, set a custom User Agent in the HTTP request script we created earlier, and you'll have the following complete code.
using System;
using System.Net.Http;
using System.Threading.Tasks;

class Program
{
    static async Task Main()
    {
        // Create an instance of HttpClient
        HttpClient httpClient = new HttpClient();

        // Set custom User Agent
        string customUserAgent = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/111.0.0.0 Safari/537.36";
        httpClient.DefaultRequestHeaders.Add("User-Agent", customUserAgent);

        // Make a GET request to httpbin.io/user-agent
        HttpResponseMessage response = await httpClient.GetAsync("https://httpbin.io/user-agent");

        // Read and print the content
        string content = await response.Content.ReadAsStringAsync();
        Console.WriteLine(content);
    }
}
Run the code, and your response should be the predefined custom User Agent.
{
"user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/111.0.0.0 Safari/537.36"
}
Congrats! You've set an HttpClient User Agent in C#.
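If you'd rather not change the client's global defaults, you can also attach the header to a single request through HttpRequestMessage. Here's a minimal sketch of that approach, reusing the same UA string and httpbin endpoint as above:
using System;
using System.Net.Http;
using System.Threading.Tasks;

class Program
{
    static async Task Main()
    {
        HttpClient httpClient = new HttpClient();

        // Build a single request and attach the User-Agent to it only
        var request = new HttpRequestMessage(HttpMethod.Get, "https://httpbin.io/user-agent");
        request.Headers.Add("User-Agent", "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/111.0.0.0 Safari/537.36");

        // Send the request and print the response body
        HttpResponseMessage response = await httpClient.SendAsync(request);
        Console.WriteLine(await response.Content.ReadAsStringAsync());
    }
}
This keeps the client's defaults untouched, which comes in handy once you start rotating User Agents per request.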
However, with a single UA, websites can still easily detect your scraper. Therefore, you must rotate the string to get the best results. Let's see how.
3. Use a Random User Agent in HttpClient
Rotating the HttpClient User-Agent in C# is critical to avoid getting blocked while web scraping, as too many requests from the same User-Agent can be flagged as suspicious activity. Randomizing your User Agent makes your requests look like they come from different users, which makes it harder for websites to detect your scraping activities.
To rotate your UA, start by defining a list of them. For this tutorial, we've taken a few from our list of top User Agents for web scraping.
using System;
using System.Collections.Generic;
using System.Net.Http;
using System.Threading.Tasks;

class Program
{
    static async Task Main()
    {
        // Create an instance of HttpClient
        HttpClient httpClient = new HttpClient();

        // Define your User Agent list
        List<string> userAgents = new List<string>
        {
            "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.0.0 Safari/537.36",
            "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.0.0 Safari/537.36",
            "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/108.0.0.0 Safari/537.36",
        };

        //..
    }
}
Next, generate a random index and use it to select a User Agent from your list.
//..
//..
// Generate a random index
var random = new Random();
int randomIndex = random.Next(userAgents.Count);
// Select a random UA using the randomIndex
string randomUserAgent = userAgents[randomIndex];
After that, set the selected UA using the DefaultRequestHeaders property, make a GET request, and print the response (as in step 2). You should have the following complete code.
using System;
using System.Collections.Generic;
using System.Net.Http;
using System.Threading.Tasks;

class Program
{
    static async Task Main()
    {
        // Create an instance of HttpClient
        HttpClient httpClient = new HttpClient();

        // Define your User Agent list
        List<string> userAgents = new List<string>
        {
            "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.0.0 Safari/537.36",
            "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/108.0.0.0 Safari/537.36",
            "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.0.0 Safari/537.36",
        };

        // Generate a random index
        var random = new Random();
        int randomIndex = random.Next(userAgents.Count);

        // Select a random UA using the randomIndex
        string randomUserAgent = userAgents[randomIndex];

        // Set selected User Agent
        httpClient.DefaultRequestHeaders.Add("User-Agent", randomUserAgent);

        // Make a GET request to httpbin.io/user-agent
        HttpResponseMessage response = await httpClient.GetAsync("https://httpbin.io/user-agent");

        // Read and print the content
        string content = await response.Content.ReadAsStringAsync();
        Console.WriteLine(content);
    }
}
Every time you run the script, a different UA is picked to make your request. For example, here are our results for three runs:
{
"user-agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.0.0 Safari/537.36"
}
{
"user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.0.0 Safari/537.36"
}
{
"user-agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/108.0.0.0 Safari/537.36"
}
Awesome! You've successfully rotated your User Agent in HttpClient.
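Keep in mind that DefaultRequestHeaders applies to every request the client sends, so the script above picks one random UA per run. If your scraper fires many requests from a single HttpClient instance, you may want to pick a fresh UA for each request instead. Here's a minimal sketch of that idea; the three-request loop is just for illustration:
using System;
using System.Collections.Generic;
using System.Net.Http;
using System.Threading.Tasks;

class Program
{
    static async Task Main()
    {
        HttpClient httpClient = new HttpClient();

        // Same example UA pool as above
        List<string> userAgents = new List<string>
        {
            "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.0.0 Safari/537.36",
            "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/108.0.0.0 Safari/537.36",
            "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.0.0 Safari/537.36",
        };

        var random = new Random();

        // Three requests from one client, each with its own randomly picked UA
        for (int i = 0; i < 3; i++)
        {
            string randomUserAgent = userAgents[random.Next(userAgents.Count)];

            var request = new HttpRequestMessage(HttpMethod.Get, "https://httpbin.io/user-agent");
            request.Headers.Add("User-Agent", randomUserAgent);

            HttpResponseMessage response = await httpClient.SendAsync(request);
            Console.WriteLine(await response.Content.ReadAsStringAsync());
        }
    }
}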
Since you'll need to expand your list over time, it's important to pay attention to how each UA is constructed. For example, if a User-Agent claims a browser version that doesn't exist or is badly outdated, websites can spot the discrepancy, raise their level of suspicion, and block your scraper, so it's worth sanity-checking your pool, as in the sketch below.
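As a rough illustration of that kind of sanity check, the helper below parses the Chrome major version out of each UA and drops entries below a cutoff. The regex and the minimum version of 100 are assumptions chosen for the example, not fixed rules:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

class UserAgentPool
{
    // Keep only UAs whose Chrome major version is at or above the cutoff.
    // The cutoff (100) is an arbitrary example value; adjust it as browsers move on.
    static List<string> FilterOutdated(IEnumerable<string> userAgents, int minChromeMajor = 100)
    {
        var chromeVersion = new Regex(@"Chrome/(\d+)\.");
        return userAgents
            .Where(ua =>
            {
                Match match = chromeVersion.Match(ua);
                return match.Success && int.Parse(match.Groups[1].Value) >= minChromeMajor;
            })
            .ToList();
    }

    static void Main()
    {
        var pool = new List<string>
        {
            "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/111.0.0.0 Safari/537.36",
            "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/45.0.2454.85 Safari/537.36",
        };

        // Only the Chrome 111 entry survives the filter
        foreach (string ua in FilterOutdated(pool))
        {
            Console.WriteLine(ua);
        }
    }
}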
That said, maintaining a diverse, well-formed, and up-to-date pool of User Agents can be challenging. But don't worry about that because the next section shows a way to address this difficulty.
Quick Solution for HttpClient User Agents at Scale to Avoid Getting Blocked
Websites use many techniques to detect web scrapers, so you can still get blocked even when rotating well-formed User Agents.
A commonly recommended quick fix is setting up a proxy in HttpClient for web scraping so that you can distribute your traffic across multiple IP addresses and ultimately avoid IP-based blocking. But again, this is often not enough.
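As a rough sketch of that approach, you can route HttpClient traffic through a proxy via HttpClientHandler. The proxy address below is a placeholder, and the httpbin.io/ip endpoint is assumed here just to echo the IP the server sees:
using System;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;

class Program
{
    static async Task Main()
    {
        // Placeholder proxy address; replace it with a proxy you actually have access to
        var handler = new HttpClientHandler
        {
            Proxy = new WebProxy("http://proxy.example.com:8080"),
            UseProxy = true
        };

        HttpClient httpClient = new HttpClient(handler);

        // The response should show the proxy's IP instead of yours
        HttpResponseMessage response = await httpClient.GetAsync("https://httpbin.io/ip");
        Console.WriteLine(await response.Content.ReadAsStringAsync());
    }
}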
The good news is that you can complement HttpClient with ZenRows to overcome these challenges automatically. ZenRows is a web scraping API that provides all the functionality you need to avoid detection, including UA rotation by default, residential proxies, and more.
Let's see how ZenRows works with HttpClient:
First of all, sign up to get your free API key, and you'll be directed to the Request Builder page.
Paste your target URL (https://www.g2.com/), check the box for Premium Proxies, and activate the JS Rendering boost mode. Then, select C# as the language to get your request code generated on the right.
You'll see that RestSharp is suggested, but you can use HttpClient. You only need to send a request to the ZenRows API. For that, first copy the ZenRows API URL from the generated request on the right.
Below is the ZenRows API URL from the generated request:
https://api.zenrows.com/v1/?apikey=<YOUR_ZENROWS_API_KEY>&url=https%3A%2F%2Fwww.g2.com%2F&js_render=true&premium_proxy=true
Make a request to it using HttpClient. Your new script should look like this:
using System;
using System.Net.Http;
using System.Threading.Tasks;

class Program
{
    static async Task Main()
    {
        // Create an instance of HttpClient
        HttpClient httpClient = new HttpClient();

        // Define the ZenRows API URL
        string zenRowsApiUrl = "https://api.zenrows.com/v1/?apikey=<YOUR_ZENROWS_API_KEY>&url=https%3A%2F%2Fwww.g2.com%2F&js_render=true&premium_proxy=true";

        // Make a GET request to the ZenRows API
        HttpResponseMessage response = await httpClient.GetAsync(zenRowsApiUrl);

        // Read and print the content
        string content = await response.Content.ReadAsStringAsync();
        Console.WriteLine(content);
    }
}
Run it, and you'll get G2's HTML.
<!DOCTYPE html>
//..
<title id="icon-label-55be01c8a779375d16cd458302375f4b">G2 - Business Software Reviews</title>
//..
<h1 ...id="main">Where you go for software.</h1>
Cool, right? ZenRows makes scraping any website easy.
Conclusion
Setting and randomizing your User Agent using HttpClient in C# can help you avoid detection, but you must ensure your UA is properly constructed to avoid raising red flags that can get you blocked.
At the same time, there are many other reasons why you might get blocked, such as CAPTCHAs, user behavior analysis, and browser fingerprinting, all of which require more than properly crafted UAs. Because of that, you might find it useful to consider ZenRows, the complete toolkit for scraping without getting blocked.