Ever faced the frustration of getting blocked while web scraping? You're not alone. Websites often implement security measures that can deny your web scraper access. The good news is that you can navigate these obstacles by routing your requests through a Selenium proxy.
In this tutorial, you'll learn how to configure a Selenium C# proxy. So, let's dive in.
How to Use a Proxy in Selenium C#?
There are two common approaches to using a proxy in Selenium C#. The first is specifying your proxy details within the browser options using the AddArgument() method. The second involves defining your proxy settings in a Proxy object and assigning it to the WebDriver options.
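This tutorial's step-by-step uses the first approach, but for reference, here's a minimal sketch of the second one. It assumes an unauthenticated HTTP proxy at a placeholder host and port, and the class name is illustrative.
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;

class ProxyObjectExample
{
    static void Main()
    {
        // Define the proxy settings in a Proxy object (placeholder host:port)
        var proxy = new Proxy
        {
            Kind = ProxyKind.Manual,
            HttpProxy = "71.86.129.131:8080",
            SslProxy = "71.86.129.131:8080"
        };
        // Assign the Proxy object to the browser options
        var options = new ChromeOptions { Proxy = proxy };
        // Start Chrome with the proxied options
        IWebDriver driver = new ChromeDriver(options);
        driver.Navigate().GoToUrl("https://ident.me");
        driver.Quit();
    }
}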
Before we dive into the step-by-step, here's a basic Selenium script to which you can add proxy configurations.
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;
using System;
using System.Threading;

class Program
{
    static void Main()
    {
        // Set up the ChromeDriver instance
        IWebDriver driver = new ChromeDriver();
        // Navigate to target website
        driver.Navigate().GoToUrl("https://ident.me");
        // Add a wait for three seconds
        Thread.Sleep(3000);
        // Select the HTML body
        IWebElement pageElement = driver.FindElement(By.TagName("body"));
        // Get and print the text content of the page
        string pageContent = pageElement.Text;
        Console.WriteLine(pageContent);
        // Close the browser
        driver.Quit();
    }
}
This code creates a ChromeDriver instance, navigates to ident.me, a website that returns the web client's IP address as its HTML content, and prints the page content.
If you'd like a web scraping refresher, check out our C# web scraping guide.
Step 1: Use a Proxy in an HTTP Request
Start by creating a new ChromeOptions instance. Then, using options.AddArgument, specify your proxy details within the browser options. Note: we grabbed a free proxy from FreeProxyList.
ChromeOptions options = new ChromeOptions();
// Set up the ChromeDriver instance with proxy configuration using AddArgument
options.AddArgument("--proxy-server=http://71.86.129.131:8080");
To verify it works, let's add the proxy configuration above to the basic script we created earlier. You'll have the following complete code.
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;
using System;
using System.Threading;

class Program
{
    static void Main()
    {
        ChromeOptions options = new ChromeOptions();
        // Set up the ChromeDriver instance with proxy configuration using AddArgument
        options.AddArgument("--proxy-server=http://71.86.129.131:8080");
        // Set up the ChromeDriver instance
        IWebDriver driver = new ChromeDriver(options);
        // Navigate to target website
        driver.Navigate().GoToUrl("http://ident.me");
        // Add a wait for three seconds
        Thread.Sleep(3000);
        // Select the HTML body
        IWebElement pageElement = driver.FindElement(By.TagName("body"));
        // Get and print the text content of the page
        string pageContent = pageElement.Text;
        Console.WriteLine(pageContent);
        // Close the browser
        driver.Quit();
    }
}
Run it, and your response should be your proxy's IP address.
71.86.129.131
Awesome, you've configured your first Selenium C# proxy.
However, while we used a free proxy in the example above, they're generally unreliable. In real-world use cases, you'll need premium proxies, which often require additional configuration. Let's see how to implement such proxies in Selenium C#.
Proxy Authentication with Selenium C#
Premium proxy providers often require credentials, such as a username and password, for security and access control.
Unfortunately, Selenium doesn't provide built-in proxy authentication support. However, through its BiDi APIs, it offers the NetworkAuthenticationHandler class, which lets you supply authentication information for network requests.
This class has two properties: Credentials and UriMatcher. The Credentials property holds the username and password, while UriMatcher specifies the conditions under which those credentials should be used for authentication.
Therefore, to authenticate your Selenium C# proxy, set up the handler with your proxy credentials and add it to the network requests using the AddAuthenticationHandler() method.
// Create the NetworkAuthenticationHandler with credentials
var networkAuthenticationHandler = new NetworkAuthenticationHandler
{
    UriMatcher = uri => uri.Host.Contains("ident.me"), // Only apply for the specific host
    Credentials = new PasswordCredentials("<YOUR_USERNAME>", "<YOUR_PASSWORD>")
};
// Add the authentication credentials to the network requests
var networkInterceptor = driver.Manage().Network;
networkInterceptor.AddAuthenticationHandler(networkAuthenticationHandler);
// Start network monitoring so the handler takes effect (must be awaited in an async context)
await networkInterceptor.StartMonitoring();
So, if the proxy from step 1 were a premium one, you could authenticate it by adding the snippet above to the full code. Your new code should now look like this.
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;
using System;
using System.Threading;
using System.Threading.Tasks;

class Program
{
    static async Task Main()
    {
        ChromeOptions options = new ChromeOptions();
        // Set up the ChromeDriver instance with proxy configuration using AddArgument
        options.AddArgument("--proxy-server=http://71.86.129.131:8080");
        // Set up the ChromeDriver instance
        IWebDriver driver = new ChromeDriver(options);
        // Create the NetworkAuthenticationHandler with credentials
        var networkAuthenticationHandler = new NetworkAuthenticationHandler
        {
            UriMatcher = uri => uri.Host.Contains("ident.me"), // Only apply for the specific host
            Credentials = new PasswordCredentials("<YOUR_USERNAME>", "<YOUR_PASSWORD>")
        };
        // Add the authentication credentials to the network requests
        var networkInterceptor = driver.Manage().Network;
        networkInterceptor.AddAuthenticationHandler(networkAuthenticationHandler);
        // Start network monitoring so the handler is applied
        await networkInterceptor.StartMonitoring();
        // Navigate to target website
        driver.Navigate().GoToUrl("http://ident.me");
        // Add a wait for three seconds
        Thread.Sleep(3000);
        // Select the HTML body
        IWebElement pageElement = driver.FindElement(By.TagName("body"));
        // Get and print the text content of the page
        string pageContent = pageElement.Text;
        Console.WriteLine(pageContent);
        // Close the browser
        driver.Quit();
    }
}
Step 2: Implement a Rotating Proxy in Selenium C#
Rotating proxies is vital when making numerous requests to a target server. Websites often flag such automated requests as suspicious activity. However, by periodically changing IP addresses, you distribute traffic across multiple IPs, and your requests appear to come from different users.
To build a C# proxy rotator in Selenium, you first need a pool of proxies to choose from for each request. We've grabbed a few from FreeProxyList.
Start by defining your proxy pool.
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;
using System;
using System.Collections.Generic;

class Program
{
    static void Main()
    {
        var proxies = new List<string>
        {
            "http://211.193.1.11:80",
            "http://138.68.60.8:8080",
            "http://209.13.186.20:80"
            // Add more proxy configurations as needed
        };
    }
}
Next, select a random proxy, create a new ChromeOptions instance, and assign the selected proxy to the browser options using the AddArgument method.
//..
static void Main()
{
    //..
    // Select a random proxy configuration
    var random = new Random();
    int randomIndex = random.Next(proxies.Count);
    string randomProxy = proxies[randomIndex];
    // Create a new ChromeOptions instance
    ChromeOptions options = new ChromeOptions();
    // Assign the proxy to the Chrome instance using AddArgument
    options.AddArgument($"--proxy-server={randomProxy}");
    options.AddArgument("headless");
}
Lastly, implement your scraping logic like in the basic script we created earlier. Putting everything together, you should have the following complete code.
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;
using System;
using System.Collections.Generic;
using System.Threading;

class Program
{
    static void Main()
    {
        var proxies = new List<string>
        {
            "http://211.193.1.11:80",
            "http://138.68.60.8:8080",
            "http://209.13.186.20:80"
            // Add more proxy configurations as needed
        };
        // Select a random proxy configuration
        var random = new Random();
        int randomIndex = random.Next(proxies.Count);
        string randomProxy = proxies[randomIndex];
        // Create a new ChromeOptions instance
        ChromeOptions options = new ChromeOptions();
        // Assign the proxy to the Chrome instance using AddArgument
        options.AddArgument($"--proxy-server={randomProxy}");
        options.AddArgument("headless");
        // Set up the ChromeDriver instance
        IWebDriver driver = new ChromeDriver(options);
        // Navigate to target website
        driver.Navigate().GoToUrl("http://ident.me");
        // Add a wait for three seconds
        Thread.Sleep(3000);
        // Select the HTML body
        IWebElement pageElement = driver.FindElement(By.TagName("body"));
        // Get and print the text content of the page
        string pageContent = pageElement.Text;
        Console.WriteLine(pageContent);
        // Close the browser
        driver.Quit();
    }
}
To verify it works, make multiple requests. You should get a different IP address per request. Here are the results for two requests.
211.193.1.11
//..
138.68.60.8
Awesome! You've built your first Selenium C# proxy rotator.
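If you'd rather not re-run the program by hand to verify the rotation, you could wrap the driver setup in a loop that picks a fresh proxy on each iteration. Here's a rough sketch under that assumption, reusing the sample pool from above (the class name and loop count are illustrative).
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;
using System;
using System.Collections.Generic;

class RotationCheck
{
    static void Main()
    {
        var proxies = new List<string>
        {
            "http://211.193.1.11:80",
            "http://138.68.60.8:8080",
            "http://209.13.186.20:80"
        };
        var random = new Random();

        // Launch a few short-lived sessions, each with a randomly chosen proxy
        for (int i = 0; i < 3; i++)
        {
            string randomProxy = proxies[random.Next(proxies.Count)];
            var options = new ChromeOptions();
            options.AddArgument($"--proxy-server={randomProxy}");
            options.AddArgument("headless");

            IWebDriver driver = new ChromeDriver(options);
            driver.Navigate().GoToUrl("http://ident.me");
            // Print the IP address reported for this session
            Console.WriteLine(driver.FindElement(By.TagName("body")).Text);
            driver.Quit();
        }
    }
}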
Now, let's try your proxy rotator in a real-world scenario against a protected website: a G2 product review page.
For that, replace the target URL in the rotator script above with https://www.g2.com/products/salesforce-salesforce-sales-cloud/reviews. Run your code, and it'll fail, displaying an error message like the one below.
<!DOCTYPE html>
<!--[if lt IE 7]> ...
</head>
<body>
    <div ...>
        <h1 data-translate="block_headline">Sorry, you have been blocked</h1>
        <h2 class="cf-subheadline">
            <span data-translate="unable_to_access">You are unable to access</span> g2.com/...
        </h2>
        ...
This is because anti-bot systems easily detect free proxies. We only used them in the above examples to explain the basics. For better results, you need premium proxies. Let's explore those next.
Step 3: Get a Residential Proxy to Avoid Getting Blocked
Using free proxies can be tempting as they're readily accessible. However, as the previous section showed, free proxies fail in real-life use cases, so you'll do better with premium proxies.
Check out our list of the best web scraping proxies that integrate with Selenium in C#.
That said, web scraping with Selenium still comes with some risk because anti-bot solutions can detect it. Selenium has inherent properties that websites recognize as bot-like rather than human, and although configurations and tools like proxies can mask some of them, you may still get blocked.
Luckily, ZenRows offers a better alternative. ZenRows is a web scraping API that provides the complete toolkit to scrape without getting blocked, including a residential proxy rotator, User-Agent rotation, anti-CAPTCHA, and more.
To top it all off, ZenRows is much easier to scale, removes the infrastructure headaches, and is cheaper and more flexible since you only pay for successful requests.
Let's see ZenRows against the same protected website we tried to scrape earlier.
To get started, sign up for free, and you'll be directed to the Request Builder page.
Paste your target URL, check the box for Premium Proxies to rotate residential proxies automatically, and select the JS Rendering boost mode. Then, select C# as the language to generate your request code on the right.
Although RestSharp is suggested, you can use any C# HTTP client you choose. You only need to make your requests to the ZenRows API.
Copy the generated code to your favorite editor. Your new script should look like this.
using RestSharp;
using System;

namespace TestApplication
{
    class Test
    {
        static void Main(string[] args)
        {
            var client = new RestClient("https://api.zenrows.com/v1/?apikey=<YOUR_ZENROWS_API_KEY>&url=https%3A%2F%2Fwww.g2.com%2Fproducts%2Fsalesforce-salesforce-sales-cloud%2Freviews&js_render=true&premium_proxy=true");
            var request = new RestRequest();
            var response = client.Get(request);
            Console.WriteLine(response.Content);
        }
    }
}
Run it, and you'll get the page's HTML content.
<!DOCTYPE html>
//..
<title id="icon-label-55be01c8a779375d16cd458302375f4b">G2 - Business Software Reviews</title>
//..
<h1 ...id="main">Where you go for software.</h1>
Awesome, right? It's that easy to scrape using ZenRows.
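And since ZenRows is a plain HTTP API, you're not tied to RestSharp. Here's a rough equivalent using the built-in HttpClient, assuming the same API key and query parameters as the generated snippet (the class name is illustrative).
using System;
using System.Net.Http;
using System.Threading.Tasks;

class ZenRowsHttpClientExample
{
    static async Task Main()
    {
        // Same endpoint and parameters as the generated RestSharp snippet
        string url = "https://api.zenrows.com/v1/" +
            "?apikey=<YOUR_ZENROWS_API_KEY>" +
            "&url=https%3A%2F%2Fwww.g2.com%2Fproducts%2Fsalesforce-salesforce-sales-cloud%2Freviews" +
            "&js_render=true&premium_proxy=true";

        using var client = new HttpClient();
        // Send the GET request and print the returned HTML
        string content = await client.GetStringAsync(url);
        Console.WriteLine(content);
    }
}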
Conclusion
Setting up a Selenium C# proxy enables you to route your requests through a different IP address. However, too many requests to a target website can still result in an IP ban, so you must rotate proxies for better results.
Instead of wrestling with Selenium and the tedious technicalities, consider ZenRows. Our web scraping API handles everything you need to extract data at scale without blocks. Sign up now to try ZenRows for free.