If you've ever experimented with web scraping in C# using Selenium, you know how frustratingly easy it is for websites to detect your scraper. But what if you could use Undetected Chromedriver in C# to fly under the radar and avoid getting blocked?
While Undetected Chromedriver is typically a Python module, you'll learn how to use it in your C# script in this guide. We'll also discuss some common issues and how to solve them.
Why Use Undetected ChromeDriver in C#?
Undetected Chromedriver is an optimized version of the Selenium Chromedriver designed to avoid being flagged and blocked by anti-bot systems.
Websites using Selenium detection scripts typically test for variables containing words like webdriver and document variables called $cdc_ and $wdc_. These are all inherent properties of the standard Chromedriver, which makes it easy for websites to identify your script and flag it as suspicious activity.
On the other hand, Undetected Chromedriver patches those properties in the standard Chromedriver, making it harder for web servers to detect any signs of automation.
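To make the idea concrete, here is a minimal Python sketch of the kind of check described above. It is a simplified model rather than a real detection script: the function name looks_automated and the sample property names are hypothetical, and actual anti-bot scripts run in the browser and are far more elaborate.

```python
# Illustrative only: a simplified model of a marker-based automation check.

AUTOMATION_PREFIXES = ("$cdc_", "$wdc_")  # document variables named in the text


def looks_automated(window_keys, document_keys):
    """Return True if any property name carries a known automation marker."""
    has_webdriver = any("webdriver" in key.lower() for key in window_keys)
    has_driver_vars = any(
        key.startswith(AUTOMATION_PREFIXES) for key in document_keys
    )
    return has_webdriver or has_driver_vars


# Hypothetical property names, standing in for what a page could enumerate.
stock_driver = looks_automated(["navigator", "webdriver"], ["$cdc_abc123_", "title"])
patched_driver = looks_automated(["navigator"], ["title"])
print(stock_driver, patched_driver)
```

Once the markers are gone, a check like this has nothing left to match, which is the effect the patching aims for.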
How to Use Undetected ChromeDriver in C#
This tutorial will guide you through using Undetected ChromeDriver in C# projects. It covers both dedicated C# libraries and calling Python scripts from C#.
Option 1: Dedicated C# Libraries
Dedicated C# libraries like Selenium.WebDriver.UndetectedChromeDriver and Selenium.UndetectedChromeDriver offer a direct solution. Both are ports of the original Undetected Chromedriver, so you can use the Python module's functionality in your C# script without writing any Python code.
Here's how.
We'll use Selenium.UndetectedChromeDriver for this example, since it is the most popular one.
To get started, install the C# library using the command below.
PM> Install-Package Selenium.UndetectedChromeDriver
Note that this command is intended for the Package Manager Console in Visual Studio. Alternatively, you can install the package with the .NET CLI using the following command.
dotnet add package Selenium.UndetectedChromeDriver
After that, include the library as a dependency and create a new Undetected Chromedriver instance.
using SeleniumUndetectedChromeDriver;
using (var driver = UndetectedChromeDriver.Create(driverExecutablePath: await new ChromeDriverInstaller().Auto()))
{
}
This sets up Selenium.UndetectedChromeDriver to automatically download the correct driver for your Chrome browser version. At the same time, the C# library patches all the properties necessary to avoid detection.
All that is left is to use the driver instance to navigate to the target website and retrieve your desired information. For this example, we'll scrape NowSecure, a Cloudflare-protected test website.
// ...
{
    // Navigate to the website
    driver.GoToUrl("https://nowsecure.nl");
    // Get the HTML content
    var htmlContent = driver.ExecuteScript("return document.documentElement.outerHTML;") as string;
    // Print the HTML content
    Console.WriteLine($"HTML Content:\n{htmlContent}");
}
Putting everything together, here's the complete code.
using System;
using System.Threading.Tasks;
using SeleniumUndetectedChromeDriver;

class Program
{
    static async Task Main()
    {
        // Download a matching driver and create a patched driver instance
        using (var driver = UndetectedChromeDriver.Create(driverExecutablePath: await new ChromeDriverInstaller().Auto()))
        {
            // Navigate to the website
            driver.GoToUrl("https://nowsecure.nl");
            // Get the HTML content
            var htmlContent = driver.ExecuteScript("return document.documentElement.outerHTML;") as string;
            // Print the HTML content
            Console.WriteLine($"HTML Content:\n{htmlContent}");
        }
    }
}
It's essential to note that these are unofficial C# ports and may not be as extensively tested as the Python module. As a result, you could encounter occasional limitations or dependencies that are not as robustly supported.
Option 2: Python and C# Integration Approach
Another way you can use Undetected Chromedriver in C# is by writing the automation in Python and calling the script from your C# code.
Since the tool is primarily developed for Python, this approach gives you access to its complete feature set, including updates as the original Undetected Chromedriver project releases them.
Also, Python has a rich web scraping and automation ecosystem with numerous libraries and tools. Running your automation in Python lets you tap into that ecosystem.
However, the need for inter-language communication between C# and Python adds a layer of complexity. Coordinating data exchanges and ensuring a smooth workflow between the two languages is crucial.
Let's see how to integrate Python into your C# project.
To get started, ensure you have Python set up on your machine. Then, install Undetected Chromedriver for Python using the following command.
pip install undetected-chromedriver
Next, create your Python script that'll use Undetected Chromedriver to access the target website. The approach is similar to using a dedicated C# library. Import the necessary dependencies, create an instance of Undetected Chromedriver, and navigate to the target website. This time, let's take a screenshot of the page.
import undetected_chromedriver as uc
# create an instance of the undetected ChromeDriver in headless mode
options = uc.ChromeOptions()
options.add_argument("--headless")
driver = uc.Chrome(options=options)
# navigate to target website
driver.get("https://www.nowsecure.nl")
# take a screenshot
driver.save_screenshot("screenshot.png")
print("Screenshot taken")
# close the browser
driver.quit()
Save this script as a Python file, for example, script.py.
Now, call the Python script in C#.
For that, you need the Process class within the System.Diagnostics namespace. This class provides a way to interact with external programs, such as Python scripts, within a C# application.
Start by importing the namespace and creating a process object.
using System.Diagnostics;
using (Process process = new Process())
{
// ...
}
Within the object, set the Python interpreter as the filename and specify the path to your Python script as arguments.
//..
{
process.StartInfo.FileName = "python";
process.StartInfo.Arguments = "C:\\Path\\to\\your\\script.py";
//..
}
Ensure that "python" is in the system's PATH or provide the full path to the Python executable.
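If you are unsure which interpreter path to supply, a small Python check like the sketch below can print an absolute path that you could paste into process.StartInfo.FileName. The helper name find_python is hypothetical. It looks the command up on PATH first and falls back to the interpreter that is running the check.

```python
import os
import shutil
import sys


def find_python(command="python"):
    """Return an absolute interpreter path: PATH lookup first, then this interpreter."""
    found = shutil.which(command)
    return os.path.abspath(found) if found else sys.executable


# Print the path so it can be copied into the C# code.
print(find_python())
```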
Next, configure the process settings.
//..
{
process.StartInfo.UseShellExecute = false; // Don't use the operating system shell
process.StartInfo.RedirectStandardOutput = true; // Redirect standard output
process.StartInfo.CreateNoWindow = true; // Don't create a window for the process
//..
}
These settings ensure that the C# program can capture the Python script's output.
Lastly, start the process, read the standard output, wait for the process to exit, and process the result.
//..
{
//..
// start the process
process.Start();
// read the standard output
string output = process.StandardOutput.ReadToEnd();
// wait for the process to exit
process.WaitForExit();
// process the output
Console.WriteLine(output);
}
Putting everything together, your complete C# code should look like this.
using System;
using System.Diagnostics;

class Program
{
    static void Main()
    {
        // Create a Process object
        using (Process process = new Process())
        {
            // Set the Python interpreter as the filename
            process.StartInfo.FileName = "python";
            // Specify the path to your Python script as arguments
            process.StartInfo.Arguments = "C:\\Path\\to\\your\\script.py";
            // Configure process settings
            process.StartInfo.UseShellExecute = false; // Don't use the operating system shell
            process.StartInfo.RedirectStandardOutput = true; // Redirect standard output
            process.StartInfo.CreateNoWindow = true; // Don't create a window for the process
            // Start the process
            process.Start();
            // Read the standard output
            string output = process.StandardOutput.ReadToEnd();
            // Wait for the process to exit
            process.WaitForExit();
            // Process the output
            Console.WriteLine(output);
        }
    }
}
Run it, and the console should print the script's output (Screenshot taken), with screenshot.png saved in the process's working directory.
Troubleshooting Common Issues
While integrating Undetected Chromedriver into C# can enhance web scraping capabilities, you may encounter specific challenges, regardless of the chosen approach. Below are some of the common ones and possible solutions.
Dedicated C# library-specific issues
One common challenge with this approach is the potential incompatibility between C# library versions and browser updates. Since these libraries are unofficial ports of the Python module, they may not be actively maintained or updated to support the latest changes introduced in a browser update.
These incompatibilities will ultimately lead to script errors. So, ensure alignment between your installed browser version and the C# library. Before updating your browser to the latest version, verify its compatibility with the current C# library.
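As a rough illustration of that alignment check, the sketch below compares the major version of a browser with that of a driver. It assumes both versions are dotted strings and that a matching major number is the compatibility signal. The function names and version numbers are hypothetical, so confirm the real requirements against the library's release notes.

```python
def major_version(version: str) -> int:
    """Extract the leading (major) component of a dotted version string."""
    return int(version.strip().split(".")[0])


def versions_compatible(browser_version: str, driver_version: str) -> bool:
    """Treat a browser and driver as compatible when their major versions match."""
    return major_version(browser_version) == major_version(driver_version)


# Hypothetical version strings for illustration.
print(versions_compatible("120.0.6099.109", "120.0.6099.71"))
print(versions_compatible("121.0.6167.85", "120.0.6099.71"))
```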
Inter-language integration issues
Integrating Python into C# scripts can introduce debugging and code maintenance challenges. These challenges arise because traditional debugging tools and techniques may not handle both Python and C# seamlessly.
Also, managing errors that occur across language boundaries can be complex. When an error originates in one language and propagates to another, identifying the root cause and implementing effective error-handling mechanisms is crucial.
Consider integrated debugging tools that support Python and C# to address these challenges. Some modern IDEs offer comprehensive debugging capabilities for multi-language projects, enabling you to effectively trace through Python and C# code.
Additionally, ensure you adopt consistent error-handling strategies in Python and C# components. Establish clear communication protocols for error reporting between the two languages.
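One possible shape for such a protocol, sketched on the Python side: the script always prints a single JSON line describing success or failure and pairs it with an exit code. The field names (ok, data, error, message, trace) and the helper run_job are assumptions made for this illustration, not part of Undetected Chromedriver.

```python
import json
import traceback


def run_job(job):
    """Run job() and return (exit_code, json_line) for the calling process."""
    try:
        payload = {"ok": True, "data": job()}
        exit_code = 0
    except Exception as exc:  # report every failure in the same shape
        payload = {
            "ok": False,
            "error": type(exc).__name__,
            "message": str(exc),
            "trace": traceback.format_exc(),
        }
        exit_code = 1
    return exit_code, json.dumps(payload)


def take_screenshot():
    # Placeholder for the Undetected Chromedriver work done in script.py.
    return {"screenshot": "screenshot.png"}


code, line = run_job(take_screenshot)
print(line)  # the C# side would read this single line from standard output
# A real script would finish with sys.exit(code) so C# can inspect process.ExitCode.
```

On the C# side, the caller could then parse that line and check process.ExitCode instead of guessing from free-form text.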
Anti-Detection Techniques with Undetected ChromeDriver in C#
To effectively use Undetected Chromedriver in C#, you can use anti-detection techniques to avoid getting blocked.
One option is to randomize User-Agent strings. You can change user agent strings in your scripts to prevent detection by anti-bot systems that flag known automation tools. This makes your script's web requests less predictable and more challenging to identify.
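Here is a minimal Python sketch of User-Agent randomization for the Python-script approach. The pool below is a hypothetical example, so swap in current, real browser strings. The helper only builds Chrome's --user-agent argument, which you would pass to options.add_argument() before creating the driver.

```python
import random

# Hypothetical pool; replace with up-to-date User-Agent strings.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
]


def random_user_agent_argument(rng=random):
    """Pick one User-Agent at random and format it as a Chrome argument."""
    return f"--user-agent={rng.choice(USER_AGENTS)}"


argument = random_user_agent_argument()
print(argument)
# Usage in script.py (not executed here):
#   options.add_argument(random_user_agent_argument())
```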
Another popular choice is to use proxy rotation. You can avoid IP tracking and bans by alternating IP addresses and locations. That should allow you to scrape more easily.
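A simple round-robin rotation could look like the Python sketch below. The proxy addresses are placeholders from a documentation-only IP range, and the helper name proxy_arguments is hypothetical. Each generated value is a Chrome --proxy-server argument to pass to options.add_argument() when starting a new browser session.

```python
from itertools import cycle

# Placeholder proxies (documentation IP range); use your own proxy list.
PROXIES = [
    "http://203.0.113.10:8080",
    "http://203.0.113.11:8080",
    "http://203.0.113.12:8080",
]


def proxy_arguments(proxies):
    """Yield Chrome --proxy-server arguments, cycling through the list forever."""
    for proxy in cycle(proxies):
        yield f"--proxy-server={proxy}"


rotation = proxy_arguments(PROXIES)
for _ in range(4):
    # Each new browser session would take the next argument in turn.
    print(next(rotation))
```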
Read our guide to learn more techniques for avoiding bot detection with Selenium.
When Undetected ChromeDriver Isn't Enough
Undetected Chromedriver, combined with the abovementioned techniques, can still get blocked, particularly against advanced anti-bot solutions. Let's use a G2 product review page as a target URL to prove this. This page uses advanced Cloudflare protection to mitigate bot traffic.
# ...
driver.get("https://www.g2.com/products/asana/reviews")
We get the following result.
Here, we fail to bypass the advanced detection system implemented by G2.
This is where a web scraping API like ZenRows comes in handy because it handles all anti-bot bypass for you, regardless of its complexity.
As a demonstration, let's use ZenRows to scrape the same G2 product review page.
To get started, sign up for free, and you'll get to the request builder page.
Input the target URL (https://www.g2.com/products/asana/reviews), activate "Premium Proxies" and "JS Rendering". Also, select C#.
That'll generate your request code on the right. Copy it and use any HTTP client of your choice. The generated code uses RestSharp.
Your code should look like this.
// dotnet add package RestSharp
using System;
using RestSharp;

namespace TestApplication {
    class Test {
        static void Main(string[] args) {
            var client = new RestClient("https://api.zenrows.com/v1/?apikey=<YOUR_ZENROWS_API_KEY>&url=https%3A%2F%2Fwww.g2.com%2Fproducts%2Fasana%2Freviews&js_render=true&premium_proxy=true");
            var request = new RestRequest();
            var response = client.Get(request);
            Console.WriteLine(response.Content);
        }
    }
}
Run it, and you'll get the HTML content of the page.
<!DOCTYPE html>
#...
<title>Asana Reviews 2024: Details, Pricing, & Features | G2</title>
#...
Awesome! ZenRows makes it easy to scrape without getting blocked.
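The request URL in the C# snippet above is just the API endpoint plus URL-encoded query parameters. As a small sketch, this is how the same URL could be assembled in Python for other target pages. The parameter names come from the generated code, while the helper name build_request_url is an assumption for illustration.

```python
from urllib.parse import urlencode

API_ENDPOINT = "https://api.zenrows.com/v1/"


def build_request_url(api_key, target_url):
    """Assemble the API request URL with the target URL percent-encoded."""
    params = {
        "apikey": api_key,
        "url": target_url,
        "js_render": "true",
        "premium_proxy": "true",
    }
    return f"{API_ENDPOINT}?{urlencode(params)}"


# Replace the placeholder key with your own before sending a request.
print(build_request_url("YOUR_ZENROWS_API_KEY", "https://www.g2.com/products/asana/reviews"))
```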
Conclusion
Using Undetected Chromedriver in C# comes with its unique set of challenges. This guide explored different approaches to using Undetected Chromedriver in C#, including the intricacies of integrating Python and C#.
Despite the robust capabilities of Undetected ChromeDriver, you may still get blocked by advanced anti-bot systems. So, consider ZenRows to guarantee a successful anti-bot bypass.