Are you considering a shift toward the .NET framework? Or are you on the hunt for a C# headless browser best suited to your project?
This article shortlists the four best headless browsers for web scraping. You’ll learn about their strengths and weaknesses based on factors such as popularity, ease of use, speed, and success rate in avoiding blocks and bans.
Let’s go!
What Is the Best Headless Browser for C#?
A good headless browser for web scraping in C# will:
- Let you smoothly navigate web forms.
- Support your page interaction needs.
- Execute and parse JavaScript.
Here’s a quick comparison table of the four best tools:
Tool | Popularity | Ease of Use | Speed | Anti-Block Measures |
---|---|---|---|---|
ZenRows | Rapidly growing | Beginner-friendly and easy to set up | Fast | Bypasses any anti-bot system, regardless of complexity |
PuppeteerSharp | Large user base | Relatively easy to use, no additional setup required | Gets slow when running multiple instances in parallel | Easily detectable by websites due to its automation properties |
Selenium | Large user base | Requires additional setup | Gets slow and resource-intensive | Gets blocked by advanced anti-bot systems |
Playwright | Rapidly growing | Does not require additional setup, but you need to download the necessary browsers | Can get slow when running multiple instances in parallel | Easily detectable by websites due to its automation properties |
To test and showcase each tool’s functionalities, we’ve used the ScrapeMe page:
Each headless browser will attempt to retrieve the product's price.
Now, let's find the perfect tool for your needs.
1. ZenRows: More Than a Headless Browser
ZenRows’ headless browser functionality lets you render dynamic websites, execute AJAX calls, and interact with web page elements as with a regular browser.
In addition, ZenRows is an all-in-one web scraping API that equips you with everything you need to bypass anti-bot systems, from premium proxies and auto-rotating user agents to anti-CAPTCHAs.
ZenRows integrates easily with any development workflow and offers both proxy and API connection options.
And here’s the cherry on top: unlike most headless browsers, it doesn't come with additional infrastructure overhead.
👍 Pros:
- Offers both API and proxy connection options.
- Advanced anti-bot bypass features to scrape all web pages.
- Auto-rotating premium proxies.
- Easy to use and intuitive API.
- Extensive documentation and a rapidly growing developer community.
👎 Cons:
- Limited customization compared to its open-source counterparts.
⚙️ Key Features:
- Premium proxies
- Geolocation
- Custom headers
- JavaScript rendering
- Page interaction
- Block resources
- CSS selectors
- Auto-parsing
- Concurrency
👨‍💻 Example:
To extract the product's price from the target web page using ZenRows, you'll need an API key. Sign up to get yours.
In your dashboard, paste your target URL (`https://scrapeme.live/shop/Pikachu/`), check the box for `Premium Proxies`, and activate the `JavaScript Rendering` boost mode.
Next, inspect the target page in your browser to identify the location of the product's price.
The desired data is located in a `span` with the `amount` class. Right-click on the `span` tag and copy its selector.
Then, head back to your dashboard. At the bottom left, select the `CSS Selectors` output option, and replace the placeholder code with the selector you copied.
Your output configuration will look like this:
Lastly, select C# as the language, and you’ll get a script ready to try:
```csharp
using System;
using RestSharp;

namespace TestApplication {
    class Test {
        static void Main(string[] args) {
            var client = new RestClient("https://api.zenrows.com/v1/?apikey=<YOUR_ZENROWS_API_KEY>&url=https%3A%2F%2Fscrapeme.live%2Fshop%2FPikachu%2F&js_render=true&css_extractor=%257B%2522product-price%2522%253A%2522%2523product-752%2520%253E%2520div.summary.entry-summary%2520%253E%2520p.price%2520%253E%2520span%2522%257D");
            var request = new RestRequest();
            var response = client.Get(request);
            Console.WriteLine(response.Content);
        }
    }
}
```
Run it, and you'll get the product's price.
```json
{
  "product-price": "£37.00"
}
```
That’s how easy it is to scrape with ZenRows!
2. PuppeteerSharp: Chrome DevTools Mastery
PuppeteerSharp is a .NET port of Puppeteer, the popular Node.js browser automation library. The tool extends the capabilities of headless Chrome to the C# ecosystem, enabling you to control Chrome/Chromium over the DevTools protocol using C#.
PuppeteerSharp can perform almost any action that you can manually execute in a browser. This includes:
- mouse movements,
- filling forms and automating their submissions,
- generating screenshots,
- rendering dynamic web pages,
- scraping single-page applications (SPA).
What sets PuppeteerSharp apart is its direct interaction with the Chrome DevTools protocol, which results in faster command execution.
Unfortunately, PuppeteerSharp uses pre-shipped Chromium binaries, which are easily detectable by websites’ protection systems. Combined with its automation properties, it puts your web scraper at a high risk of getting blocked.
👍 Pros:
- Interacts directly with the Chrome DevTools protocol.
- Can run multiple instances in parallel.
- Ports Puppeteer, which is maintained by the Chrome DevTools team.
- Easy to use.
- Open source.
- Doesn’t require additional setup.
- Enjoys a growing developer community.
- Captures screenshots and generates PDFs.
👎 Cons:
- Easy to detect.
- Limited browser support (Chrome/Chromium).
- Can get slow and resource-intensive, especially when running multiple instances in parallel.
⚙️ Features:
- Page interaction
- Block resources
- Screenshot and PDF generation
- Headless Chrome integration
- Automates form submission, UI testing, keyboard input, etc.
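To illustrate the screenshot and PDF features, here's a minimal sketch (a .NET 6+ top-level program; the output file names are arbitrary):

```csharp
using PuppeteerSharp;

// download the Chromium binaries if they aren't cached yet
using var browserFetcher = new BrowserFetcher();
await browserFetcher.DownloadAsync();

// launch headless Chromium and open the target page
await using var browser = await Puppeteer.LaunchAsync(new LaunchOptions { Headless = true });
await using var page = await browser.NewPageAsync();
await page.GoToAsync("https://scrapeme.live/shop/Pikachu/");

// capture a full-page screenshot and a PDF of the rendered page
await page.ScreenshotAsync("pikachu.png", new ScreenshotOptions { FullPage = true });
await page.PdfAsync("pikachu.pdf");
```

Note that PDF generation only works when the browser runs in headless mode.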
👨‍💻 Example: The example below shows how to scrape with C# using PuppeteerSharp. The script:
- Downloads and launches a headless browser instance.
- Navigates to the target website.
- Extracts the product price using JavaScript evaluation.
- Prints the extracted price before closing the browser.
```csharp
// import the required packages
using System;
using System.Threading.Tasks;
using PuppeteerSharp;

class Program
{
    static async Task Main(string[] args)
    {
        // download the Chromium binaries if needed
        using var browserFetcher = new BrowserFetcher();
        await browserFetcher.DownloadAsync();

        // launch a new headless browser instance
        await using var browser = await Puppeteer.LaunchAsync(
            new LaunchOptions { Headless = true });

        // open a new browser page
        await using var page = await browser.NewPageAsync();

        // navigate to the target website
        await page.GoToAsync("https://scrapeme.live/shop/Pikachu/");

        // extract the product price
        var productPrice = await page.EvaluateExpressionAsync<string>(
            "document.querySelector('#product-752 > div.summary.entry-summary > p.price > span').innerText");

        // print the product price
        Console.WriteLine("Product Price: " + productPrice);

        // close the browser when done
        await browser.CloseAsync();
    }
}
```
3. Selenium: The Web Automation Pioneer
Selenium has been around for over a decade and remains one of the most popular browser automation tools for C# web scraping.
It was originally developed to automate different browsers (Safari, Chrome, Firefox, Edge, and Internet Explorer) for testing purposes. However, its ability to simulate real user interactions extends its functionality to web scraping and any browser-driven tasks.
Selenium’s top advantage is its built-in tooling, such as Selenium IDE and Selenium Manager, and its WebDriver interface, which uses the same syntax across all supported browsers.
Selenium is open source, which has attracted a community of developers actively contributing to its maintenance and continuous development. Its open-source nature helps avoid licensing costs, but the library gets expensive when scaling up, since it’s resource-intensive and requires additional infrastructure overhead.
👍 Pros:
- Supports multiple browsers.
- Can run multiple instances in parallel.
- Easy identification of web elements using selectors.
- Supports multiple programming languages.
- Extensive documentation and large developer community.
- Multiple device testing.
- Continuous integration tools.
- Captures screenshots and generates PDFs.
👎 Cons:
- Requires WebDriver configuration and setup for specific browsers.
- Can get slow and resource-intensive, especially when running multiple instances in parallel.
- Its automation properties are easily detectable.
⚙️ Features:
- Page interaction
- Block resources
- Proxy support
- Multi-language compatibility
- Playback and record feature using the Selenium IDE
- Integrates with frameworks like Ant and Maven.
- Supports multiple operating systems.
- Automates form submission, UI testing, keyboard input, etc.
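One practical note: unlike Playwright, Selenium doesn't automatically wait for elements, so dynamic pages usually need an explicit wait. Here's a hedged sketch using `WebDriverWait` from the Selenium.Support package (a .NET 6+ top-level program; the shortened selector `p.price > span` is assumed to match the same price element targeted elsewhere in this article):

```csharp
using System;
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;
using OpenQA.Selenium.Support.UI;

// configure and launch headless Chrome
var options = new ChromeOptions();
options.AddArgument("--headless");
using var driver = new ChromeDriver(options);

driver.Navigate().GoToUrl("https://scrapeme.live/shop/Pikachu/");

// poll for up to 10 seconds until the price element is present in the DOM
var wait = new WebDriverWait(driver, TimeSpan.FromSeconds(10));
var price = wait.Until(d => d.FindElement(By.CssSelector("p.price > span")));

Console.WriteLine("Product Price: " + price.Text);
```

`WebDriverWait` retries the lambda until it returns an element (or times out), which makes scripts far less flaky on JavaScript-heavy pages.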
👨‍💻 Example: Let’s see a sample code snippet showing how to extract data using Selenium with C#. The Selenium code:
- Configures ChromeOptions for headless mode.
- Sets up a ChromeDriver.
- Navigates to a specified URL.
- Locates the element containing the product price.
- Extracts the text of the product price.
- Prints it.
```csharp
// import the required libraries
using System;
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;

public class Example
{
    static void Main(string[] args)
    {
        // set ChromeOptions to run in headless mode
        ChromeOptions options = new ChromeOptions();
        options.AddArgument("--headless");

        // set up ChromeDriver
        IWebDriver driver = new ChromeDriver(options);

        // navigate to the target URL
        string target_url = "https://scrapeme.live/shop/Pikachu/";
        driver.Navigate().GoToUrl(target_url);

        // find the element containing the product price
        IWebElement priceElement = driver.FindElement(By.CssSelector("#product-752 > div.summary.entry-summary > p.price > span"));

        // extract the text of the product price
        string productPrice = priceElement.Text;

        // print the extracted product price
        Console.WriteLine("Product Price: " + productPrice);

        // close the browser
        driver.Quit();
    }
}
```
4. Playwright: One API, Limitless Possibilities
Playwright is an open-source library for automating Chromium, Firefox, and WebKit using a single API.
With Playwright, you can perform all the basic browser tasks, such as navigating web pages, interacting with elements, filling out forms, and capturing screenshots. The library also supports more advanced features, such as intercepting network requests and automatically waiting for elements to be available before performing actions.
Playwright's browser context allows you to create multiple scrapers in isolation, simulating a new browser profile with no additional overhead. Additionally, you can persist state within each browser context. For example, you can log in once, save the authentication state, and reuse it for all tasks.
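Here's a sketch of that log-in-once pattern (a .NET 6+ top-level program; `https://example.com/login` and the omitted form-filling steps are placeholders for your actual site, and the Playwright browsers must already be installed):

```csharp
using Microsoft.Playwright;

using var playwright = await Playwright.CreateAsync();
await using var browser = await playwright.Chromium.LaunchAsync(new BrowserTypeLaunchOptions { Headless = true });

// first context: log in once, then save cookies and local storage to disk
var loginContext = await browser.NewContextAsync();
var loginPage = await loginContext.NewPageAsync();
await loginPage.GotoAsync("https://example.com/login"); // placeholder URL
// ... fill and submit the login form here ...
await loginContext.StorageStateAsync(new BrowserContextStorageStateOptions { Path = "state.json" });
await loginContext.CloseAsync();

// later contexts reuse the saved state, so no further logins are needed
var context = await browser.NewContextAsync(new BrowserNewContextOptions { StorageStatePath = "state.json" });
var page = await context.NewPageAsync();
await page.GotoAsync("https://example.com/account"); // placeholder URL
```

Each context behaves like a fresh, isolated browser profile, so you can run many of these in parallel against the same browser instance.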
👍 Pros:
- Supports multiple browsers.
- Automatically waits for elements before performing actions.
- Smart assertions that retry until the necessary conditions are met.
- Can run multiple instances in parallel.
- Captures videos and screenshots.
- Page interaction.
- HTML manipulation.
- Extensive documentation and growing developer community.
👎 Cons:
- Can only handle HTTP/HTTPS and browser-specific protocols.
- Limited support for browser extensions.
- The community is not as large as those of Selenium and Puppeteer.
- Can be easily detected by anti-bot measures.
⚙️ Features:
- Cross-browser
- Cross-platform
- Network interception
- State persistence for faster execution
- Page interaction
- Block resources
- Proxy support
- Support for multiple operating systems
- Automatic waits
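As an example of network interception, the sketch below blocks image requests before they leave the browser, which can noticeably speed up scraping (a .NET 6+ top-level program; the glob pattern is illustrative):

```csharp
using System;
using Microsoft.Playwright;

using var playwright = await Playwright.CreateAsync();
await using var browser = await playwright.Chromium.LaunchAsync(new BrowserTypeLaunchOptions { Headless = true });
var page = await browser.NewPageAsync();

// abort every request for common image formats
await page.RouteAsync("**/*.{png,jpg,jpeg,gif,svg}", route => route.AbortAsync());

await page.GotoAsync("https://scrapeme.live/shop/Pikachu/");

// the text content still loads, so the price can be extracted as usual
var price = await page.InnerTextAsync("p.price > span");
Console.WriteLine($"Product Price: {price}");
```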
👨‍💻 Example: Here’s an example of obtaining the target web page’s HTML and extracting the product’s price using Playwright with C#. Using Microsoft.Playwright to automate a headless Chromium browser, the code below:
- Creates a new page.
- Navigates to the specified URL.
- Extracts the product price using a CSS selector.
```csharp
// import the required libraries
using System;
using System.Threading.Tasks;
using Microsoft.Playwright;

class Program
{
    static async Task Main(string[] args)
    {
        // launch Playwright
        using var playwright = await Playwright.CreateAsync();

        // launch the Chromium browser in headless mode
        await using var browser = await playwright.Chromium.LaunchAsync(new BrowserTypeLaunchOptions { Headless = true });

        // create a new page
        var page = await browser.NewPageAsync();

        // navigate to the scrapeme website
        await page.GotoAsync("https://scrapeme.live/shop/Pikachu/");

        // extract the product price
        var priceElement = await page.QuerySelectorAsync("#product-752 > div.summary.entry-summary > p.price > span");
        var productPrice = await priceElement.InnerTextAsync();

        // print the product price
        Console.WriteLine($"Product Price: {productPrice}");

        // close the browser
        await browser.CloseAsync();
    }
}
```
Which Headless Browser Is the Fastest? Test and Benchmarks
Let's explore the speed and efficiency of each C# headless browser above to determine the fastest. The test scenario for this benchmark remains the same as for all the code snippets presented above. Each tool navigates to ScrapeMe, retrieves its HTML, and extracts the product's price.
Here are the speed test results:
Tool | Time Taken (seconds) |
---|---|
ZenRows | 3.056 |
PuppeteerSharp | 9.153 |
Selenium | 9.216 |
Playwright | 10.798 |
As anticipated, ZenRows proves its reliability as the fastest performer, completing the task in about 3 seconds. PuppeteerSharp and Selenium share a distant second place at just over 9 seconds each. Meanwhile, Playwright recorded the slowest time at approximately 11 seconds.
To benchmark these libraries, we used BenchmarkDotNet, a project that transforms methods into benchmarks.
The measurements were made on an AMD Ryzen 9 6900HX with Radeon Graphics (1 CPU, 16 logical and 8 physical cores) running .NET SDK 8.0.100-rc.2.23502.2. The relative performance is expected to be similar on other machine configurations.
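For reference, a minimal BenchmarkDotNet harness looks roughly like this (a sketch; the benchmark body is a placeholder where one of the scraping snippets above would go):

```csharp
using System.Threading.Tasks;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

public class ScraperBenchmark
{
    [Benchmark]
    public async Task ScrapePrice()
    {
        // placeholder: call one of the scraping snippets above here
        await Task.Delay(10);
    }
}

public class Program
{
    public static void Main(string[] args)
    {
        // requires a Release build; prints a summary table with mean times
        BenchmarkRunner.Run<ScraperBenchmark>();
    }
}
```

BenchmarkDotNet handles warm-up, iteration counts, and statistical outliers for you, which is why it's preferred over hand-rolled `Stopwatch` timing.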
Conclusion
C# headless browsers let you scrape dynamic web pages and single-page applications. In this article, you’ve seen four headless browsers ready to use with C#: ZenRows, PuppeteerSharp, Selenium, and Playwright.
While all the presented libraries have advantages for web scraping, it’s important to remember that headless browsers are prone to detection by websites’ anti-bot systems. To mitigate the challenge of getting blocked, you should use a complete web scraping solution, such as ZenRows.
In addition to full anti-block protection, ZenRows offers features useful for web scraping, such as CAPTCHA bypass, proxy rotation, and User Agent rotation. Take ZenRows for a spin with a free trial today!