More and more companies take advantage of data extracted from the web nowadays, and one of the most suitable programming languages for this purpose is C#. In this step-by-step tutorial, you'll see how to do web scraping in C# using libraries like Selenium and Html Agility Pack.
Let's get started!
Prerequisites
Set Up the Environment
Here are the prerequisites you need to meet to follow this C# scraping guide:
- .NET 8+: The most recent version of the .NET SDK will do. At the time of writing, this is 8.0.205.
- An IDE for coding in C#: Visual Studio 2022 Community Edition is a complete solution. If you prefer a lighter option, Visual Studio Code with the C# extension is perfect.
To save time, you can directly install the .NET Coding Pack. It includes Visual Studio Code with the essential .NET extensions and the .NET SDK. Otherwise, follow the links above to download the required tools.
You should now be all set to follow our C# web scraping tutorial.
First, though, let's verify that you installed .NET correctly. Launch a PowerShell window and run the command below:
dotnet --list-sdks
This should print the version of the .NET SDK installed on your machine:
8.0.205 [C:\Program Files\dotnet\sdk]
If you receive a 'dotnet' is not recognized as an internal or external command error, something went wrong. Restart your machine and try again. If the command still returns the same error, you'll need to reinstall .NET.
Initialize a C# Project
Let's create a .NET console application in Visual Studio Code. In case of problems, consult theย official guide.
First, create an empty folder called SimpleWebScraper for your C# project.
mkdir SimpleWebScraper
Now, launch Visual Studio Code and select "File > Open Folder..." from the top menu.
Select SimpleWebScraper and wait for Visual Studio Code to open the folder. Then, reach the terminal by selecting "View > Terminal" from the main menu.
In the Visual Studio Code terminal, launch the following command:
dotnet new console --framework net8.0
This will initialize a .NET 8.0 console project. Specifically, it will create a .csproj project file and a Program.cs C# file.
Now, replace the content of Program.cs with the code below.
namespace SimpleWebScraper
{
class Program
{
static void Main(string[] args)
{
Console.WriteLine("Hello, World!");
// scraping logic...
}
}
}
This is what a simple console script looks like in C#. Note that the Main() function will contain the C# data scraping logic.
Run the script by launching the command you see next:
dotnet run
This should print:
"Hello, World!"
Great, your initial C# script works as expected!
You're about to learn the basics of web scraping in C#.
How to Scrape a Website in C#
We'll learn how to build a data scraper with C# by extracting data from ScrapingCourse.com, a demo site with real e-commerce features dedicated to testing web scrapers. The C# spider will visit each of its paginated product pages and extract the data from every product.
This is what the target website looks like:
Let's install some dependencies and start scraping data from the web.
Step 1: Install Html Agility Pack and Its CSS Selector Extension
Html Agility Pack (HAP) is a powerful open-source .NET library for parsing HTML documents. It offers a flexible API for web scraping, allowing you to download an HTML page and parse it. You can also select HTML elements and extract data from them.
Install Html Agility Pack through the NuGet HtmlAgilityPack package:
dotnet add package HtmlAgilityPack
Although Html Agility Pack natively supports XPath and XSLT, these aren't the most popular approaches for selecting HTML elements from the DOM. Fortunately, there's the HtmlAgilityPack CSS Selector extension.
Install it via the NuGet HtmlAgilityPack.CssSelectors library:
dotnet add package HtmlAgilityPack.CssSelectors
HAP will now be able to understand CSS selectors via extension methods.
Now, import Html Agility Pack in your C# web spider by adding the following line at the top of your Program.cs file:
using HtmlAgilityPack;
If Visual Studio Code doesn't report errors, then you're good to go.
Time to see how to use HAP for web scraping in C#!
Step 2: Load the Target Web Page
Start by initializing an Html Agility Pack object.
var web = new HtmlWeb();
HtmlWeb gives you access to the web scraping capabilities offered by HAP.
Then, use HtmlWeb's Load() method to get the HTML from a URL:
// loading the target web page
var document = web.Load("https://www.scrapingcourse.com/ecommerce/");
Behind the scenes, HAP performs an HTTP GET request to download the web page and parses its HTML content. It raises an HtmlAgilityPack.HtmlWebException in case of error and returns an HAP HtmlDocument object if everything works as expected.
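If the download fails (for example, because the site is unreachable or returns an error status), you may want to guard the call. Here's an optional, minimal sketch of that error handling; the log message is just illustrative:
HtmlDocument document;
try
{
    // downloading and parsing the target web page
    document = web.Load("https://www.scrapingcourse.com/ecommerce/");
}
catch (HtmlWebException e)
{
    // the download or parsing step failed
    Console.WriteLine($"Failed to load the page: {e.Message}");
    return;
}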
You're now ready to use HtmlDocument to extract data from HTML elements. But first, let's study the code of the target page to define an effective strategy for selecting HTML elements.
Step 3: Inspect the Target Page
Explore the target web page to see how it's structured. We'll start with the target HTML nodes, which are the product elements. Right-click on one and access the browser DevTools by selecting the "Inspect" option:
Here, you can clearly see that a single li.product HTML element consists of the following four elements:
- The product URL in an a.
- The product image in an img.
- The product name in an h2.
- The product price in a span with the price class.
Inspect the other product elements, and you'll see they all share the same structure. Only the values stored in the underlying HTML elements change, which means you can scrape them all programmatically.
Next, we'll learn how to scrape data from these product HTML elements with HAP in C#.
Step 4: Extract Data From HTML Elements
You need to define a custom C# class to store the scraped data. For this purpose, initialize a nested Product class inside the Program class as follows:
public class Product
{
public string? Url { get; set; }
public string? Image { get; set; }
public string? Name { get; set; }
public string? Price { get; set; }
}
This custom class contains the Url, Image, Name, and Price fields. These match the data you want to scrape from every product.
Now, initialize a list of Product in your Main() function with the line below:
var products = new List<Product>();
This list will contain the scraped data stored in Product instances.
It's time to use HAP to extract the list of all li.product HTML elements from the DOM, like this:
// selecting all HTML product elements from the current page
var productHTMLElements = document.DocumentNode.QuerySelectorAll("li.product");
QuerySelectorAll() allows you to retrieve HTML nodes from the DOM with a CSS selector. Here, the method applies the li.product CSS selector to get all product elements. Specifically, QuerySelectorAll() returns a list of HAP HtmlNode objects.
Note that QuerySelectorAll() comes from the HAP CSS selector extension, so you won't find it in Html Agility Pack's original interface.
Use a foreach loop to iterate over the list of HTML elements and scrape data from each product:
// iterating over the list of product elements
foreach (var productHTMLElement in productHTMLElements)
{
// scraping the interesting data from the current HTML element
var url = HtmlEntity.DeEntitize(productHTMLElement.QuerySelector("a").Attributes["href"].Value);
var image = HtmlEntity.DeEntitize(productHTMLElement.QuerySelector("img").Attributes["src"].Value);
var name = HtmlEntity.DeEntitize(productHTMLElement.QuerySelector("h2").InnerText);
var price = HtmlEntity.DeEntitize(productHTMLElement.QuerySelector(".price").InnerText);
// instancing a new Product object
var product = new Product() { Url = url, Image = image, Name = name, Price = price };
// adding the object containing the scraped data to the list
products.Add(product);
}
Incredible! You just implemented C# web scraping logic!
The QuerySelector() method applies a CSS selector to the HtmlNode's child nodes to get a single one. Then, we select an HTML attribute from Attributes and extract its value with Value. Each value is wrapped with HtmlEntity.DeEntitize() to replace known HTML entities.
Again, note that QuerySelector() comes from the Html Agility Pack CSS Selector extension. You won't find that method in vanilla HAP.
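If you'd rather stick to vanilla HAP, you could achieve the same selection with the XPath methods it natively supports. This is just an equivalent sketch, not part of the tutorial's main flow:
// selecting all product elements with XPath instead of CSS selectors
var productNodes = document.DocumentNode.SelectNodes("//li[contains(@class, 'product')]");
foreach (var productNode in productNodes)
{
    // ".//" limits the search to the current node's descendants
    var url = HtmlEntity.DeEntitize(productNode.SelectSingleNode(".//a").Attributes["href"].Value);
    var name = HtmlEntity.DeEntitize(productNode.SelectSingleNode(".//h2").InnerText);
}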
Awesome! Time to learn how to export the scraped data in an easy-to-read format, such as CSV.
Step 5: Export the Scraped Data to CSV
You can convert scraped data to CSV with native C# functions, but a library will make it easier.
CsvHelper is a fast, flexible, and reliable .NET library for reading and writing CSV files.
Install it by adding the NuGet CsvHelper package to your project's dependencies with:
dotnet add package CsvHelper
Import it into your project by adding this line to the top of your Program.cs file:
using CsvHelper;
Convert the scraped data to a CSV output file with CsvHelper as below:
// initializing the CSV output file
using (var writer = new StreamWriter("products.csv"))
// initializing the CSV writer
using (var csv = new CsvWriter(writer, CultureInfo.InvariantCulture))
{
// populating the CSV file
csv.WriteRecords(products);
}
The snippet above initializes a products.csv file. Then, CsvHelper's WriteRecords() writes all the product records to that CSV file. Thanks to the C# using statements, the script automatically frees the resources associated with the writer objects.
Note that the CsvWriter constructor requires a CultureInfo parameter. This defines the formatting specs, as well as the delimiter and line-ending characters to use. InvariantCulture ensures that any software can parse the produced CSV regardless of the user's local settings.
To use CultureInfo values, you need the following extra import:
using System.Globalization;
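If you ever need a delimiter other than the culture's default, CsvHelper also accepts a CsvConfiguration object in place of the plain CultureInfo. Here's a minimal sketch; the semicolon delimiter is just an example:
// requires: using CsvHelper.Configuration;
var config = new CsvConfiguration(CultureInfo.InvariantCulture)
{
    // overriding the default delimiter
    Delimiter = ";"
};
using (var writer = new StreamWriter("products.csv"))
using (var csv = new CsvWriter(writer, config))
{
    // populating the CSV file with the custom configuration
    csv.WriteRecords(products);
}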
Fantastic! All that remains is to launch the C# web scraper!
Step 6: Launch the Scraper
This is what the Program.cs C# data scraper implemented so far looks like:
using HtmlAgilityPack;
using CsvHelper;
using System.Globalization;
namespace SimpleWebScraper
{
public class Program
{
// defining a custom class to store the scraped data
public class Product
{
public string? Url { get; set; }
public string? Image { get; set; }
public string? Name { get; set; }
public string? Price { get; set; }
}
public static void Main()
{
// creating the list that will keep the scraped data
var products = new List<Product>();
// creating the HAP object
var web = new HtmlWeb();
// visiting the target web page
var document = web.Load("https://www.scrapingcourse.com/ecommerce/");
// getting the list of HTML product nodes
var productHTMLElements = document.DocumentNode.QuerySelectorAll("li.product");
// iterating over the list of product HTML elements
foreach (var productHTMLElement in productHTMLElements)
{
// scraping logic
var url = HtmlEntity.DeEntitize(productHTMLElement.QuerySelector("a").Attributes["href"].Value);
var image = HtmlEntity.DeEntitize(productHTMLElement.QuerySelector("img").Attributes["src"].Value);
var name = HtmlEntity.DeEntitize(productHTMLElement.QuerySelector("h2").InnerText);
var price = HtmlEntity.DeEntitize(productHTMLElement.QuerySelector(".price").InnerText);
var product = new Product() { Url = url, Image = image, Name = name, Price = price };
products.Add(product);
}
// creating the CSV output file
using (var writer = new StreamWriter("products.csv"))
using (var csv = new CsvWriter(writer, CultureInfo.InvariantCulture))
{
// populating the CSV file
csv.WriteRecords(products);
}
}
}
}
Run the script with the command below:
dotnet run
It might take a while to complete, depending on the response time of the target page's server. Once it's done, you'll find a products.csv file in the root folder of your C# project. Open it to explore the scraped data.
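If you want to sanity-check the export from code, CsvHelper can also read the file back into Product instances. This is an optional sketch, not required by the tutorial:
// reading the CSV back to verify the export
using (var reader = new StreamReader("products.csv"))
using (var csv = new CsvReader(reader, CultureInfo.InvariantCulture))
{
    var records = csv.GetRecords<Product>().ToList();
    Console.WriteLine($"Exported {records.Count} products");
}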
Wow! In 50 lines of code, you built a fully functional C# data scraper!
Advanced Web Scraping in C#
Web scraping in C# is much more than the fundamentals you just saw. Now, you'll learn about more advanced techniques to help you become a C# scraping expert!
Web Crawling in .NET
Don't forget that ScrapingCourse.com shows a paginated list of products. To scrape all products, you need to visit the whole website, which is what web crawling is about.
To do web crawling in C#, you must follow all pagination links. Let's retrieve them all!
Inspect the pagination HTML element to understand how to extract the pages' URLs. Right-click on a page number and select "Inspect":
You should be able to see something like this in the browser DevTools:
Here, note that all pagination HTML elements share the page-numbers CSS class. In detail, only the a nodes contain a URL, while the span elements are placeholders. So, you can select all pagination links with the a.page-numbers CSS selector.
To avoid scraping a page twice, you'll need a couple of extra data structures:
- pagesDiscovered: A List to keep track of the URLs discovered by the crawler.
- pagesToScrape: A Queue containing the list of pages the spider will scrape soon.
Also, a limit variable will prevent the C# spider from crawling pages forever.
// the URL of the first pagination web page
var firstPageToScrape = "https://www.scrapingcourse.com/ecommerce/page/1/";
// the list of pages discovered during the crawling task
var pagesDiscovered = new List<string> { firstPageToScrape };
// the list of pages that remains to be scraped
var pagesToScrape = new Queue<string>();
// initializing the list with firstPageToScrape
pagesToScrape.Enqueue(firstPageToScrape);
// current crawling iteration
int i = 1;
// the maximum number of pages to scrape before stopping
int limit = 12;
// until there are no pages to scrape or limit is hit
while (pagesToScrape.Count != 0 && i <= limit)
{
// extracting the current page to scrape from the queue
var currentPage = pagesToScrape.Dequeue();
// loading the page
var currentDocument = web.Load(currentPage);
// selecting the list of pagination HTML elements
var paginationHTMLElements = currentDocument.DocumentNode.QuerySelectorAll("a.page-numbers");
// to avoid visiting a page twice
foreach (var paginationHTMLElement in paginationHTMLElements)
{
// extracting the current pagination URL
var newPaginationLink = paginationHTMLElement.Attributes["href"].Value;
// if the page discovered is new
if (!pagesDiscovered.Contains(newPaginationLink))
{
// if the page discovered needs to be scraped
if (!pagesToScrape.Contains(newPaginationLink))
{
pagesToScrape.Enqueue(newPaginationLink);
}
pagesDiscovered.Add(newPaginationLink);
}
}
// scraping logic...
// incrementing the crawling counter
i++;
}
The data crawler above does the following:
- Starts from the first page of the pagination list.
- Looks for new pagination URLs on the current page.
- Adds them to the scraping queue.
- Scrapes data from the current page.
- Repeats the previous four steps for each page in the queue until the queue is empty or it has visited the limit number of pages.
Since ScrapingCourse.com consists of 12 pages, set limit to 12 to scrape data from all products. In this case, products.csv will have a record for each of the 188 products.
Here's the complete code:
using HtmlAgilityPack;
using System.Globalization;
using CsvHelper;
namespace SimpleWebScraper
{
public class Program
{
// defining a custom class to store
// the scraped data
public class Product
{
public string? Url { get; set; }
public string? Image { get; set; }
public string? Name { get; set; }
public string? Price { get; set; }
}
public static void Main()
{
// initializing HAP
var web = new HtmlWeb();
// creating the list that will keep the scraped data
var products = new List<Product>();
// the URL of the first pagination web page
var firstPageToScrape = "https://www.scrapingcourse.com/ecommerce/page/1/";
// the list of pages discovered during the crawling task
var pagesDiscovered = new List<string> { firstPageToScrape };
// the list of pages that remains to be scraped
var pagesToScrape = new Queue<string>();
// initializing the list with firstPageToScrape
pagesToScrape.Enqueue(firstPageToScrape);
// current crawling iteration
int i = 1;
// the maximum number of pages to scrape before stopping
int limit = 12;
// until there is a page to scrape or limit is hit
while (pagesToScrape.Count != 0 && i <= limit)
{
// getting the current page to scrape from the queue
var currentPage = pagesToScrape.Dequeue();
// loading the page
var currentDocument = web.Load(currentPage);
// selecting the list of pagination HTML elements
var paginationHTMLElements = currentDocument.DocumentNode.QuerySelectorAll("a.page-numbers");
// to avoid visiting a page twice
foreach (var paginationHTMLElement in paginationHTMLElements)
{
// extracting the current pagination URL
var newPaginationLink = paginationHTMLElement.Attributes["href"].Value;
// if the page discovered is new
if (!pagesDiscovered.Contains(newPaginationLink))
{
// if the page discovered needs to be scraped
if (!pagesToScrape.Contains(newPaginationLink))
{
pagesToScrape.Enqueue(newPaginationLink);
}
pagesDiscovered.Add(newPaginationLink);
}
}
// getting the list of HTML product nodes
var productHTMLElements = currentDocument.DocumentNode.QuerySelectorAll("li.product");
// iterating over the list of product HTML elements
foreach (var productHTMLElement in productHTMLElements)
{
// scraping logic
var url = HtmlEntity.DeEntitize(productHTMLElement.QuerySelector("a").Attributes["href"].Value);
var image = HtmlEntity.DeEntitize(productHTMLElement.QuerySelector("img").Attributes["src"].Value);
var name = HtmlEntity.DeEntitize(productHTMLElement.QuerySelector("h2").InnerText);
var price = HtmlEntity.DeEntitize(productHTMLElement.QuerySelector(".price").InnerText);
var product = new Product() { Url = url, Image = image, Name = name, Price = price };
products.Add(product);
}
// incrementing the crawling counter
i++;
}
// opening the CSV stream writer
using (var writer = new StreamWriter("products.csv"))
using (var csv = new CsvWriter(writer, CultureInfo.InvariantCulture))
{
// populating the CSV file
csv.WriteRecords(products);
}
}
}
}
Way to go! You're now able to build a web scraping C# app that can scrape a complete website!
Avoid Being Blocked
Your data scraper in C# may fail due to the several anti-scraping mechanisms websites might adopt. There are many anti-scraping techniques your script should be ready for. Use ZenRows to easily get around them!
The most basic technique is to block HTTP requests based on the value of their headers. This generally happens when the requests use an invalid User-Agent value.
The User-Agent header contains information that identifies where the request comes from. Typically, the accepted values refer to popular browsers and operating systems. Scraping libraries tend to use placeholder User-Agent strings that can easily expose your spider.
You can globally set a valid User-Agent in Html Agility Pack with the line below:
// setting a global User-Agent header in HAP
web.UserAgent = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36";
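If you need to customize more than the User-Agent, HtmlWeb also exposes a PreRequest callback that runs before every request it sends. Here's an optional sketch; the extra Accept-Language header is just an example:
// tweaking every outgoing request before HAP sends it
web.PreRequest = request =>
{
    // adding an extra header as an example
    request.Headers.Add("Accept-Language", "en-US,en;q=0.9");
    // returning true tells HAP to proceed with the request
    return true;
};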
The final code looks like this after adding the User Agent:
using HtmlAgilityPack;
using System.Globalization;
using CsvHelper;
namespace SimpleWebScraper
{
public class Program
{
// defining a custom class to store
// the scraped data
public class Product
{
public string? Url { get; set; }
public string? Image { get; set; }
public string? Name { get; set; }
public string? Price { get; set; }
}
public static void Main()
{
// initializing HAP
var web = new HtmlWeb();
// setting a global User-Agent header
web.UserAgent = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36";
// creating the list that will keep the scraped data
var products = new List<Product>();
// the URL of the first pagination web page
var firstPageToScrape = "https://www.scrapingcourse.com/ecommerce/page/1/";
// the list of pages discovered during the crawling task
var pagesDiscovered = new List<string> { firstPageToScrape };
// the list of pages that remains to be scraped
var pagesToScrape = new Queue<string>();
// initializing the list with firstPageToScrape
pagesToScrape.Enqueue(firstPageToScrape);
// current crawling iteration
int i = 1;
// the maximum number of pages to scrape before stopping
int limit = 12;
// until there is a page to scrape or limit is hit
while (pagesToScrape.Count != 0 && i <= limit)
{
// getting the current page to scrape from the queue
var currentPage = pagesToScrape.Dequeue();
// loading the page
var currentDocument = web.Load(currentPage);
// selecting the list of pagination HTML elements
var paginationHTMLElements = currentDocument.DocumentNode.QuerySelectorAll("a.page-numbers");
// to avoid visiting a page twice
foreach (var paginationHTMLElement in paginationHTMLElements)
{
// extracting the current pagination URL
var newPaginationLink = paginationHTMLElement.Attributes["href"].Value;
// if the page discovered is new
if (!pagesDiscovered.Contains(newPaginationLink))
{
// if the page discovered needs to be scraped
if (!pagesToScrape.Contains(newPaginationLink))
{
pagesToScrape.Enqueue(newPaginationLink);
}
pagesDiscovered.Add(newPaginationLink);
}
}
// getting the list of HTML product nodes
var productHTMLElements = currentDocument.DocumentNode.QuerySelectorAll("li.product");
// iterating over the list of product HTML elements
foreach (var productHTMLElement in productHTMLElements)
{
// scraping logic
var url = HtmlEntity.DeEntitize(productHTMLElement.QuerySelector("a").Attributes["href"].Value);
var image = HtmlEntity.DeEntitize(productHTMLElement.QuerySelector("img").Attributes["src"].Value);
var name = HtmlEntity.DeEntitize(productHTMLElement.QuerySelector("h2").InnerText);
var price = HtmlEntity.DeEntitize(productHTMLElement.QuerySelector(".price").InnerText);
var product = new Product() { Url = url, Image = image, Name = name, Price = price };
products.Add(product);
}
// incrementing the crawling counter
i++;
}
// opening the CSV stream writer
using (var writer = new StreamWriter("products.csv"))
using (var csv = new CsvWriter(writer, CultureInfo.InvariantCulture))
{
// populating the CSV file
csv.WriteRecords(products);
}
}
}
}
Wonderful! Less than 100 lines of code are enough to build a web scraper in C#! Now, all HTTP requests performed by HAP will seem to come from Chrome 124.
Parallel Web Scraping in C#
The performance of web scraping with C# depends on the target web server's speed. Tackle this by making parallel requests and scraping pages simultaneously. Avoid dead time and take the speed of your scraper to the next level: that's what parallel web scraping in C# is about!
Store the list of all pages your C# data crawler should visit in a ConcurrentBag:
var pagesToScrape = new ConcurrentBag<string> {
"https://www.scrapingcourse.com/ecommerce/page/1/",
"https://www.scrapingcourse.com/ecommerce/page/2/",
"https://www.scrapingcourse.com/ecommerce/page/3/",
// ...
"https://www.scrapingcourse.com/ecommerce/page/11/",
"https://www.scrapingcourse.com/ecommerce/page/12/"
};
In C#, List isn't thread-safe, so you shouldn't use it for parallel tasks. Replace it with ConcurrentBag, its unordered, thread-safe alternative.
For the same reason, make products a ConcurrentBag:
var products = new ConcurrentBag<Product>();
Let's perform parallel web scraping with C#! Use Parallel.ForEach() to run a foreach loop in parallel and scrape several pages at the same time:
// the import required to use ConcurrentBag
using System.Collections.Concurrent;
// ...
Parallel.ForEach(
pagesToScrape,
// limiting the parallelization level to 4 pages at a time
new ParallelOptions { MaxDegreeOfParallelism = 4 },
currentPage => {
// visiting the current page of the loop
var currentDocument = web.Load(currentPage);
// complete scraping logic
var productHTMLElements = currentDocument.DocumentNode.QuerySelectorAll("li.product");
foreach (var productHTMLElement in productHTMLElements)
{
var url = HtmlEntity.DeEntitize(productHTMLElement.QuerySelector("a").Attributes["href"].Value);
var image = HtmlEntity.DeEntitize(productHTMLElement.QuerySelector("img").Attributes["src"].Value);
var name = HtmlEntity.DeEntitize(productHTMLElement.QuerySelector("h2").InnerText);
var price = HtmlEntity.DeEntitize(productHTMLElement.QuerySelector(".price").InnerText);
var product = new Product() { Url = url, Image = image, Name = name, Price = price };
// storing the scraped product data in parallel
products.Add(product);
}
}
);
Great! Your web scraper in C# is now lightning-fast! But don't forget to limit the level of parallelization to avoid stressing the server. Your goal is to extract data from a website, not to perform a DoS attack.
The snippet above shows how to achieve parallel scraping in C#. Take a look at the entire parallel C# data spider below:
using HtmlAgilityPack;
using CsvHelper;
using System.Globalization;
using System.Collections.Concurrent;
namespace SimpleWebScraper
{
public class Program
{
public class Product
{
public string? Url { get; set; }
public string? Image { get; set; }
public string? Name { get; set; }
public string? Price { get; set; }
}
public static void Main()
{
// initializing HAP
var web = new HtmlWeb();
// this can't be a List because it's not thread-safe
var products = new ConcurrentBag<Product>();
// the complete list of pages to scrape
var pagesToScrape = new ConcurrentBag<string> {
"https://www.scrapingcourse.com/ecommerce/page/1/",
"https://www.scrapingcourse.com/ecommerce/page/2/",
// ...
"https://www.scrapingcourse.com/ecommerce/page/12/"
};
// performing parallel web scraping
Parallel.ForEach(
pagesToScrape,
new ParallelOptions { MaxDegreeOfParallelism = 4 },
currentPage =>
{
var currentDocument = web.Load(currentPage);
var productHTMLElements = currentDocument.DocumentNode.QuerySelectorAll("li.product");
foreach (var productHTMLElement in productHTMLElements)
{
var url = HtmlEntity.DeEntitize(productHTMLElement.QuerySelector("a").Attributes["href"].Value);
var image = HtmlEntity.DeEntitize(productHTMLElement.QuerySelector("img").Attributes["src"].Value);
var name = HtmlEntity.DeEntitize(productHTMLElement.QuerySelector("h2").InnerText);
var price = HtmlEntity.DeEntitize(productHTMLElement.QuerySelector(".price").InnerText);
var product = new Product() { Url = url, Image = image, Name = name, Price = price };
products.Add(product);
}
}
);
// exporting to CSV
using (var writer = new StreamWriter("products.csv"))
using (var csv = new CsvWriter(writer, CultureInfo.InvariantCulture))
{
csv.WriteRecords(products);
}
}
}
}
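One caveat: HtmlWeb isn't documented as thread-safe, as it keeps per-request state (such as the last status code) on the instance. If you run into odd results, a defensive option is to create a local HtmlWeb inside each parallel iteration. A sketch of that variant:
Parallel.ForEach(
    pagesToScrape,
    new ParallelOptions { MaxDegreeOfParallelism = 4 },
    currentPage =>
    {
        // one HtmlWeb instance per iteration avoids sharing request state across threads
        var localWeb = new HtmlWeb();
        var currentDocument = localWeb.Load(currentPage);
        // ...same scraping logic as above
    }
);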
Scraping a Dynamic-Content Website with a Headless Browser in C#
Static-content sites have all their content embedded in the HTML pages returned by the server. This makes them an easy scraping target for any HTML parsing library.
Dynamic-content websites rely on JavaScript to dynamically retrieve or render all or part of their content. Scraping them requires a tool that can execute JavaScript, like a headless browser. If you're not familiar with the term, a headless browser is a programmable browser with no GUI.
With more than 65 million downloads, the most used headless browser library for C# is Selenium. Install it via the Selenium.WebDriver NuGet package:
dotnet add package Selenium.WebDriver
Use Selenium in headless mode to scrape data from ScrapingCourse.com with the following logic:
using CsvHelper;
using System.Globalization;
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;
namespace SimpleWebScraper
{
public class Program
{
public class Product
{
public string? Url { get; set; }
public string? Image { get; set; }
public string? Name { get; set; }
public string? Price { get; set; }
}
public static void Main()
{
var products = new List<Product>();
// to open Chrome in headless mode
var chromeOptions = new ChromeOptions();
chromeOptions.AddArguments("headless");
// starting a Selenium instance
using (var driver = new ChromeDriver(chromeOptions))
{
// navigating to the target page in the browser
driver.Navigate().GoToUrl("https://www.scrapingcourse.com/ecommerce/");
// getting the HTML product elements
var productHTMLElements = driver.FindElements(By.CssSelector("li.product"));
// iterating over them to scrape the data of interest
foreach (var productHTMLElement in productHTMLElements)
{
// scraping logic
var url = productHTMLElement.FindElement(By.CssSelector("a")).GetAttribute("href");
var image = productHTMLElement.FindElement(By.CssSelector("img")).GetAttribute("src");
var name = productHTMLElement.FindElement(By.CssSelector("h2")).Text;
var price = productHTMLElement.FindElement(By.CssSelector(".price")).Text;
var product = new Product() { Url = url, Image = image, Name = name, Price = price };
products.Add(product);
}
}
// export logic
using (var writer = new StreamWriter("products.csv"))
using (var csv = new CsvWriter(writer, CultureInfo.InvariantCulture))
{
csv.WriteRecords(products);
}
}
}
}
The Selenium FindElements() method instructs the browser to look for HTML nodes. Thanks to it, you can select the product HTML elements via a CSS selector. Then, iterate over them in a foreach loop, applying GetAttribute() and Text to extract the data of interest.
Scraping a website in C# with HAP or Selenium is about the same, code-wise. The difference is in how they run the scraping logic: HAP parses HTML pages to extract data from them, while Selenium executes the scraping statements in a headless browser.
Thanks to Selenium, you can crawl dynamic-content websites and interact with web pages in a browser as a real user would. This also means your script is less likely to be detected as a bot, since Selenium makes it easier to scrape a web page without getting blocked.
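Keep in mind that JavaScript-rendered elements may not be in the DOM right after navigation. A simple option is an implicit wait, which tells Selenium to retry element lookups for a while before giving up. A minimal sketch (the 10-second timeout is arbitrary), to be placed right after creating the ChromeDriver:
// making every FindElement()/FindElements() call retry for up to 10 seconds
driver.Manage().Timeouts().ImplicitWait = TimeSpan.FromSeconds(10);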
Html Agility Pack doesn't come with browser functionality, so you can only use it to scrape static-content websites. On the other hand, it doesn't carry the resource overhead of running a browser, as Selenium does.
Other Web Scraping Libraries in C#
Other tools to consider when it comes to web scraping with C# are:
- ZenRows: A fully-featured, easy-to-use API for extracting data from web pages. ZenRows automatically bypasses any anti-bot or anti-scraping system. Plus, it comes with rotating proxies, headless browser functionality, and a 99% uptime guarantee.
- Puppeteer Sharp: The .NET port of the popular Puppeteer Node.js library. With it, you can instruct a headless Chromium browser to perform testing and scraping.
- AngleSharp: An open-source .NET library for parsing and manipulating XML and HTML. It allows you to extract data from a website and select HTML elements via CSS selectors.
This was a short reminder that there are other useful tools for data scraping with C#. Read our guide on the best C# web scraping libraries.
Conclusion
Our step-by-step tutorial covered everything you need to know about web scraping in C#. First, we learned the basics and then tackled the most advanced C# web scraping concepts.
As a recap, you now know:
- How to do basic web scraping in C# with Html Agility Pack.
- How to scrape an entire website through web crawling.
- When you need to use a C# headless browser solution.
- How to extract data from dynamic-content websites with Selenium.
Web data scraping in C# is challenging due to the many anti-scraping technologies websites now use. Bypassing them all isn't easy, and you always need to find a workaround. Avoid all this with a complete C# web scraping API, like ZenRows. Thanks to it, you can perform data scraping via API calls and forget about anti-bot protections.
Frequent Questions
How Do You Scrape Data From a Website in C#?
Scraping data from the web in C# works as in other programming languages: with a C# web scraping library, you can connect to the desired website, select HTML elements from its DOM, and retrieve data.
Is C# Good for Web Scraping?
Yes, it is! C# is a general-purpose programming language that enables you to do web scraping. C# has a large and active community that developed many libraries to help you achieve your scraping goals.
What Is the Best Way to Scrape With C#?
Using one of the many NuGet libraries for scraping in C# makes everything easier. Some of the most popular C# libraries to support your data crawling project are Selenium, ScrapySharp, and Html Agility Pack.