PuppeteerSharp Proxy: How to Get and Set It in 2024

November 9, 2023 · 6 min read

One of the most common challenges when web scraping in C# is getting blocked. Websites often implement various techniques to regulate bot traffic and deny access to web scrapers. However, the good news is you can overcome this issue by using a proxy with PuppeteerSharp.

Proxies act as intermediaries between you and the target website, enabling you to make requests from different devices and geographical regions. You'll learn how to get and configure a PuppeteerSharp proxy in this tutorial.

How to Set a Proxy with PuppeteerSharp

Here's the step-by-step process of setting up a proxy with PuppeteerSharp. You'll also learn how to rotate multiple proxies to increase your chances of avoiding detection.

Step 1: Get Started with PuppeteerSharp

Let's begin with a basic PuppeteerSharp scraper that makes an HTTP request to a target website.

The following script launches a headless browser, creates a new page, navigates to the target URL (httpbin, an API that returns the web client's IP address), retrieves the page content, and prints it to the console.

scraper.cs
using PuppeteerSharp;
using System;
using System.Threading.Tasks;
 
class Program
{
    static async Task Main(string[] args)
    {
        using var browserFetcher = new BrowserFetcher();
        await browserFetcher.DownloadAsync();
 
        await using var browser = await Puppeteer.LaunchAsync(
            new LaunchOptions { Headless = true });
 
        await using var page = await browser.NewPageAsync();
 
        await page.GoToAsync("https://httpbin.io/ip");
 
        // Get the content of the page
        var pageContent = await page.GetContentAsync();
 
        // Print the page content
        Console.WriteLine(pageContent);
 
        // Close the browser when done
        await browser.CloseAsync();
    }
}

Remark: The code above assumes you've created a console application project and installed PuppeteerSharp (for example, with dotnet add package PuppeteerSharp).

Run it, and the result of the request above should be your IP address.

Terminal
{
  "origin": "107.010.84.20"
}
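If you want to use the IP address in code rather than just print it, you can parse the response. The following is a minimal sketch, assuming the page content contains a JSON object like the one above, possibly wrapped in HTML by the browser. IpResponse and TryExtractOrigin are hypothetical names introduced for illustration, not part of PuppeteerSharp.

```csharp
using System;
using System.Text.Json;

static class IpResponse
{
    // Hypothetical helper: finds the JSON object inside the page content
    // and reads its "origin" field. Returns false if nothing usable is found.
    public static bool TryExtractOrigin(string pageContent, out string origin)
    {
        origin = "";

        // The browser may wrap the JSON in HTML, so keep only the {...} part.
        int start = pageContent.IndexOf('{');
        int end = pageContent.LastIndexOf('}');
        if (start < 0 || end <= start) return false;

        try
        {
            using var doc = JsonDocument.Parse(pageContent.Substring(start, end - start + 1));
            if (doc.RootElement.TryGetProperty("origin", out var value)
                && value.ValueKind == JsonValueKind.String)
            {
                origin = value.GetString() ?? "";
                return origin.Length > 0;
            }
            return false;
        }
        catch (JsonException)
        {
            // The extracted text was not valid JSON.
            return false;
        }
    }
}
```

In the script above, you could call IpResponse.TryExtractOrigin(pageContent, out var ip) right after GetContentAsync and print ip instead of the raw content.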

Let's use a proxy next.


Step 2: Set a PuppeteerSharp Proxy

To follow along in this step, you need a proxy, and you can grab a free one from FreeProxyList. We recommend using HTTPS proxies because they work for both HTTP and HTTPS requests.

To configure a PuppeteerSharp proxy, you pass your proxy details to the browser as a command-line argument. The LaunchAsync method accepts a LaunchOptions object whose Args property (an array of strings) holds those arguments, including the --proxy-server flag.

scraper.cs
await using var browser = await Puppeteer.LaunchAsync(new LaunchOptions
{
    Headless = true,
    Args = new[] { "--proxy-server=proxy_ip:proxy_port" }
});

Now, replace proxy_ip:proxy_port with your proxy details (in our case, 8.219.97.248:80, but you need to pick a fresh one) and add the option to the basic script we created earlier. You'll have the following complete code.

scraper.cs
using PuppeteerSharp;
using System;
using System.Threading.Tasks;
 
class Program
{
    static async Task Main(string[] args)
    {
        // Initialize a browser fetcher to download PuppeteerSharp binaries
        using var browserFetcher = new BrowserFetcher();
        await browserFetcher.DownloadAsync();
 
        // Launch a headless browser instance with specified options
        await using var browser = await Puppeteer.LaunchAsync(new LaunchOptions
        {
            Headless = true, // Run the browser in headless mode (no GUI)
            Args = new[] { "--proxy-server=8.219.97.248:80" } // Configure the proxy server
        });
 
        // Create a new web page
        await using var page = await browser.NewPageAsync();
 
        // Navigate to target URL using the configured proxy (no proxy authentication in this code)
        await page.GoToAsync("https://httpbin.io/ip");
 
        // Retrieve the content of the web page
        var pageContent = await page.GetContentAsync();
 
        // Print the page content (in this case, the IP address)
        Console.WriteLine(pageContent);
 
        // Close the browser when finished
        await browser.CloseAsync();
    }
}

Run the script, and your response should be your proxy's IP address. 

Terminal
{
  "origin": "8.219.97.248:52913"
}

Congrats! You've configured your first PuppeteerSharp proxy.
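To check in code that the request really went through the proxy, you can compare the reported origin with the proxy you configured. The sketch below assumes IPv4 addresses or hostnames, where the origin may carry a port as in the output above. ProxyCheck and its methods are hypothetical names, and some proxies exit through a different IP than the one you connect to, so a mismatch isn't always an error.

```csharp
using System;

static class ProxyCheck
{
    // Returns the host part of "host:port", or the input unchanged if no port is present.
    // Assumption: IPv4 or hostname only (IPv6 literals contain colons and are not handled).
    public static string HostOf(string address)
    {
        int colon = address.LastIndexOf(':');
        return colon < 0 ? address : address.Substring(0, colon);
    }

    // True when the origin reported by the target site has the same host as the proxy.
    public static bool OriginMatchesProxy(string origin, string proxyAddress)
    {
        // Drop a scheme such as "http://" if the proxy was written as a URL.
        int scheme = proxyAddress.IndexOf("://", StringComparison.Ordinal);
        if (scheme >= 0) proxyAddress = proxyAddress.Substring(scheme + 3);

        return string.Equals(
            HostOf(origin.Trim()),
            HostOf(proxyAddress.Trim()),
            StringComparison.OrdinalIgnoreCase);
    }
}
```

For example, ProxyCheck.OriginMatchesProxy("8.219.97.248:52913", "8.219.97.248:80") compares only the host parts and ignores the differing ports.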

That said, it's worth noting that free proxies are unreliable, and real-world use cases mostly demand premium web scraping proxies, which often require additional configuration. Let's see how to implement such proxies in PuppeteerSharp.

Step 3: Do Proxy Authentication with PuppeteerSharp

Premium proxies often require you to provide valid credentials, such as a username and password, to use their service. Proxy providers rely on this for security and access control.

To authenticate a proxy with PuppeteerSharp, pass the credentials to the page's AuthenticateAsync method before navigating to the target URL.

scraper.cs
await page.AuthenticateAsync(new Credentials { Username = "<username>", Password = "<password>" });

So, if the proxy in step 2 were premium, you'd authenticate it by modifying your code to include the credentials using the AuthenticateAsync method, like in the code below.

scraper.cs
using PuppeteerSharp;
using System;
using System.Threading.Tasks;
 
class Program
{
    static async Task Main(string[] args)
    {
        // Initialize a browser fetcher to download PuppeteerSharp binaries
        using var browserFetcher = new BrowserFetcher();
        await browserFetcher.DownloadAsync();
 
        // Launch a headless browser instance with specified options
        await using var browser = await Puppeteer.LaunchAsync(new LaunchOptions
        {
            Headless = true, // Run the browser in headless mode (no GUI)
            Args = new[] { "--proxy-server=8.219.97.248:80" } // Configure the proxy server
        });
 
        // Create a new web page 
        await using var page = await browser.NewPageAsync();
 
        // Authenticate with the proxy server using the proxy credentials
        await page.AuthenticateAsync(new Credentials { Username = "<username>", Password = "<password>" });
 
        // Navigate to target URL using the configured proxy
        await page.GoToAsync("https://httpbin.io/ip");
 
        // Retrieve the content of the web page
        var pageContent = await page.GetContentAsync();
 
        // Print the page content (in this case, the IP address)
        Console.WriteLine(pageContent);
 
        // Close the browser when finished
        await browser.CloseAsync();
    }
}
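Providers often hand out an authenticated proxy as a single URL of the form scheme://username:password@host:port. Since the tutorial passes the address through --proxy-server and the credentials through AuthenticateAsync, it can help to split such a URL first. This is a minimal sketch using .NET's System.Uri. ProxyUrl.Split is a hypothetical helper, and the URL format is an assumption about your provider.

```csharp
using System;

static class ProxyUrl
{
    // Hypothetical helper: splits "scheme://user:pass@host:port" into the value for
    // --proxy-server and the username/password for AuthenticateAsync.
    public static (string Server, string Username, string Password) Split(string proxyUrl)
    {
        var uri = new Uri(proxyUrl);

        // Address without credentials, suitable for the --proxy-server argument.
        string server = $"{uri.Scheme}://{uri.Host}:{uri.Port}";

        // UserInfo is "user:pass" (percent-encoded); split on the first colon only.
        string[] parts = uri.UserInfo.Split(new[] { ':' }, 2);
        string username = Uri.UnescapeDataString(parts[0]);
        string password = parts.Length > 1 ? Uri.UnescapeDataString(parts[1]) : "";

        return (server, username, password);
    }
}
```

With this in place, Server would go into the --proxy-server argument, and Username and Password into the Credentials object passed to AuthenticateAsync.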

Step 4: Build a Proxy Rotator

Too many requests from the same IP address are easily flagged as suspicious activity, and you can get blocked. However, you can avoid that by rotating through multiple proxies. This way, you distribute requests across different IP addresses, reducing the number of requests from any single one.

To build a proxy rotator in PuppeteerSharp, first, you need a proxy list, from which you'll randomly select one for each request. You can grab a few from FreeProxyList.

Start by defining your proxy list. 

scraper.cs
using PuppeteerSharp;
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
 
class Program
{
    static async Task Main(string[] args)
    {
        // Define a list of proxy addresses
        var proxies = new List<string>
        {
            "http://34.140.70.242:8080",
            "http://118.69.111.51:8080",
            "http://15.204.161.192:18080",
            "http://186.121.235.66:8080",
        };
//..
 
}

Next, generate a random index and use it to select a proxy from your proxy list.

scraper.cs
//..
 
    // Generate a random index
    var random = new Random();
    int randomIndex = random.Next(proxies.Count);
 
    // Select a random proxy using randomIndex
    string randomProxy = proxies[randomIndex];

After that, create a new PuppeteerSharp browser instance, passing the randomly selected proxy in the Args property of LaunchOptions.

scraper.cs
//..
 
    // Launch browser instance with randomProxy
    var browserFetcher = new BrowserFetcher();
    await browserFetcher.DownloadAsync();
    var browser = await Puppeteer.LaunchAsync(new LaunchOptions
    {
        Headless = true,
        Args = new[] { $"--proxy-server={randomProxy}" } 
    });

Lastly, update the basic request created earlier with the code blocks above, and you'll have the following complete code. 

scraper.cs
using PuppeteerSharp;
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
 
class Program
{
    static async Task Main(string[] args)
    {
        // Define a list of proxy addresses
        var proxies = new List<string>
        {
            "http://34.140.70.242:8080",
            "http://118.69.111.51:8080",
            "http://15.204.161.192:18080",
            "http://186.121.235.66:8080",
        };
 
        // Generate a random index
        var random = new Random();
        int randomIndex = random.Next(proxies.Count);
 
        // Select a random proxy using the randomIndex
        string randomProxy = proxies[randomIndex];
 
        // Launch browser instance with randomProxy
        var browserFetcher = new BrowserFetcher();
        await browserFetcher.DownloadAsync();
        var browser = await Puppeteer.LaunchAsync(new LaunchOptions
        {
            Headless = true,
            Args = new[] { $"--proxy-server={randomProxy}" }
        });
 
        // Create a new page
        await using var page = await browser.NewPageAsync();
 
        // Navigate to target URL
        await page.GoToAsync("https://httpbin.io/ip");
 
        // Retrieve page content
        var pageContent = await page.GetContentAsync();
        Console.WriteLine(pageContent);
 
        // Close the browser
        await browser.CloseAsync();
    }
}

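To verify it works, run the script several times. Because each run picks a proxy at random, the reported IP address should change between runs, although the same proxy can occasionally be picked twice in a row. Here's the result for two requests: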

Terminal
{
  "origin": "34.140.70.242"
}
 
//..
 
{
  "origin": "186.121.235.66"
}

Awesome! You can now easily rotate proxies in PuppeteerSharp.
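Random selection can pick the same proxy twice in a row. If you'd rather cycle through the list in order, a round-robin rotator is one alternative. The sketch below is not part of PuppeteerSharp; RoundRobinProxies is a hypothetical class that only decides which proxy string to use next.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

sealed class RoundRobinProxies
{
    private readonly IReadOnlyList<string> _proxies;
    private int _counter = -1; // incremented before each use, so the first index is 0

    public RoundRobinProxies(IReadOnlyList<string> proxies)
    {
        if (proxies == null || proxies.Count == 0)
            throw new ArgumentException("At least one proxy is required.", nameof(proxies));
        _proxies = proxies;
    }

    // Returns the next proxy in order, wrapping around at the end of the list.
    public string Next()
    {
        // Interlocked keeps the counter consistent if several tasks call Next() at once.
        // The cast to uint keeps the index valid even after the int counter overflows.
        uint ticket = unchecked((uint)Interlocked.Increment(ref _counter));
        return _proxies[(int)(ticket % (uint)_proxies.Count)];
    }
}
```

You would create one RoundRobinProxies from the proxies list and use rotator.Next() in place of proxies[randomIndex] each time you launch a browser.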

However, we only used free proxies to show you the basics. As mentioned before, they're unreliable and easily detected by websites. Keep reading for a better-performing solution.
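Since free proxies fail often, one option is to fall back to the next proxy when a request throws. The following is a minimal sketch under that assumption. ProxyFallback is a hypothetical helper, and the fetchThroughProxy delegate stands in for whatever code launches the browser with a given proxy and returns the page content.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

static class ProxyFallback
{
    // Tries each proxy in turn and returns the first successful result.
    // Throws AggregateException with every failure if all proxies fail.
    public static async Task<string> FetchWithFallbackAsync(
        IEnumerable<string> proxies,
        Func<string, Task<string>> fetchThroughProxy)
    {
        var errors = new List<Exception>();

        foreach (var proxy in proxies)
        {
            try
            {
                return await fetchThroughProxy(proxy);
            }
            catch (Exception ex)
            {
                // Catching everything keeps the sketch short; real code would
                // likely catch only navigation and timeout errors.
                errors.Add(ex);
            }
        }

        throw new AggregateException("All proxies failed.", errors);
    }
}
```

In the rotator above, the delegate would wrap the LaunchAsync, GoToAsync, and GetContentAsync calls, taking the proxy string as its argument.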

Step 5: The Way to Rotate Proxies in a Real Scenario

In the previous step, we mentioned that free proxies are unreliable and easily detected. Let's see how they fare in a real-world example (scraping an actual website). In this example, we'll try to scrape G2, a Cloudflare-protected website.

So, replace the target URL in step 4 with https://www.g2.com/. Run your script, and you'll get a block page like the one below.

Terminal
<!DOCTYPE html><html class="no-js" lang="en-US"><!--<![endif]--><head>
<title>Attention Required! | Cloudflare</title>
 
</head>
<body>
      <div>
        <h1 ...>Sorry, you have been blocked</h1>
        <h2 ...><span data-translate="unable_to_access">You are unable to access</span> g2.com</h2>
      </div>
 
//.. 

That proves that real-world scenarios require premium proxies. Two types of proxies are most commonly used for scraping: residential and datacenter. Residential proxies are generally recommended because they use IP addresses associated with real residential devices, making it harder for websites to flag the traffic as coming from bots.

To get started, you can check out our list of the best web scraping proxy providers.
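The block page shown earlier in this step can also be detected in code, so your scraper knows when to switch proxies instead of treating the response as real content. This is a rough sketch: BlockDetector is a hypothetical helper, and the two markers are taken from the Cloudflare page above. Other sites and anti-bot systems will use different text.

```csharp
using System;

static class BlockDetector
{
    // Phrases taken from the Cloudflare block page shown in this step.
    private static readonly string[] Markers =
    {
        "Attention Required! | Cloudflare",
        "Sorry, you have been blocked",
    };

    // Returns true if the HTML contains any of the known block-page phrases.
    public static bool LooksBlocked(string html)
    {
        foreach (var marker in Markers)
        {
            if (html.IndexOf(marker, StringComparison.OrdinalIgnoreCase) >= 0)
                return true;
        }
        return false;
    }
}
```

You could call BlockDetector.LooksBlocked(pageContent) after GetContentAsync and retry with another proxy when it returns true.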

That said, configuring premium proxies with PuppeteerSharp can get tedious and difficult to scale. Also, proxies alone aren't enough for popular websites. Fortunately, you can make things easier by complementing PuppeteerSharp with ZenRows, a web scraping API that offers a residential proxy rotator, as well as all the features you need to avoid getting blocked, including User Agent rotation, anti-CAPTCHA, and more.

Let's see ZenRows in action scraping G2, a well-protected site. To get started, sign up for free, and you'll get to the Request Builder page.

ZenRows Request Builder

Paste your target URL (https://www.g2.com/), and check the box for Premium Proxies to auto-rotate your IP address and User Agent header. Then, select C# as the language to get your request code generated on the right.

You'll see that RestSharp is suggested, but you can absolutely use PuppeteerSharp. You only need to send a request to the ZenRows API. For that, copy the ZenRows API URL from the generated request on the right and define it in your PuppeteerSharp code.

Example
https://api.zenrows.com/v1/?apikey=<YOUR_ZENROWS_API_KEY>&url=https%3A%2F%2Fwww.g2.com%2F&premium_proxy=true
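Note that the target URL is percent-encoded inside the url parameter. If you build the API URL in code rather than copying it, you need to do that encoding yourself. Here's a minimal sketch that reproduces the example URL above using .NET's Uri.EscapeDataString. ZenRowsUrl.Build is a hypothetical helper, and it only covers the parameters shown in that example.

```csharp
using System;

static class ZenRowsUrl
{
    // Hypothetical helper: builds the API URL in the same shape as the example above.
    public static string Build(string apiKey, string targetUrl, bool premiumProxy = true)
    {
        // EscapeDataString percent-encodes characters such as ':' and '/'
        // so the target URL can be safely nested inside a query string.
        string url = "https://api.zenrows.com/v1/"
            + "?apikey=" + Uri.EscapeDataString(apiKey)
            + "&url=" + Uri.EscapeDataString(targetUrl);

        if (premiumProxy)
        {
            url += "&premium_proxy=true";
        }

        return url;
    }
}
```

Calling ZenRowsUrl.Build(apiKey, "https://www.g2.com/") yields a URL in the same shape as the example, with your own key in place of the placeholder.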

Then, make a request to the ZenRows API URL. Your code should look like this:

scraper.cs
using PuppeteerSharp;
using System;
using System.Threading.Tasks;
 
class Program
{
    static async Task Main(string[] args)
    {
        // Define the ZenRows API URL
        string zenRowsApiUrl = "https://api.zenrows.com/v1/?apikey=<YOUR_ZENROWS_API_KEY>&url=https%3A%2F%2Fwww.g2.com%2F&js_render=true&antibot=true&premium_proxy=true";
 
        using var browserFetcher = new BrowserFetcher();
        await browserFetcher.DownloadAsync();
 
        var browser = await Puppeteer.LaunchAsync(new LaunchOptions
        {
            Headless = true
        });
 
        var page = await browser.NewPageAsync();
 
        // Send a request to the ZenRows API
        await page.GoToAsync(zenRowsApiUrl);
 
        // Retrieve page content
        var pageContent = await page.GetContentAsync();
        Console.WriteLine(pageContent);
 
        await browser.CloseAsync();
    }
}

Run the code, and you'll get G2's HTML.

Terminal
<!DOCTYPE html>
//..
 
<title id="icon-label-55be01c8a779375d16cd458302375f4b">G2 - Business Software Reviews</title>
 
//..
 
<h1 ...id="main">Where you go for software.</h1>

Easy, right? That's how simple it is to rotate proxies at scale with ZenRows.

However, most modern websites use sophisticated anti-bot techniques, so rotating proxies and headers alone isn't enough. In that case, you can replace PuppeteerSharp with ZenRows to avoid getting blocked.

ZenRows offers the same headless browser functionality as PuppeteerSharp but with the complete toolkit for bypassing any anti-bot system and less overhead. So, you even get to save machine costs as the headless browser is run by ZenRows.

To use ZenRows alone, you only need to copy the generated code on the right for C#.

Conclusion

Configuring a PuppeteerSharp proxy enables you to route your requests through a different IP address. However, making too many requests from a specific IP address can lead to an IP ban. So, you must rotate through multiple proxies for better results. 

That being said, building a PuppeteerSharp proxy rotator can get really tedious and difficult to scale. Fortunately, ZenRows offers an easy way out, an intuitive API that handles everything under the hood, including rotating residential premium proxies. Sign up now to try it for free.
