6 Tricks to Avoid Detection With Puppeteer

Updated: October 10, 2024 · 6 min read

Table of contents

Can anti-bots detect Puppeteer?
ZenRows
Use proxies
Use custom request headers
Block certain requests
Delay requests to mimic human behavior
Puppeteer Stealth
Conclusion

Is your Puppeteer scraper frequently blocked by anti-bots? You're not alone. Bypassing anti-bots like Cloudflare or PerimeterX is becoming more challenging as security measures evolve. We'll help you with the six best ways to avoid detection with Puppeteer while scraping. This includes a pro solution that guarantees high success.

Can Puppeteer Be Detected by Anti-Bots?

The short answer is yes. Although Puppeteer is a JavaScript library that automates browser-user interactions, anti-bots often detect its automation properties, which usually results in blocking.

While controlling the browser, Puppeteer introduces automation-specific attributes, such as setting the navigator.webdriver property to true and using the HeadlessChrome flag in the User-Agent string (in headless mode). Unusual browser fingerprint elements, such as missing plugins and irregular rendering behaviors, also signal automation.

Most anti-bot systems compare these automation properties against databases of allowed and disallowed characteristics to detect suspicious behavior.

Puppeteer-controlled browsers, especially in their default configuration, often fall into the disallowed categories because of the bot-like properties described above. So, your Puppeteer web scraper has a higher chance of being flagged as a bot.
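To make these signals concrete, here is a deliberately simplified sketch of the kind of client-side check described above. The automationSignals function and the sample navigator object are hypothetical illustrations, not any vendor's actual detection logic; real anti-bot systems combine many more signals.

```javascript
// Simplified, hypothetical detector: it flags the three signals named above.
// `nav` is any navigator-like object, so the function also runs outside a browser.
function automationSignals(nav) {
    const signals = [];
    if (nav.webdriver === true) {
        signals.push('navigator.webdriver is true');
    }
    if (/HeadlessChrome/.test(nav.userAgent || '')) {
        signals.push('HeadlessChrome in User-Agent');
    }
    if (!nav.plugins || nav.plugins.length === 0) {
        signals.push('no browser plugins');
    }
    return signals;
}

// sample values resembling a default headless session (assumed for illustration)
const defaultHeadless = {
    webdriver: true,
    userAgent:
        'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) HeadlessChrome/129.0.0.0 Safari/537.36',
    plugins: [],
};

console.log(automationSignals(defaultHeadless));
```

Each technique in the rest of this article aims to remove or soften one or more of these signals.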

Let's prove the above point by opening this Cloudflare-protected Challenge page with Puppeteer. Install Puppeteer if you haven't done so already:

                    Terminal
                
npm install puppeteer


Now, try accessing the protected page with the following code that screenshots the web page:

                    Example
                
// npm install puppeteer
const puppeteer = require('puppeteer');

(async () => {
    // set up browser environment
    const browser = await puppeteer.launch();
    const page = await browser.newPage();

    // navigate to a URL
    await page.goto('https://www.scrapingcourse.com/cloudflare-challenge', {
        waitUntil: 'load',
    });

    // take the page screenshot
    await page.screenshot({ path: 'screenshot.png' });

    // close the browser instance
    await browser.close();
})();

  
  

  

We got the following screenshot showing that Puppeteer got blocked:

[Image: scrapingcourse cloudflare blocked screenshot]

So, how can you mitigate such limitations and bypass anti-bot detection with Puppeteer? We'll show you 6 trusted tricks.

ZenRows - the ultimate solution.
Use proxies.
Use custom request headers.
Block certain requests.
Delay requests to mimic human behavior.
Puppeteer Stealth.


1. ZenRows — The Ultimate Solution

The easiest way to avoid anti-bot detection in Puppeteer is by using the ZenRows Scraping Browser. It's a useful tool for bypassing anti-bots while scraping with browser automation libraries such as Puppeteer.

The ZenRows Scraping Browser fortifies your Puppeteer browser instance with advanced evasions to mimic an actual user and bypass anti-bot checks. These include fixing core fingerprinting issues, such as patching the navigator fields, replacing missing plugins like the PDF viewer, fixing WebGL and Canvas rendering, and more.

The Scraping Browser runs your browser instance in the cloud, allowing you to scale efficiently without impacting your machine's memory. It also handles other tasks, such as residential proxy rotation under the hood, to distribute your requests efficiently and evade IP bans or geo-restrictions.

Integrating the Scraping Browser into your existing Puppeteer scraper requires only a single line of code.

Let's see how it works by requesting the protected website that previously blocked our Puppeteer scraper (the Cloudflare challenge page).

First, install puppeteer-core, a Puppeteer version that doesn't include pre-installed browser binaries:

                    Terminal
                
npm install puppeteer-core


[Image: ZenRows scraping browser]

Update the previous code by importing puppeteer-core and connecting Puppeteer through the browser URL, as shown:

                    Example
                
// npm install puppeteer-core
const puppeteer = require('puppeteer-core');

// define your connection URL
const connectionURL = 'wss://browser.zenrows.com?apikey=<YOUR_ZENROWS_API_KEY>';

(async () => {
    // set up browser environment
    const browser = await puppeteer.connect({
        browserWSEndpoint: connectionURL,
    });

    // create a new page
    const page = await browser.newPage();

    // navigate to a URL
    await page.goto('https://www.scrapingcourse.com/cloudflare-challenge', {
        waitUntil: 'networkidle0',
    });

    // wait for the challenge to resolve
    await new Promise(function (resolve) {
        setTimeout(resolve, 10000);
    });

    // take the page screenshot
    await page.screenshot({ path: 'screenshot.png' });

    // close the browser instance
    await browser.close();
})();

  
  

  

The above code saves a screenshot of the protected website's homepage. See the result below:

[Image: cloudflare-challenge-passed]

Congratulations 🎉! You've bypassed anti-bot protection using a Puppeteer-ZenRows one-liner integration.

Note

If your Puppeteer scraper is still getting blocked, turn to the ZenRows Scraper API for a 98.7% guarantee of bypassing any anti-bot at scale.

While this is the easiest way to bypass anti-bots with Puppeteer, you can explore other manual methods if you prefer a hands-on approach to setting things up. We'll show you how they work in the next sections.

2. Use Proxies

One of the most widely adopted anti-bot strategies is IP tracking, where the bot detection system is triggered when the IP exceeds a rate limit or the request comes from a blocked region.

To avoid detection, you can use a proxy in Puppeteer, which acts as a gateway between your scraper and the server. So when you send a request to the server, it's routed via the proxy, and then the response data is sent to you.

There are two proxy categories in terms of pricing: free and premium.

While free proxies are cost-effective, they're public and have a short lifespan, making them unsuitable for serious scraping applications.

The best choice is premium residential proxies. These proxies efficiently distribute traffic across IPs assigned to everyday internet users by network providers, reducing the chances of triggering anti-bot systems.

To add a free proxy to Puppeteer, pass Chrome's --proxy-server flag through the args option of puppeteer.launch(). The following scraper uses a free proxy from the Free Proxy List, which may no longer work by the time you read this. Feel free to grab a new one from that website:


                    Example
                
// npm install puppeteer
const puppeteer = require('puppeteer');

(async () => {
    // define your proxy URL
    const proxy = 'http://178.128.113.118:23128';

    // set up the browser environment with the proxy URL
    const browser = await puppeteer.launch({
        args: [`--proxy-server=${proxy}`],
    });
    const page = await browser.newPage();

    // navigate to a URL
    await page.goto('https://httpbin.io/ip', {
        waitUntil: 'load',
    });

    //... your scraping logic

    // close the browser instance
    await browser.close();
})();

  
  

  

The above code now routes requests through the proxy's IP address.
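Since a single proxy IP can still hit a rate limit, one option is to rotate through several proxies and launch each browser session with a different one. The sketch below assumes a hypothetical pool of placeholder addresses (taken from the documentation-only 203.0.113.0/24 range) and two illustrative helpers, createProxyRotator and launchOptionsFor; swap in your own proxies.

```javascript
// Hypothetical proxy pool: placeholder addresses, not working proxies.
const proxyPool = [
    'http://203.0.113.10:8080',
    'http://203.0.113.11:8080',
    'http://203.0.113.12:8080',
];

// return a function that cycles through the pool in round-robin order
function createProxyRotator(proxies) {
    if (proxies.length === 0) {
        throw new Error('proxy pool is empty');
    }
    let index = 0;
    return function nextProxy() {
        const proxy = proxies[index];
        index = (index + 1) % proxies.length;
        return proxy;
    };
}

// build the launch options for one browser session
function launchOptionsFor(proxy) {
    return { args: [`--proxy-server=${proxy}`] };
}

const nextProxy = createProxyRotator(proxyPool);
console.log(launchOptionsFor(nextProxy()));
```

Each new session would then start with puppeteer.launch(launchOptionsFor(nextProxy())), so consecutive sessions leave from different IPs.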

Adding a premium proxy requires an extra step: supplying the proxy credentials (username and password) with page.authenticate(). Here's sample code demonstrating how to set up an authenticated premium proxy in Puppeteer:

                    Example
                
// npm install puppeteer
const puppeteer = require('puppeteer');

(async () => {
    // define your proxy credentials
    const proxyURL = 'http://<PROXY_URL>:<PROXY_PORT>';
    const proxyUsername = '<PROXY_USERNAME>';
    const proxyPassword = '<PROXY_PASSWORD>';

    // set up the browser environment with the proxy URL
    const browser = await puppeteer.launch({
        args: [`--proxy-server=${proxyURL}`],
    });
    const page = await browser.newPage();

    // supply the proxy credentials for authentication
    await page.authenticate({
        username: proxyUsername,
        password: proxyPassword,
    });

    // continue navigating to a URL
    await page.goto('https://httpbin.io/ip', {
        waitUntil: 'load',
    });

    //... your scraping logic

    // close the browser instance
    await browser.close();
})();

  
  

  

Most premium proxy providers also offer extra functionalities, such as geo-targeting, to access geo-blocked content.

Check our article on the best web scraping proxies to learn more.

3. Use Custom Request Headers

Request headers carry context and metadata about an HTTP request, so they can hint to the anti-bot system whether a request originates from a bot or a regular browser. You can reduce detection risks by including appropriate headers in Puppeteer's HTTP requests.

Since Puppeteer identifies itself as HeadlessChrome in its default headless mode, replacing headers like User-Agent and Referer with those of a regular browser makes the request look more legitimate and fixes some fingerprinting issues.

The User-Agent header identifies the client's application, operating system, and vendor, while the Referer header indicates the URL of the page from which the request originated.

There are several methods to add a request header in Puppeteer, but the easiest one is to add it while opening a new page.

The code below modifies Puppeteer's User Agent and Referer headers and requests https://httpbin.io/headers, a test website that returns the request headers:

                    Example
                
// npm install puppeteer
const puppeteer = require('puppeteer');

// define your custom headers
const requestHeaders = {
    'user-agent':
        'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/129.0.0.0 Safari/537.36',
    Referer: 'https://www.google.com/',
};

(async () => {
    // set up browser environment
    const browser = await puppeteer.launch();
    const page = await browser.newPage();

    // attach the custom headers to every request the page sends
    await page.setExtraHTTPHeaders({ ...requestHeaders });

    // continue navigating to a URL
    await page.goto('https://httpbin.io/headers', {
        waitUntil: 'load',
    });

    // get the page content and output it
    const bodyContent = await page.$eval('pre', (element) => element.innerHTML);
    console.log(bodyContent);

    //... other scraping logic

    // close the browser instance
    await browser.close();
})();

  
  

  

The above code outputs the modified request headers as shown:

                    Output
                
{
    "headers": {
        // ... other headers omitted for brevity

        "Referer": ["https://www.google.com/"],
        "User-Agent": [
            "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/129.0.0.0 Safari/537.36"
        ]
    }
}

  
  

  

There are more headers you can add to Puppeteer. Check our guide on the common web scraping request headers for more.
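One caveat worth knowing: setExtraHTTPHeaders() only changes the headers sent over the network, so navigator.userAgent inside the page still reports the default value. A possible way to keep both consistent is to route the User-Agent through page.setUserAgent() and send the remaining headers as extras. The splitUserAgent and applyHeaders helpers below are illustrative names, not part of Puppeteer:

```javascript
// separate the User-Agent from the other headers (header names are case-insensitive)
function splitUserAgent(headers) {
    const rest = {};
    let userAgent;
    for (const [name, value] of Object.entries(headers)) {
        if (name.toLowerCase() === 'user-agent') {
            userAgent = value;
        } else {
            rest[name] = value;
        }
    }
    return { userAgent, rest };
}

// apply the headers to a Puppeteer page (the object returned by browser.newPage())
async function applyHeaders(page, headers) {
    const { userAgent, rest } = splitUserAgent(headers);
    if (userAgent) {
        // changes both the User-Agent header and navigator.userAgent
        await page.setUserAgent(userAgent);
    }
    if (Object.keys(rest).length > 0) {
        await page.setExtraHTTPHeaders(rest);
    }
}
```

In the earlier example, you would call await applyHeaders(page, requestHeaders) in place of the setExtraHTTPHeaders() line.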

4. Block Certain Requests

Blocking certain requests, such as specific scripts, can help suppress resources that trigger browser fingerprinting. By suppressing fingerprinting, you can reduce the information that anti-bot systems can gather about your scraper.

Note

Some anti-bots might still use resource blocking/limiting as a bot detection metric. So, use this technique cautiously while scraping with Puppeteer.

While this approach optimizes performance and can reduce fingerprinting, there's no guarantee that the anti-bot won't detect you.

For instance, the Puppeteer scraper below uses Puppeteer's built-in request interception to block any request whose URL contains "analytics", "ads", or "social", a coarse filter aimed at ad, analytics, and social media embed scripts:

                    Example
                
// npm install puppeteer
const puppeteer = require('puppeteer');

(async () => {
    // set up the browser environment
    const browser = await puppeteer.launch();
    const page = await browser.newPage();

    // enable request interception
    await page.setRequestInterception(true);

    // block non-essential third-party scripts
    page.on('request', (request) => {
        const url = request.url();

        // specify patterns for scripts you want to block
        if (
            url.includes('analytics') ||
            url.includes('ads') ||
            url.includes('social')
        ) {
            // block the request
            request.abort();
        } else {
            // allow the request
            request.continue();
        }
    });

    // navigate to the target page
    await page.goto('https://www.scrapingcourse.com/ecommerce/');

    //... your scraping logic

    // close the browser
    await browser.close();
})();

  
  

  
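Substring checks like url.includes('ads') can also match unrelated URLs such as /downloads/. As a rough sketch of a stricter alternative, the version below blocks only script requests whose hostname matches a blocklist. The blockedHosts list and the shouldBlock and attachBlocker helpers are assumptions made for illustration; adapt the list to the site you're scraping.

```javascript
// hypothetical blocklist of tracker hostnames, for illustration only
const blockedHosts = [
    'google-analytics.com',
    'doubleclick.net',
    'connect.facebook.net',
];

// decide whether a request should be blocked, based on its type and hostname
function shouldBlock(url, resourceType) {
    if (resourceType !== 'script') {
        return false;
    }
    let hostname;
    try {
        hostname = new URL(url).hostname;
    } catch {
        // not a parseable URL, so let the request through
        return false;
    }
    return blockedHosts.some(
        (host) => hostname === host || hostname.endsWith(`.${host}`)
    );
}

// wire the check into a Puppeteer page using request interception
async function attachBlocker(page) {
    await page.setRequestInterception(true);
    page.on('request', (request) => {
        if (shouldBlock(request.url(), request.resourceType())) {
            request.abort();
        } else {
            request.continue();
        }
    });
}
```

Call await attachBlocker(page) before page.goto() so the handler is in place for the first request.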

5. Delay Requests to Mimic Human Behavior

As previously discussed, an anti-bot can track a user's activity through the number of requests they send. Since real users don't send hundreds of requests per second, taking breaks between requests is a good way to simulate regular user behavior and avoid detection in Puppeteer.

When navigating multiple pages, consider setting intervals between requests or waiting a few moments before clicking navigation buttons to further mimic human patterns.

For example, the following code uses a custom getRandomDelay function to pause randomly between 1 and 5 seconds before clicking the next page button:

                    Example
                
// npm install puppeteer
const puppeteer = require('puppeteer');

// function to create a random delay
function getRandomDelay(min = 1000, max = 5000) {
    return Math.floor(Math.random() * (max - min + 1)) + min;
}

(async () => {
    // launch Puppeteer (headless by default) and open a new page
    const browser = await puppeteer.launch();
    const page = await browser.newPage();

    // navigate to the initial page
    await page.goto('https://www.scrapingcourse.com/ecommerce/');

    let hasNextPage = true;

    while (hasNextPage) {
        try {
            // look up the "next" button (null if it doesn't exist)
            const nextButton = await page.$('a.next');

            // check if the "next" button exists
            if (nextButton) {
                // pick a random delay and log it before waiting
                const randomDelay = getRandomDelay();
                console.log(
                    `waiting for ${randomDelay}ms before clicking the next page...`
                );

                // pause before clicking the next page
                await new Promise(function (resolve) {
                    setTimeout(resolve, randomDelay);
                });

                // click the "next" button and wait for the navigation together,
                // so a fast page load isn't missed
                await Promise.all([
                    page.waitForNavigation({ waitUntil: 'load' }),
                    nextButton.click(),
                ]);
            } else {
                // if no next button, stop the loop
                hasNextPage = false;
                console.log('no more pages to navigate.');
            }
        } catch (error) {
            // if there's an error (like timeout), stop the loop
            console.log('error navigating to the next page:', error);
            hasNextPage = false;
        }
    }

    // close the browser
    await browser.close();
})();

  
  

  

The above code applies the random wait time before each Puppeteer click action. Try running the script in headful mode (launch with headless: false) to see the process in action.
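The same idea applies when you request a list of URLs directly instead of clicking through pagination. The following is a minimal sketch, assuming you already have a Puppeteer page; randomBetween, sleep, and visitWithPauses are hypothetical helper names:

```javascript
// random integer between min and max, inclusive
function randomBetween(min, max) {
    return Math.floor(Math.random() * (max - min + 1)) + min;
}

// resolve after the given number of milliseconds
function sleep(ms) {
    return new Promise((resolve) => setTimeout(resolve, ms));
}

// visit each URL in turn, pausing a random interval between requests
async function visitWithPauses(page, urls, min = 1000, max = 5000) {
    for (const [i, url] of urls.entries()) {
        if (i > 0) {
            const delay = randomBetween(min, max);
            console.log(`waiting ${delay}ms before requesting ${url}`);
            await sleep(delay);
        }
        await page.goto(url, { waitUntil: 'load' });
        //... your scraping logic for this page
    }
}
```

The first URL loads immediately; every later request waits for a fresh random interval, so the gaps between requests don't form a fixed pattern.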

6. Puppeteer Stealth

Puppeteer has many detectable bot-like properties by default. The Puppeteer Stealth plugin (puppeteer-extra-plugin-stealth) runs on top of puppeteer-extra, a drop-in wrapper around Puppeteer, and applies various anti-bot evasions to reduce the chances of detection.

Since Puppeteer Stealth is a plugin, it doesn't change Puppeteer's standard API methods. It only patches bot-like properties to spoof an actual browser. These include changing the navigator.webdriver field to false, emulating navigator.mimeTypes in headless mode, mocking the headful browser runtime environment in headless mode, and more.
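To give a sense of what such a patch looks like, here is a heavily simplified sketch of a navigator.webdriver override. It is not the plugin's actual implementation, which covers far more cases; hideWebdriver and patchPage are hypothetical names used only for illustration.

```javascript
// redefine `webdriver` on a navigator-like object so it always reads false
function hideWebdriver(nav) {
    Object.defineProperty(nav, 'webdriver', {
        get: () => false,
        configurable: true,
    });
}

// apply a similar override inside every document a Puppeteer page loads;
// the callback runs in the browser before any of the site's own scripts
async function patchPage(page) {
    await page.evaluateOnNewDocument(() => {
        Object.defineProperty(Object.getPrototypeOf(navigator), 'webdriver', {
            get: () => false,
            configurable: true,
        });
    });
}

// demonstrate the override on a plain object standing in for navigator
const fakeNavigator = { webdriver: true };
hideWebdriver(fakeNavigator);
console.log(fakeNavigator.webdriver);
```

A single override like this addresses only one signal, which is why the plugin bundles many evasions together.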

Let's see how Puppeteer Stealth works step-by-step by accessing CoinTracker, a website with simple Cloudflare protection.

The first step is to install the required libraries:

                    Terminal
                
npm install puppeteer puppeteer-extra puppeteer-extra-plugin-stealth


Import these libraries and configure Puppeteer to use the Stealth plugin:

                    Example
                
// npm install puppeteer-extra puppeteer-extra-plugin-stealth
const puppeteer = require('puppeteer-extra');
const StealthPlugin = require('puppeteer-extra-plugin-stealth');

// add the stealth plugin
puppeteer.use(StealthPlugin());


Next, launch the browser instance, visit the target web page and take a screenshot:

                    Example
                
// ...

(async () => {
    // set up browser environment
    const browser = await puppeteer.launch();
    const page = await browser.newPage();

    // navigate to a URL
    await page.goto('https://www.cointracker.io/', {
        waitUntil: 'load',
    });

    // take the page screenshot
    await page.screenshot({ path: 'screenshot.png' });

    // close the browser instance
    await browser.close();
})();

  
  

  

Here's the complete code after combining both snippets:

                    Example
                
// npm install puppeteer-extra puppeteer-extra-plugin-stealth
const puppeteer = require('puppeteer-extra');
const StealthPlugin = require('puppeteer-extra-plugin-stealth');

// add the stealth plugin
puppeteer.use(StealthPlugin());

(async () => {
    // set up browser environment
    const browser = await puppeteer.launch();
    const page = await browser.newPage();

    // navigate to a URL
    await page.goto('https://www.cointracker.io/', {
        waitUntil: 'load',
    });

    // take the page screenshot
    await page.screenshot({ path: 'screenshot.png' });

    // close the browser instance
    await browser.close();
})();

  
  

  

Puppeteer Stealth successfully bypassed the anti-bot protection on CoinTracker, as shown in the following screenshot:

[Image: CoinTracker Puppeteer Stealth screenshot]

Congratulations!

Keep in mind that WAFs and anti-bot security measures keep evolving and are becoming increasingly complex to bypass. Open-source bypass tools like Puppeteer Stealth often struggle to keep up with these frequent updates and still have detectable bot-like footprints. So, they're less reliable against advanced anti-bot measures, especially when scraping at scale.

For example, if you replace the above URL with the Cloudflare challenge page, you'll see that it blocks Puppeteer Stealth.

Feel free to read our detailed guide on patching Puppeteer Stealth further to boost its anti-bot bypass capability.

Conclusion

There are different methods to avoid detection with Puppeteer, and we discussed the best and easiest ones in this article. You can use proxies, custom headers, request blocking, request delays, or Puppeteer Stealth to get the job done, but there are limitations.

It's best to combine these techniques for the best result. However, a common limitation is that they're unreliable against advanced anti-bot measures.

That said, the most straightforward and most recommended approach is integrating the ZenRows Scraping Browser to handle all the stealth tweaks for you while you focus on your scraping logic.

Try ZenRows for free without a credit card!

Frequently Asked Questions

1. Why Does Puppeteer Get Detected by Anti-Bots?

Puppeteer is often detected because it sets automation-specific properties like navigator.webdriver to true and uses the HeadlessChrome flag in the User-Agent header. Missing browser plugins, irregular rendering, and predictable behavior further signal automation. Tools like the ZenRows Scraping Browser can patch these issues to avoid detection.

2. Is Puppeteer Stealth Enough to Bypass Advanced Anti-Bot Systems?

Puppeteer Stealth helps reduce detection by patching bot-like properties, but it struggles with advanced anti-bot measures like those used by Cloudflare. For consistent results, tools like ZenRows offer a more robust solution with premium proxies, browser emulation, and anti-bot bypass capabilities.

3. How Can I Avoid Detection While Using Puppeteer?

To avoid detection with Puppeteer:

Use the ZenRows Scraping Browser to fortify your Puppeteer setup with advanced stealth techniques, such as fingerprint patching and proxy rotation.
Add custom headers like User-Agent and Referer to mimic real users.
Rotate proxies to avoid IP-based blocking.
Implement delays and random interactions to simulate human behavior.
Use Puppeteer Stealth for additional evasion.

4. Is the ZenRows Scraping Browser Better Than Puppeteer Stealth?

Yes, the ZenRows Scraping Browser offers a more robust solution. While Puppeteer Stealth reduces bot-like signals, it struggles with advanced anti-bot systems like Cloudflare. The ZenRows Scraping Browser integrates seamlessly with Puppeteer, fixes fingerprinting issues, rotates residential proxies, and bypasses anti-bots at scale with minimal setup.