How to Use Puppeteer Stealth: A Plugin for Scraping
Puppeteer is a fantastic headless browser library, yet it can easily be detected and blocked by anti-scraping measures. This is where Puppeteer Extra, with the help of plugins like Stealth, plays a key role.
This tutorial introduces Puppeteer Stealth and how to scrape web pages with it. Let's dive in!
What Is Puppeteer Extra?
Puppeteer Extra is an open-source library built to extend the functionality of the popular Puppeteer headless browser.
Here's a list of some of the main plugins you can use with Puppeteer Extra and what they do:
- Stealth plugin hides Puppeteer's automation properties by masking the subtle differences between headless and regular Chrome browsers.
- AdBlocker plugin blocks ads and trackers.
- User Data Dir plugin maintains consistent browser data and settings between sessions.
- reCAPTCHA plugin solves reCAPTCHA and hCaptcha challenges automatically.
- Block Resource plugin intercepts and blocks unwanted resources, including images, fonts, CSS, etc.
- DevTools plugin creates a secure portal to Chrome DevTools APIs to allow debugging and custom profiling from anywhere.
We'll focus on how to avoid detection with Puppeteer.
What Is Puppeteer Stealth?
Puppeteer Stealth, also known as puppeteer-extra-plugin-stealth, is an extension built on top of Puppeteer Extra that uses different techniques to hide properties that would otherwise flag your request as a bot. That makes it harder for websites to detect your scraper.
Let's see it in action.
What Does Puppeteer Stealth Do?
While web scraping with a headless browser gives you browser-like access, websites also get code execution access. That means they can leverage various browser fingerprinting scripts to gather data that can identify your automated browser.
Puppeteer Stealth is crucial here. Its goal is to mask some default headless properties, such as headless: true, navigator.webdriver: true, and request headers, to crawl below the radar.
That's possible thanks to its evasion modules.
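To make the detection side concrete, here's a minimal sketch of the kind of check a fingerprinting script might run. The function name looksAutomated and the two signals it inspects are assumptions chosen for illustration. Real anti-bot scripts combine many more signals.

```javascript
// Hypothetical detector: inspects a navigator-like object for two well-known
// automation giveaways. Not taken from any real anti-bot product.
function looksAutomated(nav) {
  const reasons = [];
  // Automated browsers expose navigator.webdriver === true by default
  if (nav.webdriver === true) {
    reasons.push('navigator.webdriver is true');
  }
  // Headless Chrome has historically advertised itself in the user agent
  if (/HeadlessChrome/.test(nav.userAgent || '')) {
    reasons.push('user agent contains HeadlessChrome');
  }
  return reasons;
}

// Stand-ins for the browser's navigator object (sample values)
const headless = {
  webdriver: true,
  userAgent: 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) HeadlessChrome/120.0.0.0 Safari/537.36',
};
const regular = {
  webdriver: false,
  userAgent: 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36',
};

console.log(looksAutomated(headless).length); // 2
console.log(looksAutomated(regular).length); // 0
```

Each reason in the returned list corresponds to one leak that a stealth setup would need to cover.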
Built-in Evasion Modules
Built-in evasion modules are pre-packaged plugins that drive the Puppeteer Stealth functionality. As stated earlier, base Puppeteer has leaks or properties that flag it as a bot, which the Stealth plugin aims to fix.
Each Puppeteer Stealth evasion module is designed to plug a particular leak. Take a look below:
- iframe.contentWindow fixes the HEADCHR_iframe detection by modifying window.top and window.frameElement.
- Media.codecs modifies codecs to support what actual Chrome supports.
- Navigator.hardwareConcurrency sets the number of logical processors to four.
- Navigator.languages modifies the languages property to allow custom languages.
- Navigator.plugin emulates navigator.mimeTypes and navigator.plugins with functional mocks to match standard Chrome used by humans.
- Navigator.permissions masks the permissions property to pass the permissions test.
- Navigator.vendors makes it possible to customize the navigator.vendor property.
- Navigator.webdriver masks navigator.webdriver.
- Sourceurl hides the sourceurl attribute of the Puppeteer script.
- User-agent-override modifies the user-agent components.
- Webgl.vendor changes the Vendor/Renderer property from Google, which is the default for Puppeteer headless.
- Window.outerdimensions adds the missing window.outerWidth or window.outerHeight properties.
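The idea behind most of these modules is to replace a revealing property with a value that a regular browser would report. The sketch below shows that idea on a plain object. The maskProperty helper and the fakeNavigator values are hypothetical, and this is a simplification, not the plugin's actual implementation.

```javascript
// Simplified illustration only: the real evasion modules patch the live page
// before any site script runs and take more care to stay undetectable.
function maskProperty(target, name, value) {
  Object.defineProperty(target, name, {
    get: () => value,
    configurable: true,
    enumerable: true,
  });
}

// Stand-in for the browser's navigator object (hypothetical values)
const fakeNavigator = { webdriver: true, hardwareConcurrency: 64 };

maskProperty(fakeNavigator, 'webdriver', false);
maskProperty(fakeNavigator, 'hardwareConcurrency', 4);

console.log(fakeNavigator.webdriver, fakeNavigator.hardwareConcurrency); // false 4
```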
How to Web Scrape with Puppeteer Stealth
Before we dive into Puppeteer in stealth mode, it's essential to explore web scraping with the base headless browser. As a target, we'll use NowSecure, a website that throws anti-bot challenges at every request and displays a "you passed" message if you're successful.
Let's begin!
- Install Puppeteer by running the following command in your terminal.
npm install puppeteer
- Import Puppeteer and open an async function where you'll write your code.
import puppeteer from 'puppeteer';
(async () => {
  // Your scraping code goes here
})();
- Launch a browser, create a new page, and navigate to your target URL.
(async () => {
const browser = await puppeteer.launch();
const page = await browser.newPage();
await page.goto('https://nowsecure.nl/');
- Set the screen size, wait for the page to load, take a screenshot, and close your browser.
// Set screen size
await page.setViewport({width: 1280, height: 720});
//wait for page to load
await page.waitForTimeout(30000);
// Take screenshot
await page.screenshot({ path: 'image.png', fullPage: true });
// Closes the browser and all of its pages
await browser.close();
})();
Putting all of it together, here's your complete code:
import puppeteer from 'puppeteer';
(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  // Navigate to target URL
  await page.goto('https://nowsecure.nl/');
  // Set screen size
  await page.setViewport({width: 1280, height: 720});
  // Wait for page to load
  await page.waitForTimeout(30000);
  // Take screenshot
  await page.screenshot({ path: 'image.png', fullPage: true });
  // Closes the browser and all of its pages
  await browser.close();
})();
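If your installed Puppeteer version no longer provides page.waitForTimeout (newer releases have removed it), a plain timer promise can stand in for the fixed wait. The sleep helper below is our own name, not part of Puppeteer.

```javascript
// Version-independent pause: resolves after `ms` milliseconds.
// `sleep` is a hypothetical helper, not a Puppeteer API.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Inside the async function, this line would stand in for
// page.waitForTimeout(30000):
//   await sleep(30000);

// Quick demonstration with a short delay
sleep(50).then(() => console.log('done waiting'));
```

A fixed delay is the simplest option for a challenge page like this one, since there's no single element that reliably signals the check has finished.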
And this is the screenshot of the web page:

The result above shows that our Puppeteer script got blocked since we couldn't bypass anti-bot detection.
Now, let's try scraping the same website using Puppeteer Stealth.
Here are the steps you must take:
Step 1: Install Puppeteer-Stealth
As mentioned earlier, we need the Puppeteer Extra library to use Puppeteer Stealth. So, install both using the following command.
npm install puppeteer-extra puppeteer-extra-plugin-stealth
Step 2: Configure Puppeteer-Stealth
To configure Puppeteer Stealth, start by importing Puppeteer Extra.
const puppeteer = require('puppeteer-extra')
Then add the Stealth plugin and use it in default mode, which ensures your script uses all evasion modules.
// add stealth plugin and use defaults (all evasion techniques)
const StealthPlugin = require('puppeteer-extra-plugin-stealth')
puppeteer.use(StealthPlugin())
You can refer to this readme file if you want to pass in custom evasion modules.
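As a rough sketch of what a custom setup could look like, the snippet below builds the set of evasions to keep and leaves one out. The pickEvasions helper is hypothetical, and the name list only mirrors the modules described in this article (the plugin ships a few more), so verify both the names and the enabledEvasions option against the readme of your installed version.

```javascript
// Evasion names mirroring the modules listed earlier in this article.
// Assumption: check the plugin readme, since the list can change between releases.
const ALL_EVASIONS = [
  'iframe.contentWindow',
  'media.codecs',
  'navigator.hardwareConcurrency',
  'navigator.languages',
  'navigator.permissions',
  'navigator.plugins',
  'navigator.vendor',
  'navigator.webdriver',
  'sourceurl',
  'user-agent-override',
  'webgl.vendor',
  'window.outerdimensions',
];

// Hypothetical helper: returns every evasion except the ones you want off.
function pickEvasions(disabled = []) {
  return new Set(ALL_EVASIONS.filter((name) => !disabled.includes(name)));
}

const enabledEvasions = pickEvasions(['user-agent-override']);
console.log(enabledEvasions.size); // 11

// With the plugin installed, the set would be passed like this:
//   puppeteer.use(StealthPlugin({ enabledEvasions }));
```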
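Next, import executablePath from Puppeteer. It's a function that returns the path to the browser binary Puppeteer downloaded, so Puppeteer Extra launches that same browser.
const { executablePath } = require('puppeteer');
If you'd rather use your own Chrome installation, pass its path as a string instead. For example, on Windows, the path could look like this:
const chromePath = 'C:\\Program Files (x86)\\Google\\Chrome\\Application\\chrome.exe'; // Replace this with the path to your Chrome executable
Launch Puppeteer Stealth while specifying the executable path and open an async function.
puppeteer.launch({ executablePath: executablePath() }).then(async browser => {
  // Your scraping code goes here
});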
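If you point Puppeteer at your own Chrome installation, the path depends on the operating system. The helper below is a hypothetical sketch, and the macOS and Linux locations are common defaults rather than guarantees, so check where Chrome lives on your machine.

```javascript
// Hypothetical helper: common default Chrome locations per platform.
// These paths are assumptions and may differ on your system.
function defaultChromePath(platform = process.platform) {
  switch (platform) {
    case 'win32':
      return 'C:\\Program Files (x86)\\Google\\Chrome\\Application\\chrome.exe';
    case 'darwin':
      return '/Applications/Google Chrome.app/Contents/MacOS/Google Chrome';
    default:
      return '/usr/bin/google-chrome';
  }
}

console.log(defaultChromePath());

// It would be used as:
//   puppeteer.launch({ executablePath: defaultChromePath() })
```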
Step 3: Take a Screenshot
Like in our base Puppeteer script, create a new page, set the screen size, and navigate to the target website.
// Create a new page
const page = await browser.newPage();
// Setting page view
await page.setViewport({ width: 1280, height: 720 });
// Go to the website
await page.goto('https://nowsecure.nl/');
Lastly, wait for the page to load and take a screenshot.
// Wait for security check
await page.waitForTimeout(10000);
await page.screenshot({ path: 'image.png', fullPage: true });
await browser.close();
});
Here's what our result looks like:

Congratulations! You prevented bot detection using Puppeteer Extra Stealth.
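Rather than judging the result from the screenshot alone, the check could be automated by looking for the success message in the page text. The passedChallenge helper is hypothetical, and the sample strings are made up. The only thing taken from the target site is the "you passed" wording mentioned earlier.

```javascript
// Hypothetical helper: true when the page text contains the success message.
function passedChallenge(pageText) {
  return /you passed/i.test(pageText);
}

// Made-up sample strings for demonstration
console.log(passedChallenge('OH YEAH, you passed!')); // true
console.log(passedChallenge('Checking your browser before accessing the site')); // false

// In the Puppeteer script, the text could come from:
//   const text = await page.evaluate(() => document.body.innerText);
```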
Let's take it a step further and scrape the page.
Step 4: Scrape the Page
First, right-click on an element you want to scrape and select "Inspect". That will open the Chrome DevTools, and you'll find the selectors in the Elements tab.

Next, copy the selectors and use each to get its text. The querySelector method works perfectly for this purpose.
  await page.goto('https://nowsecure.nl/');
  await page.waitForTimeout(10000);
  // Get title text
  const title = await page.evaluate(() => {
    return document.querySelector('body > div.nonhystericalbg > div > header > div > h3').textContent;
  });
  // Get message text
  const msg = await page.evaluate(() => {
    return document.querySelector('body > div.nonhystericalbg > div > main > h1').textContent;
  });
  // Get state text
  const state = await page.evaluate(() => {
    return document.querySelector('body > div.nonhystericalbg > div > main > p:nth-child(2)').textContent;
  });
  // Print out the results
  console.log(title, '\n', msg, '\n', state);
  await browser.close();
});
The script reads the .textContent property of each matched element to get its text and saves the result to a variable. You can repeat the same process for any other element you want to scrape.
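One caveat: querySelector returns null when a selector matches nothing, so reading .textContent directly throws a TypeError. Below is a minimal sketch of a null-safe variant. The textOf helper and the fakeDocument stand-in are hypothetical and exist only so the example runs outside a browser.

```javascript
// Hypothetical helper: returns trimmed text for a selector, or null when the
// element is missing, instead of throwing on null.textContent.
function textOf(root, selector) {
  const element = root.querySelector(selector);
  return element ? element.textContent.trim() : null;
}

// Stand-in for `document`, so the sketch runs outside a browser
const fakeDocument = {
  querySelector: (selector) =>
    selector === 'h1' ? { textContent: '  OH YEAH, you passed!  ' } : null,
};

console.log(textOf(fakeDocument, 'h1')); // OH YEAH, you passed!
console.log(textOf(fakeDocument, 'h3')); // null
```

Inside page.evaluate, the same null check would be written inline, with the selector passed as an argument to the evaluated function.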
The complete code looks like this:
const puppeteer = require('puppeteer-extra');
// Add stealth plugin and use defaults
const pluginStealth = require('puppeteer-extra-plugin-stealth');
// executablePath() returns the path to the browser binary Puppeteer downloaded
const { executablePath } = require('puppeteer');
// Use stealth
puppeteer.use(pluginStealth());
// Launch puppeteer-stealth
// (to use your own Chrome instead, pass its path as a string to executablePath)
puppeteer.launch({ headless: true, executablePath: executablePath() }).then(async browser => {
  // Create a new page
  const page = await browser.newPage();
  // Setting page view
  await page.setViewport({ width: 1280, height: 720 });
  // Go to the website
  await page.goto('https://nowsecure.nl/');
  // Wait for security check
  await page.waitForTimeout(10000);
  // Get title text
  const title = await page.evaluate(() => {
    return document.querySelector('body > div.nonhystericalbg > div > header > div > h3').textContent;
  });
  // Get message text
  const msg = await page.evaluate(() => {
    return document.querySelector('body > div.nonhystericalbg > div > main > h1').textContent;
  });
  // Get state text
  const state = await page.evaluate(() => {
    return document.querySelector('body > div.nonhystericalbg > div > main > p:nth-child(2)').textContent;
  });
  // Print out the results
  console.log(title, '\n', msg, '\n', state);
  await browser.close();
});
After running the script, the output should look like this:

Awesome! You solved your main problem and successfully avoided Puppeteer bot detection.
Limitations of puppeteer-extra-plugin-stealth and a Solution
While Puppeteer Stealth does a lot to avoid detection, it has its limitations:
- It can't avoid advanced anti-bots. For example, your script will easily get detected and blocked if you use Puppeteer Stealth to try to bypass Cloudflare or bypass DataDome.
- It can get extremely slow and, therefore, difficult to scale.
- As with other headless browsers, it's difficult to debug.
Let's see an example of Puppeteer Stealth against Cloudflare.
We'll try scraping Okta, a Cloudflare-protected website, using the same code as before.
// same as before
puppeteer.launch({ headless:true, executablePath: executablePath() }).then(async browser => {
const page = await browser.newPage();
await page.setViewport({ width: 1280, height: 720 });
// Go to the website
await page.goto('https://okta.com/');
// Wait for security check
await page.waitForTimeout(10000);
await page.screenshot({ path: 'image.png', fullPage: true });
await browser.close();
});
Here's our result:

We got blocked straight off! Fortunately, there's a quick solution. With the ZenRows library, you'll bypass even the most complicated anti-bots.
Let's see it in action.
First, sign up to get your free API key. Then, install ZenRows by entering the following command in your terminal:
npm install zenrows
Next, import ZenRows and open an async function.
// import ZenRows
const { ZenRows } = require("zenrows");
(async () => {
  // Your request code goes here
})();
Then, create a new ZenRows instance, define your target URL, and set the required parameters: antibot: true and premium_proxy: true. For this example, we'll scrape the website that blocked us above.
const { ZenRows } = require("zenrows");
(async () => {
  // Create new ZenRows instance
  const client = new ZenRows("Your API Key");
  const url = "https://www.okta.com/";
  // Make request with the required parameters
  try {
    const { data } = await client.get(url, {
      "antibot": "true",
      "premium_proxy": "true"
    });
    console.log(data);
  } catch (error) {
    console.error(error.message);
    if (error.response) {
      console.error(error.response.data);
    }
  }
})();
Here's our result:

How does it feel knowing you can scrape just about any website? Awesome, right?
Conclusion
Puppeteer is a popular web scraping and automation tool. But its default properties make it easy for websites to detect and block your bot. Fortunately, Puppeteer Stealth lets you leverage its evasion modules to stay below the radar.
Yet, Puppeteer Stealth can't keep up with frequently evolving anti-bot measures, so it doesn't work against advanced protections like Cloudflare or DataDome. For these cases, consider solutions like ZenRows and use its free trial for your next project.