Playwright vs. Puppeteer in 2024: Which Should You Choose?

July 18, 2024 · 9 min read

The Playwright vs. Puppeteer debate comes up often because both are excellent Node.js libraries for browser automation. Although they do largely the same job, Puppeteer and Playwright have some notable differences.

Let's run through a quick history first: the Chrome DevTools team created Puppeteer in 2017 to address Selenium's unreliability for browser automation. Microsoft later launched Playwright, which, like Puppeteer, can efficiently run complex automated tests in a browser, but with more functionality this time around.

So which one is the best?

Let's look at Puppeteer and Playwright's differences to see what makes each library unique.

Playwright vs. Puppeteer: What Are the Major Differences?

Puppeteer and Playwright are browser automation libraries originally designed for end-to-end automated testing of web apps. They're used for other purposes as well, such as web scraping.

Although they have similar use cases, some key differences between the automation tools are:

  • Playwright supports Python, Java, JavaScript, TypeScript, and C#, while Puppeteer officially supports only JavaScript (plus Pyppeteer, an unofficial Python port).
  • Playwright supports three browser engines: Chromium, Firefox, and WebKit, while Puppeteer supports only Chromium.

Playwright

Playwright is an end-to-end web testing and automation library. Although the framework's primary role is to test web applications, it can also be used for web scraping purposes.

What Are the Advantages of Playwright?

  • Through a single API, the library lets you use Chromium, Firefox, or WebKit for testing (see the sketch after this list). Besides that, the cross-platform framework runs smoothly on Windows, Linux, and macOS, locally or in CI.
  • Playwright supports Python, TypeScript, Java, JavaScript, and C#.
  • Playwright generally runs faster than comparable testing frameworks such as Cypress.
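
To illustrate the single-API point, here's a minimal sketch that runs the same check in all three engines. It assumes you've installed Playwright and its browsers (npx playwright install); the URL is just a placeholder:

const { chromium, firefox, webkit } = require("playwright");

(async () => {
  // the same code drives Chromium, Firefox, and WebKit
  for (const browserType of [chromium, firefox, webkit]) {
    const browser = await browserType.launch();
    const page = await browser.newPage();
    await page.goto("https://example.com");
    console.log(`${browserType.name()}: ${await page.title()}`);
    await browser.close();
  }
})();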

What Are the Disadvantages of Playwright?

  • Playwright lacks support for Ruby, PHP, and Golang.
  • Instead of real devices, Playwright uses desktop browsers to emulate mobile devices (see the emulation sketch below).
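
For context, here's a minimal sketch of that emulation using Playwright's built-in device descriptors. The device name and URL are just examples; no real phone is involved:

const { chromium, devices } = require("playwright");

(async () => {
  const browser = await chromium.launch();
  // emulate an iPhone (viewport, user agent, touch) with a desktop Chromium build
  const context = await browser.newContext({ ...devices["iPhone 13"] });
  const page = await context.newPage();
  await page.goto("https://example.com");
  console.log(await page.evaluate(() => navigator.userAgent));
  await browser.close();
})();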

Playwright Browser Options

Browser options and page methods control the testing environment.

  • Headless: This determines whether the browser window is visible during testing. By default, the value is set to true, so the browser runs headless. Change it to false to watch the browser during testing.
  • SlowMo: This slows down Playwright's operations by the given number of milliseconds. For example, a value of 500 delays each action by 500 milliseconds.
  • DevTools: You can open Chrome DevTools automatically when the browser launches. Note that this option works only with Chromium. A combined example with all three options follows the snippet below.
index.js
await playwright.chromium.launch({ devtools: true })
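
Here's a minimal sketch combining all three options in a single launch call; the slowMo value and target URL are arbitrary examples:

const { chromium } = require("playwright");

(async () => {
  const browser = await chromium.launch({
    headless: false, // show the browser window
    slowMo: 500, // pause 500 ms between actions
    devtools: true, // open DevTools automatically (Chromium only)
  });
  const page = await browser.newPage();
  await page.goto("https://example.com");
  await browser.close();
})();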

Playwright Page Object Methods

Here are some methods to control the launched page, followed by a short example.

Object Method | Meaning
goto() | Visit a page for the first time.
reload() | Refresh the page.
evaluate() | Execute JavaScript code within the web page's context and return the result to your Node.js environment, letting you interact with the DOM. Alternatively, you can use $eval(), $$eval(), $(), and $$().
screenshot() | Take a screenshot of the page.
setDefaultTimeout() | Make the headless browser wait for an action for a specified duration before throwing an error.
keyboard.press() | Press the specified key.
waitForSelector() | Delay actions until a particular selector has loaded.
locator() | Grab elements using one or more selector combinations.
click() | Click the element matching the specified selector.
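
Here's a minimal sketch tying a few of these methods together; the URL, selector, and file name are placeholders:

const { chromium } = require("playwright");

(async () => {
  const browser = await chromium.launch();
  const page = await browser.newPage();
  page.setDefaultTimeout(10000); // fail any action that takes longer than 10 seconds

  await page.goto("https://example.com");
  await page.waitForSelector("h1"); // wait until the heading is in the DOM
  const heading = await page.$eval("h1", (el) => el.innerText); // run JS in the page
  console.log(heading);

  await page.screenshot({ path: "page.png" }); // capture the page
  await browser.close();
})();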

Web Scraping With Playwright

As a quick tutorial to back up the Playwright vs. Puppeteer debate, let's use Playwright to scrape the product titles, prices, and image URLs from the ScrapingCourse JS Rendering demo page (an e-commerce page rendered with JavaScript) and save the results in a CSV file.

Start by importing the Playwright and filesystem modules.

scraper.js
const { chromium } = require("playwright");
const fs = require("fs");

Since Playwright's API is asynchronous and the await keyword can only be used inside an async function, create an asynchronous function and write your scraping logic inside it.

scraper.js
const { chromium } = require("playwright");
const fs = require("fs");


(async () => {
	// write your scraping logic here
})();

Let's write our scraping logic now!

Launch the Chromium browser and create a new context. Next, create a page object using the context's newPage() method.

scraper.js
// ...


(async () => {
  // launch a Chromium browser
  const browser = await chromium.launch();
  const context = await browser.newContext();


  // create a new page
  const page = await context.newPage();
})();

To scrape the elements in your script, you first need to define your CSS selector strategy.

Open the target URL in your browser and inspect it with DevTools. You'll notice that all the products are enclosed within a product-grid div, and the product details (product-name, product-price, and product-image) are inside individual product-item divs.

We'll use this information in the next steps.

[Screenshot: the product-grid element and its product-item children in DevTools]
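
If you want to sanity-check those selectors before writing any scraping code, you can paste something like this into the DevTools console on the target page (this is just a quick check, not part of the scraper):

// run in the browser console on the target page
document.querySelectorAll("#product-grid .product-item").length; // should equal the number of visible products
document.querySelector("#product-grid .product-name").innerText; // first product's name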

To scrape the target page's product details, navigate to the URL and wait for the product grid to load.

scraper.js
// ...


(async () => {
  // ...


  // navigate to the target web page
  await page.goto("https://www.scrapingcourse.com/javascript-rendering", {
    waitUntil: "networkidle",
  });


  // wait for the product grid to load
  await page.waitForSelector("#product-grid .product-item", { timeout: 5000 });

})();

Using the CSS selectors, grab all the product cards, extract each product's name, price, and image URL, and store the resulting array in the products variable.

scraper.js
// ...


(async () => {
  // ...


  // extract product details
  const products = await page.$$eval("#product-grid .product-item", (items) => {
    return items.map((item) => {
      let name = item.querySelector(".product-name").innerText.trim();
      let price = item.querySelector(".product-price").innerText.trim();
      let imageUrl = item.querySelector(".product-image").src;
      return { name, price, imageUrl };
    });
  });
})();

Finally, store the extracted data in a CSV file and close the browser.

scraper.js
// ...


(async () => {
  // ...


  // specify the CSV headers
  const headers = ["name", "price", "imageUrl"];


  // add the headers to the CSV file
  let csvData = headers.join(",") + "\n";


  // create CSV-formatted strings
  products.forEach((product) => {
    const row = Object.values(product).join(",");
    csvData += row + "\n";
  });


  // write the extracted data to a CSV
  fs.writeFile("products.csv", csvData, (err) => {
    if (err) {
      console.error("Error writing CSV file:", err);
    } else {
      console.log("CSV file written successfully.");
    }
  });


  // close the browser
  await browser.close();
})();
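
One caveat: joining fields with bare commas breaks the CSV if a product name itself contains a comma. If that happens on your target page, a small quoting helper like the one below keeps the file valid (escapeCsv is a hypothetical name, not part of the tutorial code):

// wrap each field in quotes and double up any embedded quotes
const escapeCsv = (value) => `"${String(value).replace(/"/g, '""')}"`;

// then build each row inside the forEach loop like this instead
const row = Object.values(product).map(escapeCsv).join(",");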

Here's what the complete code looks like.

scraper.js
const { chromium } = require("playwright");
const fs = require("fs");


(async () => {
  // launch a Chromium browser
  const browser = await chromium.launch();
  const context = await browser.newContext();


  // create a new page
  const page = await context.newPage();


  // navigate to the target web page
  await page.goto("https://www.scrapingcourse.com/javascript-rendering", {
    waitUntil: "networkidle",
  });


  // wait for the product grid to load
  await page.waitForSelector("#product-grid .product-item", { timeout: 5000 });


  // extract product details
  const products = await page.$$eval("#product-grid .product-item", (items) => {
    return items.map((item) => {
      let name = item.querySelector(".product-name").innerText.trim();
      let price = item.querySelector(".product-price").innerText.trim();
      let imageUrl = item.querySelector(".product-image").src;
      return { name, price, imageUrl };
    });
  });


  // specify the CSV headers
  const headers = ["name", "price", "imageUrl"];


  // add the headers to the CSV file
  let csvData = headers.join(",") + "\n";


  // create CSV-formatted strings
  products.forEach((product) => {
    const row = Object.values(product).join(",");
    csvData += row + "\n";
  });


  // write the extracted data to a CSV
  fs.writeFile("products.csv", csvData, (err) => {
    if (err) {
      console.error("Error writing CSV file:", err);
    } else {
      console.log("CSV file written successfully.");
    }
  });


  // close the browser
  await browser.close();
})();

Run the script, and you'll get the following product data in your exported CSV file:

[Screenshot: the product data exported to products.csv]

And there you have it, a perfectly scraped web page using Playwright!

Puppeteer

Puppeteer is a browser automation library for JavaScript (Node.js). Unlike Playwright, it downloads a compatible version of Chromium automatically when you install it. It's built around the Chrome DevTools Protocol, which makes it one of the go-to libraries for web scraping.

What Are the Advantages of Puppeteer?

Puppeteer makes it easy to get started with browser automation. It controls Chrome through the DevTools Protocol, which gives it fine-grained, low-level access to the browser.
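
As a quick illustration of that low-level access, here's a minimal sketch that opens a raw DevTools Protocol session and asks Chrome for performance metrics. The exact CDP-session API can vary slightly between Puppeteer versions, so treat this as a sketch rather than a drop-in snippet:

const puppeteer = require("puppeteer");

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.goto("https://example.com");

  // open a raw Chrome DevTools Protocol session for this page
  const client = await page.target().createCDPSession();
  await client.send("Performance.enable");
  const { metrics } = await client.send("Performance.getMetrics");
  console.log(metrics.slice(0, 5)); // print a few raw DevTools metrics

  await browser.close();
})();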

What Are the Disadvantages of Puppeteer?

  • Puppeteer supports only JavaScript (Node.js).
  • Puppeteer currently supports only Chromium (although Firefox support is under development).

Browser Options in Puppeteer

Most of Playwright's browser options, such as Headless, SlowMo, and DevTools, work in Puppeteer.

index.js
await puppeteer.launch({ headless: false, slowMo: 500, devtools: true })

Page Object Methods in Puppeteer

Most of Playwright's page object methods also work in Puppeteer. Here are some of them, followed by a short example.

Object Method | Meaning
goto() | Visit a page for the first time.
goForward() | Go forward to the next page.
goBack() | Go back to the previous page.
reload() | Refresh the page.
evaluate() | Execute JavaScript code within the web page's context and return the result to your Node.js environment, letting you interact with the DOM. Alternatively, you can use $eval(), $$eval(), $(), and $$().
screenshot() | Take a screenshot of the page.
setDefaultTimeout() or setDefaultNavigationTimeout() | Make the headless browser wait for an action for a specified duration before throwing an error.
keyboard.press() | Press the specified key.
waitForSelector() | Delay actions until a particular selector has loaded.
waitFor() | Delay subsequent actions (deprecated in recent Puppeteer versions).
locator() | Grab elements using one or more selector combinations.
click() | Click the element matching the specified selector.
select() | Pick an option in a select element.
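
As with Playwright, here's a minimal sketch chaining a few of these methods; the URL, selector, and file name are placeholders:

const puppeteer = require("puppeteer");

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  page.setDefaultTimeout(10000); // fail any action that takes longer than 10 seconds

  await page.goto("https://example.com");
  await page.waitForSelector("h1"); // wait until the heading is in the DOM
  await page.keyboard.press("End"); // jump to the bottom of the page
  await page.screenshot({ path: "puppeteer-page.png" });

  await browser.close();
})();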

Web Scraping With Puppeteer

To scrape a web page using Puppeteer, import the Puppeteer module for web scraping and the fs module for saving the scraped data into a CSV file.

scraper.js
const puppeteer = require("puppeteer");
const fs = require("fs");

Create an asynchronous function to run the headless browser.

scraper.js
const puppeteer = require("puppeteer");
const fs = require("fs");


(async () => {
  // write your scraping logic here
})();

Now, launch the headless browser and create a new page.

scraper.js
// ...


(async () => {
  // launch a Chromium browser
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
})();

Using the goto() method, visit the target page and wait for the product grid to load before scraping the data.

scraper.js
// ...


(async () => {
  // ...


  // navigate to the target web page
  await page.goto("https://www.scrapingcourse.com/javascript-rendering", {
    waitUntil: "networkidle0", // Puppeteer uses 'networkidle0' instead of 'networkidle'
  });


  // wait for the product grid to load
  await page.waitForSelector("#product-grid .product-item", { timeout: 5000 });


})();

Extract the product title, price, and image URL before appending the data to the CSV file.

scraper.js
// ...


(async () => {
  // ...


  // extract product details
  const products = await page.$$eval("#product-grid .product-item", (items) => {
    return items.map((item) => {
      let name = item.querySelector(".product-name").innerText.trim();
      let price = item.querySelector(".product-price").innerText.trim();
      let imageUrl = item.querySelector(".product-image").src;
      return { name, price, imageUrl };
    });
  });


})();

Finally, export data to a CSV file and close the browser.

scraper.js
// ...


(async () => {
  // ...


  // specify the CSV headers
  const headers = ["name", "price", "imageUrl"];


  // add the headers to the CSV file
  let csvData = headers.join(",") + "\n";


  // create CSV-formatted strings
  products.forEach((product) => {
    const row = Object.values(product).join(",");
    csvData += row + "\n";
  });


  // write the extracted data to a CSV
  fs.writeFile("products.csv", csvData, (err) => {
    if (err) {
      console.error("Error writing CSV file:", err);
    } else {
      console.log("CSV file written successfully.");
    }
  });


  // close the browser
  await browser.close();
})();

Here's what the complete code looks like:

scraper.js
const puppeteer = require("puppeteer");
const fs = require("fs");


(async () => {
  // launch a Chromium browser
  const browser = await puppeteer.launch();
  const page = await browser.newPage();


  // navigate to the target web page
  await page.goto("https://www.scrapingcourse.com/javascript-rendering", {
    waitUntil: "networkidle0", // Puppeteer uses 'networkidle0' instead of 'networkidle'
  });


  // wait for the product grid to load
  await page.waitForSelector("#product-grid .product-item", { timeout: 5000 });


  // extract product details
  const products = await page.$$eval("#product-grid .product-item", (items) => {
    return items.map((item) => {
      let name = item.querySelector(".product-name").innerText.trim();
      let price = item.querySelector(".product-price").innerText.trim();
      let imageUrl = item.querySelector(".product-image").src;
      return { name, price, imageUrl };
    });
  });


  // specify the CSV headers
  const headers = ["name", "price", "imageUrl"];


  // add the headers to the CSV file
  let csvData = headers.join(",") + "\n";


  // create CSV-formatted strings
  products.forEach((product) => {
    const row = Object.values(product).join(",");
    csvData += row + "\n";
  });


  // write the extracted data to a CSV
  fs.writeFile("products.csv", csvData, (err) => {
    if (err) {
      console.error("Error writing CSV file:", err);
    } else {
      console.log("CSV file written successfully.");
    }
  });


  // close the browser
  await browser.close();
})();

Run the script, and you'll get the same product data in your exported CSV file:

[Screenshot: the product data exported to products.csv]

Congratulations, you have just scraped a web page using Puppeteer.

Playwright or Puppeteer: Which Is Faster?

Comparing Puppeteer vs. Playwright performance can get tricky, but let's find out which library comes out on top.

Let's create a third script file called performance.js and put both the Playwright and the Puppeteer scraping code in it. We'll time how long each function takes to scrape the product data from the ScrapingCourse JS Rendering demo page.

performance.js
const playwrightPerformance = async () => {
  // start the timer
  console.time("Playwright");
  // Playwright scraping code goes here
  // end the timer
  console.timeEnd("Playwright");
};

const puppeteerPerformance = async () => {
  // start the timer
  console.time("Puppeteer");
  // Puppeteer scraping code goes here
  // end the timer
  console.timeEnd("Puppeteer");
};

// run the two benchmarks sequentially so they don't compete for resources
(async () => {
  await playwrightPerformance();
  await puppeteerPerformance();
})();

Let's insert the Playwright and Puppeteer scraping code into the respective functions, make sure both run in headless mode, and then run the performance.js file five times to get the average runtime.
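
If you'd rather have the script compute the average for you, here's a minimal sketch of a timing helper (averageRuntime is a hypothetical name; playwrightPerformance and puppeteerPerformance are the functions defined above):

// time an async task over several runs and report the average in seconds
async function averageRuntime(label, task, runs = 5) {
  let total = 0;
  for (let i = 0; i < runs; i++) {
    const start = Date.now();
    await task();
    total += Date.now() - start;
  }
  console.log(`${label} average: ${(total / runs / 1000).toFixed(3)}s`);
}

(async () => {
  await averageRuntime("Playwright", playwrightPerformance);
  await averageRuntime("Puppeteer", puppeteerPerformance);
})();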

Here are the average durations per library:

  • Playwright: (7.580 + 7.372 + 6.639 + 7.411 + 7.390) / 5 = 36.392 / 5 = 7.2784 s
  • Puppeteer: (6.656 + 6.653 + 6.856 + 6.592 + 6.839) / 5 = 33.596 / 5 = 6.7192 s

[Chart: Playwright vs. Puppeteer average runtime comparison]

And voilà, Puppeteer wins the Puppeteer vs. Playwright debate in terms of speed!

It's worth noting that these results are based on our own test. If you feel like running yours, go ahead and use the mini-guide shared above.

Is Playwright Better Than Puppeteer?

Overall, no comparison between Puppeteer and Playwright will give you a definitive answer as to which is the better option. It depends on multiple factors, such as long-term library support, cross-browser support, and your specific need for browser automation.

Here are some of the notable features of Playwright and Puppeteer:

Feature | Playwright | Puppeteer
Supported languages | Python, Java, JavaScript, TypeScript, and C# | JavaScript
Supported browsers | Chromium, Firefox, and WebKit | Chromium
Speed | Fast | Faster

Conclusion

As you can see, both Playwright and Puppeteer have their advantages, so you should consider the specifics of your scraping project and personal needs before making a decision.

However, the common problem with using both libraries for web scraping is that many websites detect headless browsers as bots, rendering your Playwright- or Puppeteer-based scraper useless.

ZenRows' web scraping API solves this problem perfectly. It handles all anti-bot and CAPTCHA bypasses for you, and that's just a small portion of what it can do. Take advantage of the free trial and discover what it's like to scrape fast and uninterrupted.
