Playwright vs Puppeteer in 2023: which should you choose?

December 7, 2022 · 9 min read

The Playwright vs Puppeteer debate is a lively one, since both are excellent Node.js libraries for browser automation. Although they do pretty much the same thing, there are some notable differences between Puppeteer and Playwright.

Let's run through a quick history here:

Puppeteer was created by the Chrome DevTools team in 2017 to make up for Selenium's unreliability in browser automation. Playwright was later launched by Microsoft and, like Puppeteer, it's capable of running complex automation tests on a browser efficiently, while also introducing more tools into the testing environment.

So which one is the best?

Let's take a look at the differences between Puppeteer and Playwright to see what makes both libraries unique.

Playwright vs Puppeteer: what are the major differences?

Puppeteer and Playwright are both browser automation libraries originally designed for end-to-end testing of web apps, though they're also used for other purposes, such as web scraping. Although they have similar use cases, there are some key differences between the two tools:
  • Playwright officially supports JavaScript (and TypeScript), Python, Java and C# (.NET), with a community port for Golang, while Puppeteer officially supports only JavaScript, although there is an unofficial port for Python.
  • Playwright supports three browser engines: Chromium, Firefox and WebKit. Puppeteer, on the other hand, supports only Chromium.

Playwright

Playwright is an end-to-end web testing and automation library. Although the primary role of the framework is to test web applications, it's possible to use it for web scraping purposes.

What are the advantages of Playwright?

  • Through a single API, the library lets you drive Chromium, Firefox or WebKit for testing, as the sketch after this list shows. Besides that, the cross-platform framework runs fast on Windows, Linux and macOS.
  • Playwright supports JavaScript (and TypeScript), Python, Java and C# (.NET), plus a community port for Golang.
  • Playwright tends to run faster than comparable testing frameworks such as Cypress.
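To illustrate the single-API point from the first bullet, here's a minimal sketch that runs the same steps against all three engines; example.com stands in for your target page:

import { chromium, firefox, webkit } from 'playwright' 

// inside an async function (or a module with top-level await): 
for (const engine of [chromium, firefox, webkit]) { 
	const browser = await engine.launch() 
	const page = await browser.newPage() 
	await page.goto('https://example.com') // placeholder URL 
	console.log(engine.name(), await page.title()) 
	await browser.close() 
}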

What are the disadvantages of Playwright?

  • Playwright lacks support for Ruby.
  • Instead of real devices, Playwright uses desktop browsers to emulate mobile devices, as the sketch below shows.
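To see what that emulation looks like in practice, here's a minimal sketch using Playwright's built-in device descriptors; the device name and URL are just examples:

import { chromium, devices } from 'playwright' 

// inside an async function: 
const browser = await chromium.launch() 
// 'iPhone 13' is one of Playwright's built-in descriptors, chosen here as an example 
const context = await browser.newContext({ ...devices['iPhone 13'] }) 
const page = await context.newPage() 
await page.goto('https://example.com') // placeholder URL 
await browser.close()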

Playwright Browser Options

Browser options and page methods control the testing environment.
  • headless: determines whether the browser is visible during testing. By default it's set to true, so no browser window appears; change it to false to watch the browser during testing.
  • slowMo: slows down Playwright's operations by the given number of milliseconds. For example, a value of 500 delays each action by 500 milliseconds.
  • devtools: opens Chrome DevTools when launching the target page. Note this option works for Chromium only:
await playwright.chromium.launch({ devtools: true })
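And here's a sketch combining all three options in a single launch call:

await playwright.chromium.launch({ headless: false, slowMo: 500, devtools: true })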

Playwright Page Object Methods

Here are some methods for controlling the page once it's launched:

  • goto(): visits a page for the first time.
  • reload(): refreshes the page.
  • evaluate(): gives you a mini API to grab an element and manipulate it with JavaScript in the page's DOM from your Node.js code. Alternatively, you can use $eval(), $$eval(), $() and $$().
  • screenshot(): takes a screenshot of the page.
  • setDefaultTimeout(): sets how long the browser waits for an action before throwing an error.
  • keyboard.press(): lets you specify the key to press.
  • waitForSelector(): tells the page to delay an action until a particular selector has loaded.
  • locator(): grabs elements using multiple selector combinations.
  • click(): clicks the element matching the selector you specify.
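To see a few of these methods together, here's a minimal sketch meant to run inside an async function; the URL and selector are placeholders:

const browser = await playwright.chromium.launch() 
const page = await browser.newPage() 
page.setDefaultTimeout(10000) // wait up to 10 seconds for each action before erroring 
await page.goto('https://example.com') // placeholder URL 
await page.waitForSelector('h1') // delay until the heading has loaded 
await page.screenshot({ path: 'example.png' }) // save a screenshot 
await browser.close()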

Web Scraping with Playwright

As a quick tutorial to back up the Playwright vs Puppeteer debate, let's use Playwright to scrape the product titles, prices and image URLs from the Vue Storefront demo site and save the results in a CSV file.

Start by importing the Playwright module as well as the filesystem (fs) module to save the scraped data in a CSV file.

import playwright from 'playwright' // web scraping 
import fs from 'fs' // saving data to CSV

Remember to specify "type": "module" in the package.json file; otherwise, the import syntax won't work.
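A minimal package.json for this tutorial could look like this:

{ 
	"type": "module" 
}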


Since Playwright's API is asynchronous and the async-await syntax only works inside an asynchronous function, create an async main function and write the scraper code inside it.

const main = async () => { 
	// write some code 
} 
main()

The next step is to launch a browser and create a new page. Let's go ahead and launch Chromium in headed mode.

const browser = await playwright.chromium.launch({ headless: false })

You've opened the browser, congratulations, but we're only halfway there. Next, create a page object using the browser API's newPage() method.

const page = await browser.newPage()

To scrape Vue Storefront's product details, let's visit the "kitchen" category page and sort the items by "Newest".

await page.goto('https://demo.vuestorefront.io/c/kitchen?sort=NEWEST')

Alternatively, you can automate the scraper to locate and click one element at a time until it reaches the target page, as sketched below.

(Screenshot: inspecting the menu ID in the browser's DevTools.)
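Here's a hedged sketch of that click-through approach; the #kitchen selector is hypothetical, so check the menu's real ID in DevTools first:

// start from the home page, then click through to the category 
await page.goto('https://demo.vuestorefront.io/') 
await page.click('#kitchen') // hypothetical selector; Playwright auto-waits for the element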

Let's create a CSV file and write its header row, ready for the scraped title, price and image URL values.

fs.writeFileSync('products.csv', 'title,price,imageUrl\n')

You can locate the div elements containing the products using their CSS class selectors and store the resulting array of element handles in the products variable.

const products = await page.$$('.products__grid > .sf-product-card.products__product-card')

Using a for-of loop, extract the title, price and image URL from each product element, as shown below:

for (const product of products) { 
	let title, price, imageUrl 
	// extracting the target portions into title, price and image urls, respectively 
	title = await page.evaluate(e => e.querySelector('.sf-product-card__title').textContent.trim(), product) 
	price = await page.evaluate(e => e.querySelector('.sf-price__regular').textContent.trim(), product) 
	imageUrl = await page.evaluate(e => e.querySelector('.sf-image.sf-image-loaded').src, product) 
	// for every loop, append the extracted data into the CSV file 
	fs.appendFile('products.csv', `${title},${price},${imageUrl}\n`, e => { if (e) console.log(e) }) 
}

Close the browser, then run the script file with node index.js.

await browser.close()

And boom! There you have it, a perfectly scraped webpage using Playwright.

(Screenshot: the scraping output saved in products.csv.)

In case you got lost along the way, here's what the complete code looks like:

// in index.js 
 
// Import the modules: playwright (web scraping) and fs (saving data to CSV) 
import playwright from 'playwright' 
import fs from 'fs' 
 
// create asynchronous main function 
const main = async () => { 
	// launch a visible chromium browser 
	const browser = await playwright.chromium.launch({ headless: false }) 
 
	// create a new page object 
	const page = await browser.newPage() 
	// visit the target page 
	await page.goto('https://demo.vuestorefront.io/c/kitchen?sort=NEWEST') 
	// create a CSV file, in readiness to save the data we are about to scrape 
	fs.writeFileSync('products.csv', 'title,price,imageUrl\n') 
 
	// grab an array of element handles containing the target data 
	const products = await page.$$('.products__grid > .sf-product-card.products__product-card') 
	// loop through the array, 
	for (const product of products) { 
		let title, price, imageUrl 
		// dissecting the target portions into title, price and image urls, respectively 
		title = await page.evaluate(e => e.querySelector('.sf-product-card__title').textContent.trim(), product) 
		price = await page.evaluate(e => e.querySelector('.sf-price__regular').textContent.trim(), product) 
		imageUrl = await page.evaluate(e => e.querySelector('.sf-image.sf-image-loaded').src, product) 
		// for every loop, append the dissected data into the already created CSV file 
		fs.appendFile('products.csv', `${title},${price},${imageUrl}\n`, e => { if (e) console.log(e) }) 
	} 
	// close the browser when the mission is accomplished 
	await browser.close() 
} 
 
// don't forget to run the main() function 
main()

Puppeteer

Puppeteer is an automation library for JavaScript (Node.js) and, unlike Playwright, it downloads and uses Chromium by default. It's built closely around the Chrome DevTools Protocol, making it one of the go-to libraries for web scraping.

What are the advantages of Puppeteer?

Puppeteer makes it simple to get started with browser automation. It controls Chrome using the non-standard Chrome DevTools Protocol.

What are the disadvantages of Puppeteer?

  • Puppeteer supports only JavaScript (Node.js).
  • Although development of Firefox support is in progress, Puppeteer currently supports only Chromium.

Browser Options in Puppeteer

Most of Playwright's browser options, such as headless, slowMo and devtools, also work in Puppeteer:

await puppeteer.launch({ headless: false, slowMo: 500, devtools: true })

Page Object Methods in Puppeteer

Similarly, most of Playwright's page object methods work in Puppeteer. Here are some of them:

  • goto(): visits a page for the first time.
  • goForward(): goes forward in history.
  • goBack(): goes back to the previous page.
  • reload(): refreshes the page.
  • evaluate(): gives you a mini API to grab an element and manipulate it with JavaScript in the page's DOM from your Node.js code. Alternatively, you can use $eval(), $$eval(), $() and $$().
  • screenshot(): takes a screenshot of the page.
  • setDefaultTimeout() or setDefaultNavigationTimeout(): sets how long the browser waits for an action or navigation before throwing an error.
  • keyboard.press(): lets you specify the key to press.
  • waitForSelector(): tells the page to delay an action until a particular selector has loaded.
  • waitFor(): delays subsequent actions.
  • locator(): grabs elements using multiple selector combinations.
  • click(): clicks the element matching the selector you specify.
  • select(): picks an option in a select element.
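As a quick taste of these methods, here's a sketch that sorts, searches and waits for results; the #sort, #search and .results selectors are hypothetical:

await page.select('#sort', 'newest') // pick an option by its value 
await page.type('#search', 'kitchen') // type into a search box 
await page.keyboard.press('Enter') // submit with the keyboard 
await page.waitForSelector('.results') // wait until the results appear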

Web Scraping with Puppeteer

To scrape a webpage using Puppeteer, import the Puppeteer module for web scraping and the fs module for saving the scraped data into a CSV file.

import puppeteer from 'puppeteer' // web scraping 
import fs from 'fs' // saving scraped data 

Create an asynchronous main function to run the browser in.

const main = async () => { 
	// write some code 
} 
main()

Now launch the browser in headed mode and create a new page.

const browser = await puppeteer.launch({ headless: false }) 
const page = await browser.newPage()

Using the goto() method, visit the target page before scraping the data.

await page.goto('https://demo.vuestorefront.io/c/kitchen?sort=NEWEST')

Next, create a CSV file to store the scraped data.

fs.writeFileSync('products.csv', 'title,price,imageUrl\n')

Locate the product elements holding the target data on the web page.

const products = await page.$$('.products__grid > .sf-product-card.products__product-card')

Using the for-of loop, extract the title, price and image URL of the products before appending the data to the CSV file.

for (const product of products) { 
	let title, price, imageUrl 
	// extracting the target portions into title, price and image urls, respectively 
	title = await page.evaluate( e => e.querySelector('.sf-product-card__title').textContent.trim(), product) 
	price = await page.evaluate( e => e.querySelector('.sf-price__regular').textContent.trim(), product) 
	imageUrl = await page.evaluate( e => e.querySelector('.sf-image.sf-image-loaded').src, product) 
	// for every loop, append the extracted data into the CSV file 
	fs.appendFile('products.csv', `${title},${price},${imageUrl}\n`, e => { if (e) console.log(e) }) 
}

Lastly, close the browser and run the script with node index.js.

await browser.close()

Congratulations, you have just scraped a web page using Puppeteer. 😀

(Screenshot: web scraping with Puppeteer.)

Here's what the full code looks like:

// Import the modules: puppeteer (web scraping) and fs (saving data to CSV) 
import puppeteer from 'puppeteer' 
import fs from 'fs' 
 
// create asynchronous main function 
const main = async () => { 
	// launch a headed chromium browser 
	const browser = await puppeteer.launch({ headless: false }) 
 
	// create a new page object 
	const page = await browser.newPage() 
	// visit the target page 
	await page.goto('https://demo.vuestorefront.io/c/kitchen?sort=NEWEST') 
	// create a CSV file, in readiness to save the data we are about to scrape 
	fs.writeFileSync('products.csv', 'title,price,imageUrl\n') 
 
	// grab an array of element handles containing the target data 
	const products = await page.$$('.products__grid > .sf-product-card.products__product-card') 
	// loop through the array, 
	for (const product of products) { 
		let title, price, imageUrl 
		// dissecting the target portions into title, price and image urls, respectively 
		title = await page.evaluate( e => e.querySelector('.sf-product-card__title').textContent.trim(), product) 
		price = await page.evaluate( e => e.querySelector('.sf-price__regular').textContent.trim(), product) 
		imageUrl = await page.evaluate( e => e.querySelector('.sf-image.sf-image-loaded').src, product) 
		// for every loop, append the dissected data into the already created CSV file 
		fs.appendFile('products.csv', `${title},${price},${imageUrl}\n`, e => { if (e) console.log(e) }) 
	} 
	// close the browser when the mission is accomplished 
	await browser.close() 
} 
 
// don't forget to run the main() function 
main()

Playwright or Puppeteer: which is faster?

Comparing Puppeteer vs Playwright performance can get tricky, but let's find out which library comes out on top.

Let's create a third script file called performance.js, run the Playwright and Puppeteer code in it, and time how long each function takes to scrape the Vue Storefront data.

// in performance.js 
 
const playwrightPerformance = async () => { 
	// START THE TIMER 
	console.time('Playwright') 
	// Playwright scraping code 
	// END THE TIMER 
	console.timeEnd('Playwright') 
} 
 
const puppeteerPerformance = async () => { 
	// START THE TIMER 
	console.time('Puppeteer') 
	// Puppeteer scraping code 
	// END THE TIMER 
	console.timeEnd('Puppeteer') 
} 
 
// run the benchmarks one after the other so the timings don't overlap 
const run = async () => { 
	await playwrightPerformance() 
	await puppeteerPerformance() 
} 
run()

We'll insert the Playwright and Puppeteer scraping code into the respective functions, switch both to headless mode and then run the performance.js file five times to get the average runtime.

Here are the average durations per library:
  • Playwright ➡️ (7.580 + 7.372 + 6.639 + 7.411 + 7.390) / 5 = 36.392 / 5 = 7.2784s
  • Puppeteer ➡️ (6.656 + 6.653 + 6.856 + 6.592 + 6.839) / 5 = 33.596 / 5 = 6.7192s
(Screenshot: the console.time output for both libraries.)

And voilà, Puppeteer wins the Puppeteer vs Playwright debate in terms of speed!

It's worth noting that these results are based on our own test. If you feel like running yours, go ahead and use the mini-guide shared above.

Is Playwright better than Puppeteer?

There isn't a direct answer to which option is better, Puppeteer or Playwright, since it depends on multiple factors like long-term library support, cross-browser support and your specific browser automation needs.

Here are some of the notable features of Playwright and Puppeteer:

  • Supported languages: Playwright supports Python, Java, JavaScript and C#, while Puppeteer supports only JavaScript.
  • Supported browsers: Playwright supports Chromium, Firefox and WebKit, while Puppeteer supports only Chromium.
  • Speed: Playwright is fast, but Puppeteer is slightly faster.

A common problem in web scraping is that some websites detect bots and block your headless browser, especially when you click buttons and send many requests in quick succession. One solution is to introduce timers before subsequent actions.

For example, you can program Puppeteer to mimic a human user by waiting 0.1s after typing details into a login form before clicking the button. The downside of multiple timers is that they slow down your scraping, and many websites can still detect them.
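Here's a minimal Puppeteer sketch of that idea; the selectors and credentials are hypothetical:

await page.type('#username', 'user123') // hypothetical selector and value 
await page.type('#password', 'secret') // hypothetical selector and value 
await new Promise(r => setTimeout(r, 100)) // pause ~0.1s, like a human might 
await page.click('#login') // hypothetical selector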

ZenRows API solves this problem: it handles anti-bot and CAPTCHA bypasses for you, and that's just a small portion of what it's capable of. Take advantage of the free trial to find out why it's a holy grail for web scraping.


