Scale Web Scraping With Playwright BrowserContext

Yuvraj Chandra
May 28, 2025 · 3 min read

When scaling your scraper, running multiple Playwright browser instances on a single node is resource-intensive, inefficient, and unsustainable. Fortunately, Playwright's BrowserContext helps you manage resources by enabling multiple sessions within a single browser instance.

We'll show you how to use Playwright's BrowserContext, explore its pros and cons, and provide tips for scaling horizontally across several nodes.

Playwright BrowserContext: Advantages and Limitations

Playwright's BrowserContext is a core feature for scalable web scraping. Let's look at its advantages and limitations.

Advantages of BrowserContext

  • Playwright's BrowserContext allows you to create isolated sessions within a single browser instance.
  • Each session can then run multiple pages, with all pages in the same context (session) sharing cookies, local storage, and other session data.
  • Contexts share a single browser process but not session data, so each remains isolated from the others (see the sketch after this list).
  • Sharing one browser process makes contexts memory-efficient and faster to create than launching separate browser instances.
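
For instance, here's a minimal sketch showing that two contexts in one browser don't see each other's cookies (the example.com cookie is purely illustrative):

Example
# pip3 install playwright
# playwright install
from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch()

    # two isolated contexts inside one browser process
    context_a = browser.new_context()
    context_b = browser.new_context()

    # add a cookie to context A only
    context_a.add_cookies(
        [{"name": "session_id", "value": "abc123", "url": "https://example.com"}]
    )

    # context A sees the cookie; context B doesn't
    print(len(context_a.cookies()))  # 1
    print(len(context_b.cookies()))  # 0

    browser.close()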

Limitations of BrowserContext

  • Since Playwright's contexts share the same browser process, a browser crash causes all contexts within that instance to fail.
  • Running too many contexts can increase memory consumption on the host machine. This is like opening thousands of tabs on your Chrome browser, which will most likely cause it to freeze.
  • BrowserContext isn't ideal when browser-level configurations like User Agent and fingerprint variations are required.
  • Contexts are more susceptible to anti-bot detection since they share the same browser process.

How to Use Playwright's BrowserContext

Let's see how to use Playwright's BrowserContext using the Ecommerce Challenge page as the target site.

Let's start with a single context that opens three pages.

First, import Playwright and start a new browser instance in sync mode. Print each page's title to confirm that the context works:

Example
# pip3 install playwright
# playwright install
from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    # launch browser
    browser = p.chromium.launch()

    # create a browser context
    context = browser.new_context()

    # open a new tab (page) for each URL
    page1 = context.new_page()
    page1.goto("https://www.scrapingcourse.com/ecommerce/")
    print(page1.title())

    page2 = context.new_page()
    page2.goto("https://www.scrapingcourse.com/ecommerce/page/2/")
    print(page2.title())

    page3 = context.new_page()
    page3.goto("https://www.scrapingcourse.com/ecommerce/page/3/")
    print(page3.title())

    # close the browser context
    context.close()

    # Close browser
    browser.close()

The above code outputs each page's title, as shown:

Output
Ecommerce Test Site to Learn Web Scraping - ScrapingCourse.com
Ecommerce Test Site to Learn Web Scraping - Page 2 - ScrapingCourse.com
Ecommerce Test Site to Learn Web Scraping - Page 3 - ScrapingCourse.com

When you run the browser instance in GUI mode with headless=False, you'll see that it opens a separate tab for each URL.
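
For example, changing the launch line in the snippet above as follows opens a visible Chromium window so you can watch the tabs open:

Example
# ...

# launch Chromium with a visible window instead of the default headless mode
browser = p.chromium.launch(headless=False)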

For scalability, you can distribute your URLs across several browser contexts. Let's see how it works with two contexts:

First, create a function that creates a new context per URL batch:

Example
# ...

# create a new context per URL batch
def process_batch_urls_in_context(browser, url_batch):
    # create a new BrowserContext
    context = browser.new_context()
    for url in url_batch:
        # create a new page for each URL
        page = context.new_page()

        # navigate to the URL
        page.goto(url)
        print(page.title())
    # close the context
    context.close()

Next, create a list of URLs and split it into two batches. Pass each batch to a separate browser context using the previous batching function:

Example
# ...

with sync_playwright() as p:
    # Launch a single browser instance
    browser = p.chromium.launch()

    # list of URLs to scrape
    urls = [
        "https://www.scrapingcourse.com/ecommerce/",
        "https://www.scrapingcourse.com/ecommerce/page/2/",
        "https://www.scrapingcourse.com/ecommerce/page/3/",
        "https://www.scrapingcourse.com/ecommerce/page/4/",
        "https://www.scrapingcourse.com/ecommerce/page/5/",
    ]

    # split the URLs between two contexts
    urls_context1 = urls[: len(urls) // 2]
    urls_context2 = urls[len(urls) // 2 :]

    # process URLs in context 1
    process_batch_urls_in_context(browser, urls_context1)

    # process URLs in context 2
    process_batch_urls_in_context(browser, urls_context2)

    # close the browser
    browser.close()

Here's the complete code:

Example
# pip3 install playwright
# playwright install
from playwright.sync_api import sync_playwright


# create a new context per URL batch
def process_batch_urls_in_context(browser, url_batch):
    # create a new BrowserContext
    context = browser.new_context()
    for url in url_batch:
        # create a new page for each URL
        page = context.new_page()

        # navigate to the URL
        page.goto(url)
        print(page.title())
    # close the context
    context.close()


with sync_playwright() as p:
    # Launch a single browser instance
    browser = p.chromium.launch(headless=False)

    # list of URLs to scrape
    urls = [
        "https://www.scrapingcourse.com/ecommerce/",
        "https://www.scrapingcourse.com/ecommerce/page/2/",
        "https://www.scrapingcourse.com/ecommerce/page/3/",
        "https://www.scrapingcourse.com/ecommerce/page/4/",
        "https://www.scrapingcourse.com/ecommerce/page/5/",
    ]

    # split the URLs between two contexts
    urls_context1 = urls[: len(urls) // 2]
    urls_context2 = urls[len(urls) // 2 :]

    # process URLs in context 1
    process_batch_urls_in_context(browser, urls_context1)

    # process URLs in context 2
    process_batch_urls_in_context(browser, urls_context2)

    # close the browser
    browser.close()

The above code splits the URLs into two batches and assigns each batch to its own browser context. The page titles appear sequentially, showing that the batched jobs still don't run concurrently.

Output
Ecommerce Test Site to Learn Web Scraping - ScrapingCourse.com
Ecommerce Test Site to Learn Web Scraping - Page 2 - ScrapingCourse.com
Ecommerce Test Site to Learn Web Scraping - Page 3 - ScrapingCourse.com
Ecommerce Test Site to Learn Web Scraping - Page 4 - ScrapingCourse.com
Ecommerce Test Site to Learn Web Scraping - Page 5 - ScrapingCourse.com

Let's take the above code further with concurrency. Run Playwright in asynchronous mode and schedule the batches concurrently with Python's asyncio:

Example
# pip3 install playwright
# playwright install
import asyncio
from playwright.async_api import async_playwright


# create a new context per URL batch
async def process_batch_urls_in_context(browser, url_batch):
    # create a new BrowserContext
    context = await browser.new_context()
    for url in url_batch:
        # create a new page for each URL
        page = await context.new_page()

        # navigate to the URL
        await page.goto(url)
        print(await page.title())
    # close the context
    await context.close()


# main function
async def scraper():
    async with async_playwright() as p:
        # launch a single browser instance
        browser = await p.chromium.launch()

        # list of URLs to scrape
        urls = [
            "https://www.scrapingcourse.com/ecommerce/",
            "https://www.scrapingcourse.com/ecommerce/page/2/",
            "https://www.scrapingcourse.com/ecommerce/page/3/",
            "https://www.scrapingcourse.com/ecommerce/page/4/",
            "https://www.scrapingcourse.com/ecommerce/page/5/",
        ]

        # split the URLs between two contexts
        urls_context1 = urls[: len(urls) // 2]
        urls_context2 = urls[len(urls) // 2 :]

        # run both contexts concurrently
        await asyncio.gather(
            process_batch_urls_in_context(browser, urls_context1),
            process_batch_urls_in_context(browser, urls_context2),
        )

        # close the browser
        await browser.close()


# run the async scraper function
asyncio.run(scraper())

The above code outputs the page titles, as shown below. The page numbers may appear out of order this time since the batches now run concurrently:

Output
Ecommerce Test Site to Learn Web Scraping - ScrapingCourse.com
Ecommerce Test Site to Learn Web Scraping - Page 3 - ScrapingCourse.com
Ecommerce Test Site to Learn Web Scraping - Page 2 - ScrapingCourse.com
Ecommerce Test Site to Learn Web Scraping - Page 4 - ScrapingCourse.com
Ecommerce Test Site to Learn Web Scraping - Page 5 - ScrapingCourse.com

Nice! You just executed batched scraping jobs asynchronously with separate BrowserContexts. However, you'll still face resource constraints, as these batches become memory-intensive at scale.
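
Before moving to a distributed setup, one common way to keep memory bounded on a single machine is to cap how many contexts run at once with asyncio.Semaphore. Here's a minimal sketch built on the async process_batch_urls_in_context function above; the limit of 2 and the url_batches list are illustrative assumptions:

Example
# ...

# allow at most two contexts to be open at the same time (illustrative limit)
semaphore = asyncio.Semaphore(2)


# wrap the existing batch function so each batch waits for a free slot
async def process_batch_with_limit(browser, url_batch):
    async with semaphore:
        await process_batch_urls_in_context(browser, url_batch)


# inside scraper(), gather as many batches as you need;
# only two contexts will ever be open at once:
#
# await asyncio.gather(
#     *(process_batch_with_limit(browser, batch) for batch in url_batches)
# )

Even with a cap in place, a single machine eventually hits its limits. You'll see a more scalable solution in the next section.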

Scrape at Scale Using ZenRows' Scraping Browser

Managing multiple contexts on a single node becomes unsustainable over time, as they can heavily weigh down system memory. Splitting contexts across separate browser instances may seem like a solution, but it still results in memory-intensive processes running concurrently on the same machine and leads to similar performance bottlenecks.

The solution to all these limitations is to run your Playwright browser remotely on a cloud solution like the ZenRows Scraping Browser. This way, each instance manages its own process, reducing the chances of detection. The Scraping Browser also routes requests through rotating proxies and allows you to specify a geographic location, preventing IP bans and geo-restrictions.

Depending on your plan, the ZenRows Scraping Browser lets you run between 20 and 150 concurrent browsers and distributes those instances across several nodes. You can also manage multiple contexts across the instances on each node without impacting your local machine, so you can scale horizontally without infrastructure constraints or setup management.

Let's see how the ZenRows Scraping Browser works with Playwright using the same target website.

Sign up on ZenRows and go to the Scraping Browser Builder. Then, copy and paste the browser connection URL into your Playwright scraper.

ZenRows scraping browser

Use the connection URL to connect Playwright to the remote browser over the Chrome DevTools Protocol (CDP). Update the previous scraper like so:

Example
# pip3 install playwright
# playwright install
import asyncio
from playwright.async_api import async_playwright

# create a new context per URL batch
async def process_batch_urls_in_context(browser, url_batch):
    # create a new BrowserContext
    context = await browser.new_context()
    for url in url_batch:
        # create a new page for each URL
        page = await context.new_page()

        # navigate to the URL
        await page.goto(url)
        print(await page.title())
    # close the context
    await context.close()

connection_url = (
    "wss://browser.zenrows.com?apikey=<YOUR_ZENROWS_API_KEY>"
)


async def scraper():
    async with async_playwright() as p:
        # connect to the browser using CDP (Chrome DevTools Protocol)
        browser = await p.chromium.connect_over_cdp(connection_url)

        # list of URLs to scrape
        urls = [
            "https://www.scrapingcourse.com/ecommerce/",
            "https://www.scrapingcourse.com/ecommerce/page/2/",
            "https://www.scrapingcourse.com/ecommerce/page/3/",
            "https://www.scrapingcourse.com/ecommerce/page/4/",
            "https://www.scrapingcourse.com/ecommerce/page/5/",
        ]

        # split the URLs between two contexts
        urls_context1 = urls[: len(urls) // 2]
        urls_context2 = urls[len(urls) // 2 :]

        # run both contexts concurrently
        await asyncio.gather(
            process_batch_urls_in_context(browser, urls_context1),
            process_batch_urls_in_context(browser, urls_context2),
        )

        # close the browser
        await browser.close()

# run the async scraper function
asyncio.run(scraper())

Congratulations! Your Playwright scraper is now set up for scalability with the ZenRows Scraping Browser. From here, you can pool several browser instances to scrape multiple pages concurrently in batches without impacting your local machine.
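
For illustration, here's a rough sketch of what such a pool could look like, reusing connection_url and process_batch_urls_in_context from the previous example. The scrape_with_browser_pool helper and the assumption that each connect_over_cdp call provisions its own remote browser (within your plan's concurrency limit) are ours, not documented behavior:

Example
# ...

# hypothetical helper: one remote browser per URL batch, all running in the cloud
async def scrape_with_browser_pool(p, url_batches):
    async def run_batch(url_batch):
        # assumption: each CDP connection provisions a separate remote browser
        browser = await p.chromium.connect_over_cdp(connection_url)
        try:
            await process_batch_urls_in_context(browser, url_batch)
        finally:
            await browser.close()

    # run one remote browser per batch concurrently
    await asyncio.gather(*(run_batch(batch) for batch in url_batches))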

Conclusion

You've learned how to use Playwright's BrowserContext to scale your scraper, including its pros and cons. While contexts help you manage limited resources by sharing a browser process, scaling with them is unsustainable, especially when running on a single node.

To scale efficiently without any pressure on your local machine, we recommend using the ZenRows Scraping Browser.

Try ZenRows for free!
