How to Bypass Akamai With Playwright

June 20, 2024 · 8 min read

Is your Playwright scraping script getting blocked by Akamai? No wonder. Akamai's strong security measures protect websites from unwanted bots and successfully block scrapers’ access.

In this guide, you'll explore the best strategies to bypass Akamai's defenses using Playwright.

Why Is Base Playwright Not Enough to Bypass Akamai?

While Playwright is a powerful tool for web scraping, its base version alone isn't enough to bypass Akamai's advanced security measures.

Playwright is an open-source framework that automates web interactions. It excels at rendering web pages, handling JavaScript, and managing cookies and sessions, so it's a popular choice for web scraping.

However, its basic version doesn't offer any methods to deal with Akamai's detection techniques. Akamai uses CAPTCHAs and recognizes suspicious IP addresses, non-human behavior patterns, as well as automated browsers' default settings.

For instance, let's try to access the Akamai-protected Similarweb comparison page:

Akamai-protected Similarweb comparison page

The following Playwright script navigates to the Similarweb webpage and takes a screenshot:

script.py
# pip install playwright
# playwright install
import asyncio
from playwright.async_api import async_playwright
 
async def main():
    # launch the Playwright instance
    async with async_playwright() as p:

        # launch the browser
        browser = await p.chromium.launch()

        # create a new context 
        context = await browser.new_context()

        # create a new page within the context
        page = await context.new_page()

        # navigate to the desired URL
        await page.goto("https://www.similarweb.com/website/facebook.com/")

        # wait for any dynamic content to load
        await page.wait_for_load_state("networkidle")
    
        # take a screenshot of the page
        await page.screenshot(path="screenshot.png")
        print("Screenshot captured!")

        # close the browser
        await browser.close()
 
asyncio.run(main())

Akamai denies access to your script:

Access Denied screenshot

As you can see, simply using Playwright without additional strategies doesn't let you access the desired web page.
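
Instead of inspecting the screenshot by eye, a script can also check the navigation response itself. The sketch below assumes a block shows up as an HTTP 403 status or as "Access Denied" text in the page body, as in the screenshot above. Real Akamai responses may differ, and `looks_blocked` and `check` are hypothetical helper names, not part of Playwright.

```python
def looks_blocked(status: int, body: str) -> bool:
    """Heuristic (assumption): a 403 status or an 'Access Denied' body means a block."""
    return status == 403 or "access denied" in body.lower()


async def check(url: str) -> bool:
    # imported here so the helper above works without Playwright installed
    from playwright.async_api import async_playwright

    async with async_playwright() as p:
        browser = await p.chromium.launch()
        page = await browser.new_page()

        # page.goto returns the main response, or None in some cases
        response = await page.goto(url)
        status = response.status if response else 0
        body = await page.content()

        await browser.close()
        return looks_blocked(status, body)
```

To try it, run `asyncio.run(check("https://www.similarweb.com/website/facebook.com/"))` after importing `asyncio`. A `True` result suggests the request was blocked under the assumptions above.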

Fortunately, there are ways to overcome this hurdle. In the next section, we'll explore the three best strategies for scraping with Playwright without getting blocked.


Best Methods to Bypass Akamai With Playwright

Akamai's security system is advanced, so you need equally advanced solutions to avoid being flagged as a bot. You can do this by using a web scraping API, installing Playwright Stealth, or adding premium proxies.

Method 1: Go With Web Scraping API for a High Success Rate

Web scraping APIs are tools designed to overcome challenges such as CAPTCHAs, IP bans, and other anti-bot protections. Additionally, all of the bypassing happens “under the hood,” so they reduce the number of manual tasks required from you.

One of the leading web scraping APIs is ZenRows, which can scrape web pages even when they're protected by Akamai. It acts as a headless browser and includes features like premium proxies, anti-CAPTCHAs, optimized headers, and more. These features let you replace Playwright with ZenRows entirely, so you no longer have to deal with Playwright's technical setup and browser instance overhead.

Let's use ZenRows to scrape the same Akamai-protected Similarweb webpage that blocked you earlier.

Sign up for free, and you'll get redirected to the Request Builder page.

Paste the Similarweb comparison page URL in the URL to Scrape box. Enable JS Rendering and click on the Premium Proxies check box. Select Python as your language and click on the API tab.

building a scraper with zenrows

The generated code uses Python's Requests library as the HTTP client. Make sure to install it using pip:

Terminal
pip install requests

Since we'll be capturing a screenshot of the target page, here is the modified version of the generated code:

script.py
# pip install requests
import requests

url = "https://www.similarweb.com/website/facebook.com/"
apikey = "<YOUR_ZENROWS_API_KEY>"

# define your request parameters
params = {
    "url": url,
    "apikey": apikey,
    "js_render": "true",
    "premium_proxy": "true",
    "screenshot": "true",
}


# send the request and fail early on an error status
response = requests.get("https://api.zenrows.com/v1/", params=params)
response.raise_for_status()

# save the returned screenshot
with open("screenshot.png", "wb") as f:
    f.write(response.content)

The above code bypasses Akamai and takes a screenshot of the target page.

Screenshot of the target page

Web scraping APIs are the most efficient and straightforward way to bypass Akamai's security system. Next, let's look at two manual approaches with Playwright.

Method 2: Use the Playwright Stealth Plugin

Bot detection systems can easily recognize properties unique to headless browsers, so using Playwright makes your scraper vulnerable:

  • Properties like navigator.webdriver are set to true, which can be detected by websites to identify headless browsers.
  • The value of navigator.hardwareConcurrency may not accurately represent the number of CPU cores available on typical user devices.
  • Properties like navigator.plugins and navigator.languages may contain unusual values not typically found in regular browsers.
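
To see these values for yourself, you can read them from a page that Playwright controls. The following is a minimal sketch: `FINGERPRINT_JS`, `collect_fingerprint`, and `suspicious_signals` are hypothetical names, and the checks in `suspicious_signals` are illustrative assumptions, not Akamai's actual rules.

```python
# JavaScript evaluated in the page to read the properties listed above
FINGERPRINT_JS = """() => ({
    webdriver: navigator.webdriver,
    hardwareConcurrency: navigator.hardwareConcurrency,
    pluginCount: navigator.plugins.length,
    languages: Array.from(navigator.languages),
})"""


def suspicious_signals(fp: dict) -> list:
    """Return the names of properties that look automated (illustrative checks)."""
    signals = []
    if fp.get("webdriver"):
        signals.append("webdriver")
    if fp.get("pluginCount", 0) == 0:
        signals.append("plugins")
    if not fp.get("languages"):
        signals.append("languages")
    return signals


async def collect_fingerprint(url: str = "about:blank") -> dict:
    # imported here so the helper above works without Playwright installed
    from playwright.async_api import async_playwright

    async with async_playwright() as p:
        browser = await p.chromium.launch()
        page = await browser.new_page()
        await page.goto(url)

        # run the snippet in the page and get the result back as a dict
        fingerprint = await page.evaluate(FINGERPRINT_JS)

        await browser.close()
        return fingerprint
```

Calling `asyncio.run(collect_fingerprint())` and passing the result to `suspicious_signals` gives a rough idea of which properties a detection script could pick up on. The `hardwareConcurrency` value is collected but not judged, since what counts as unusual depends on the devices a site normally sees.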

The Playwright Stealth plugin modifies Playwright's default configurations to plug the leaks that flag you as a bot. The plugin's GitHub repository includes test results comparing Playwright with Playwright Stealth, which show base Playwright failing the browser fingerprinting tests.

However, the plugin has some limitations. It's transplanted from the puppeteer-extra-plugin-stealth package, whose creators admit it can still be detected sometimes, so you can't rely on it when trying to bypass advanced anti-bot solutions like Akamai.
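
To make the idea behind such patches concrete, here is a hand-rolled sketch that overrides two of the properties mentioned above using Playwright's `add_init_script`. It is not the Playwright Stealth plugin's implementation, and `build_init_script` and `open_patched_page` are hypothetical names. The plugin applies many more patches than this.

```python
import json


def build_init_script(languages: list) -> str:
    """Build JavaScript that masks navigator.webdriver and sets navigator.languages."""
    return (
        "Object.defineProperty(navigator, 'webdriver', { get: () => undefined });\n"
        "Object.defineProperty(navigator, 'languages', "
        f"{{ get: () => {json.dumps(languages)} }});"
    )


async def open_patched_page(url: str):
    # imported here so the helper above works without Playwright installed
    from playwright.async_api import async_playwright

    async with async_playwright() as p:
        browser = await p.chromium.launch()
        context = await browser.new_context()

        # the init script runs in every page of this context before the site's own scripts
        await context.add_init_script(build_init_script(["en-US", "en"]))

        page = await context.new_page()
        await page.goto(url)

        # read the patched value back to confirm the override took effect
        webdriver = await page.evaluate("() => navigator.webdriver")

        await browser.close()
        return webdriver
```

Patching individual properties like this is easy to get wrong and easy to detect, which is part of why these techniques remain unreliable against advanced systems.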

Check out our tutorial on using Playwright Stealth for scraping to learn more.

Method 3: Use Premium Proxies

A proxy acts as an intermediary between your web scraper and the target website. When you send a request through a proxy, the request appears to be coming from the proxy server rather than your IP address.

The most common proxy types include premium, datacenter, residential, mobile, and public proxies. Each type has unique advantages, and all can be integrated with Playwright.

However, only premium proxies provide features like auto IP rotation, high anonymity, and superior speed that help against Akamai's IP-based techniques, such as IP reputation checks and automated traffic detection. Keep in mind that a proxy changes your IP address, not your browser fingerprint, so it doesn't address fingerprinting on its own.
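
As a rough sketch of the integration, Playwright accepts a `proxy` option when launching the browser. The helper below converts a proxy URL into that option. `build_proxy_settings` and `fetch_via_proxy` are hypothetical names, and the proxy URL in the usage note is a placeholder, not a real endpoint.

```python
from urllib.parse import urlparse


def build_proxy_settings(proxy_url: str) -> dict:
    """Convert 'scheme://user:pass@host:port' into Playwright's proxy option."""
    parsed = urlparse(proxy_url)
    settings = {"server": f"{parsed.scheme}://{parsed.hostname}:{parsed.port}"}
    if parsed.username:
        settings["username"] = parsed.username
    if parsed.password:
        settings["password"] = parsed.password
    return settings


async def fetch_via_proxy(url: str, proxy_url: str) -> str:
    # imported here so the helper above works without Playwright installed
    from playwright.async_api import async_playwright

    async with async_playwright() as p:
        # route all browser traffic through the proxy
        browser = await p.chromium.launch(proxy=build_proxy_settings(proxy_url))
        page = await browser.new_page()
        await page.goto(url)
        html = await page.content()
        await browser.close()
        return html
```

For example, `build_proxy_settings("http://user:pass@proxy.example.com:8080")` produces a dictionary with `server`, `username`, and `password` keys. With a rotating premium proxy, the provider typically changes the exit IP behind a single endpoint, so the script itself stays the same.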

Check out our guide on using Playwright proxies for more information.

Conclusion

In this article, you've learned why using just base Playwright is not sufficient to bypass Akamai. While the Playwright Stealth plugin is only partially reliable, other methods, such as premium proxies and web scraping APIs, prove to be effective solutions.

To scrape undisturbed and avoid manual configurations, try ZenRows. This web scraping API handles all the necessary setups to bypass Akamai and other anti-bot measures. With ZenRows, you can focus on scraping the content you want without getting blocked. Try ZenRows for free!
