What Is Error 403 Forbidden from Akamai?
Have you ever encountered the frustrating Akamai 403 Forbidden error when web scraping? You're not alone.
With most websites employing anti-bot solutions like Akamai, the 403 error response is typical. It means the server understood your request but refuses to fulfill it because you're not authorized to access the resource.
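To see it in action, here's a minimal sketch using Python Requests against an Akamai-protected page (kickz.com, the example target used later in this article):
import requests

# Plain request to an Akamai-protected page; a flagged scraper often gets 403
response = requests.get('https://www.kickz.com/de')
print(response.status_code)  # e.g., 403 when Akamai blocks the request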
The error mainly occurs when Akamai flags your web scraper. Thus, the trick is in bypassing Akamai. Read on for practical techniques to do so.
How to Bypass Akamai 403 Forbidden Error
Below are four actionable methods and how to implement them.
1. Use a Web Scraping API
The most effective solution to bypass the Akamai 403 Forbidden error is to use a web scraping API like ZenRows. This tool handles all the complexity of bypassing anti-bot solutions under the hood, allowing you to focus on extracting the desired data.
Its features, including premium proxies, headless browser functionality, CAPTCHA bypass, and User Agent rotation, enable you to scrape without getting blocked. ZenRows works with any programming language and is easy to use.
Let's test ZenRows against an Akamai-protected web page.
Sign up, and you'll be redirected to the Request Builder page.
Input the target URL (in this case, https://www.kickz.com/de), activate JavaScript Rendering, and check the box for Premium Proxies.
That'll generate your request code on the right. Copy it and run it with your preferred HTTP client, for example Python Requests, which you can install with the following command:
pip install requests
Your code should look like this:
import requests

# Target page protected by Akamai and your ZenRows API key
url = 'https://www.kickz.com/de'
apikey = '<YOUR_ZENROWS_API_KEY>'

# Request parameters: render JavaScript and route through premium proxies
params = {
    'url': url,
    'apikey': apikey,
    'js_render': 'true',
    'premium_proxy': 'true',
}

# Send the request through the ZenRows API and print the returned HTML
response = requests.get('https://api.zenrows.com/v1/', params=params)
print(response.text)
Run it, and you'll get the HTML of the page.
#..
<title>
KICKZ.COM. Der Online Shop für Streetwear, Sneaker und Basketball Gear
</title>
#..
Awesome, right? ZenRows easily bypasses the Akamai 403 error in Python or any other language.
2. Get Premium Proxies
Proxies can play a pivotal role in navigating through Akamai's defenses. They act as intermediary servers that allow you to route your requests through different IP addresses, masking your identity and location. This way, you can avoid direct IP bans, which are sometimes a consequence of the Akamai 403 error.
However, free proxies are unstable and rarely work in real-world use cases, so you must use premium proxies for the best results. They offer the reliability and consistent performance essential when dealing with Akamai. Rotating those proxies is also vital to avoid rate limits and IP bans, as in the sketch below.
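Here's a minimal sketch of rotating proxies with Python Requests; the proxy endpoints and credentials below are placeholders for your provider's actual values:
import random
import requests

# Placeholder premium proxy endpoints; replace with your provider's credentials
proxies_pool = [
    'http://user:pass@proxy1.example.com:8080',
    'http://user:pass@proxy2.example.com:8080',
    'http://user:pass@proxy3.example.com:8080',
]

# Pick a different proxy for each request to spread traffic across IPs
proxy = random.choice(proxies_pool)
response = requests.get(
    'https://www.kickz.com/de',
    proxies={'http': proxy, 'https': proxy},
    timeout=30,
)
print(response.status_code)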
To learn more about proxy types and solutions, check out this guide on the 10 best web scraping proxy services.
3. Scrape with a Headless Browser and Anti-Bot Plugin
In most cases, you must emulate natural browsing behavior to bypass Akamai protection. You can achieve that through headless browsers like Selenium, Puppeteer, or Playwright.
These tools simulate the behavior of a regular browser but without a graphical user interface. They also render JavaScript like an actual browser and should, therefore, be able to bypass JavaScript-based challenges commonly associated with anti-bot systems.
But that's not always the case, as these headless browsers possess automation indicators that make them easily detectable by anti-bot systems. Thus, integrating anti-bot plugins is crucial to boost your chances of success.
For Selenium, the Undetected Chromedriver plugin patches most property leaks that anti-bot systems can use to detect your scraper. Check out Undetected Chromedriver in Python for a step-by-step tutorial on how to utilize this plugin.
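As an illustration, here's a minimal Undetected Chromedriver sketch (install it with pip install undetected-chromedriver); the exact headless flags may vary across Chrome and plugin versions:
import undetected_chromedriver as uc

# ChromeOptions from the plugin; the new headless mode is harder to detect
options = uc.ChromeOptions()
options.add_argument('--headless=new')

driver = uc.Chrome(options=options)
try:
    driver.get('https://www.kickz.com/de')
    print(driver.page_source[:500])  # first part of the rendered HTML
finally:
    driver.quit()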
Similarly, Puppeteer users can leverage the Puppeteer Stealth plugin to fortify their scraper. This extension employs various evasion modules to mask Puppeteer's automation properties, making it difficult for websites to detect. To learn more, check out this guide on how to use Puppeteer Stealth.
4. Optimize Your Headers
Your HTTP headers can significantly impact your project's success, especially when dealing with sophisticated anti-bot systems like Akamai. While HTTP clients and headless browsers provide default headers, these are often incomplete and lack many of the headers a real web browser sends.
This discrepancy makes it easy for websites to detect your web scraper.
Let's consider a comparison between basic headers from a headless browser, such as Puppeteer:
Request Headers: {
'upgrade-insecure-requests': '1',
'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) HeadlessChrome/119.0.0.0 Safari/537.36',
'sec-ch-ua': '"Chromium";v="119", "Not?A_Brand";v="24"',
'sec-ch-ua-mobile': '?0',
'sec-ch-ua-platform': '"Windows"'
}
And here are the headers from a regular web browser like Chrome:
{
'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7',
'accept-language': 'en-US,en;q=0.9',
'cache-control': 'max-age=0',
'cookie': 'prov=4568ad3a-2c02-1686-b062-b26204fd5a6a; usr=p=%5b10%7c15%5d%5b160%7c%3bNewest%3b%5d',
'referer': 'https://www.google.com/',
'sec-ch-ua': '"Not.A/Brand";v="8", "Chromium";v="114", "Google Chrome";v="114"',
'sec-ch-ua-mobile': '?0',
'sec-ch-ua-platform': '"Windows"',
'sec-fetch-dest': 'document',
'sec-fetch-mode': 'navigate',
'sec-fetch-site': 'cross-site',
'sec-fetch-user': '?1',
'upgrade-insecure-requests': '1',
'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/114.0.0.0 Safari/537.36'
}
The examples above show that Puppeteer lacks several headers a real browser sends, such as accept, accept-language, cache-control, cookie, referer, and the sec-fetch-* set, and its User Agent exposes HeadlessChrome. These omissions and inconsistencies can raise red flags for anti-bot systems like Akamai.
Therefore, it's crucial to craft proper and complete headers. Start with a well-formed User Agent, your HTTP request's most critical header, and keep all other headers consistent with it. For example, if you use a Chrome User Agent, the sec-ch-ua client hints must match a real Chrome browser.
Additionally, while header order doesn't affect how a server processes a request, anti-bot systems can fingerprint it. Non-browser clients like Puppeteer send their default headers in a different arrangement than real browsers, and keeping that default order can get you flagged.
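As a minimal sketch with Python Requests, you can reuse the Chrome header values shown above. Note that Requests controls the final send order, so this addresses completeness and consistency rather than ordering:
import requests

# Browser-like headers modeled on the Chrome example above; keep them consistent
headers = {
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/114.0.0.0 Safari/537.36',
    'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7',
    'accept-language': 'en-US,en;q=0.9',
    'referer': 'https://www.google.com/',
    'sec-ch-ua': '"Not.A/Brand";v="8", "Chromium";v="114", "Google Chrome";v="114"',
    'sec-ch-ua-mobile': '?0',
    'sec-ch-ua-platform': '"Windows"',
    'sec-fetch-dest': 'document',
    'sec-fetch-mode': 'navigate',
    'sec-fetch-site': 'cross-site',
    'sec-fetch-user': '?1',
    'upgrade-insecure-requests': '1',
}

response = requests.get('https://www.kickz.com/de', headers=headers)
print(response.status_code)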
Check out this guide on web scraping headers to learn how to optimize your HTTP headers fully.
Conclusion
The Akamai 403 error presents a formidable challenge for any web scraping project. Although techniques like premium proxies, headless browsers, and optimized headers have advantages, it's crucial to recognize that a web scraping API provides the easiest route to guaranteed results.
For more insights, check out this 403 web scraping guide. And if you're ready to streamline your web scraping efforts, sign up now to try ZenRows for free.