TLS (Transport Layer Security) fingerprinting is one of the most challenging anti-bot detection techniques. However, the burp-awesome-tls extension, when used with Burp Suite, can help you bypass it, allowing you to scrape without getting blocked.
In this article, you'll learn how to pair Burp Suite with the burp-awesome-tls extension to spoof a regular browser's TLS fingerprint. You'll also see how to make a scraping request with the setup.
- Step 1: Set Up Burp Suite and burp-awesome-tls.
- Step 2: Obtain Burp Suite proxy address.
- Step 3: Activate the burp-awesome-tls extension.
- Step 4: Test the extension.
- Step 5: Send scraping requests through Burp Suite.
Why TLS Fingerprinting Is a Threat to Web Scraping
TLS fingerprinting is a method used to uniquely identify clients (browsers or HTTP clients) based on the characteristics of their TLS handshake. During this process, the client sends a "client hello" message to the server detailing connection metadata such as the supported TLS version, cipher suites, extensions, and other parameters.
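To make the idea concrete, here is a minimal sketch of how such handshake fields can be reduced to a single fingerprint. It follows the JA3 convention, which this article doesn't name and which is used here only as an assumption about what a detector might compute: the decimal values of the TLS version, cipher suites, extensions, elliptic curves, and point formats are joined into one string and hashed with MD5. The ClientHello values below are hypothetical and chosen purely to illustrate the format.

```python
import hashlib


def ja3_string(tls_version, ciphers, extensions, curves, point_formats):
    """Join ClientHello fields JA3-style: commas between fields, dashes within."""
    fields = [[tls_version], ciphers, extensions, curves, point_formats]
    return ",".join("-".join(str(value) for value in field) for field in fields)


def ja3_hash(ja3_text):
    """Return the MD5 hex digest that JA3 uses as the final fingerprint."""
    return hashlib.md5(ja3_text.encode("ascii")).hexdigest()


# Hypothetical ClientHello values, for illustration only.
browser_like = ja3_string(771, [4865, 4866, 4867], [0, 23, 65281, 10, 11], [29, 23, 24], [0])
script_like = ja3_string(771, [4866, 4867, 4865], [0, 11, 10], [29, 23], [0])

print(browser_like)  # 771,4865-4866-4867,0-23-65281-10-11,29-23-24,0
print(ja3_hash(browser_like) == ja3_hash(script_like))  # False
```

Because the hash covers the order of the values as well as the values themselves, two clients that support the same cipher suites but list them differently still end up with different fingerprints.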
Anti-bot systems, including Cloudflare, DataDome, Imperva Incapsula, and Akamai, use this fingerprint to distinguish between automated and legitimate traffic.
So, TLS fingerprinting poses a significant challenge for web scrapers, as they usually have mismatched or outdated fingerprints, which can reveal automation activities and trigger blocking mechanisms. For instance, using outdated TLS versions, such as 1.0 or 1.1, instead of modern versions, like 1.2 or 1.3, deviates from the typical behavior of modern browsers and raises suspicion.
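You can get a rough idea of why a script's handshake differs from a browser's by inspecting what Python's standard ssl module would offer by default. Treat the sketch below as an approximation only: Requests builds its connections through urllib3, which may adjust these defaults, and the output depends on the OpenSSL build on your machine.

```python
import ssl


def describe_default_client():
    """Summarize the TLS settings Python's default client context would offer."""
    context = ssl.create_default_context()
    return {
        "openssl": ssl.OPENSSL_VERSION,
        "minimum_version": context.minimum_version.name,
        "maximum_version": context.maximum_version.name,
        "ciphers": [cipher["name"] for cipher in context.get_ciphers()],
    }


summary = describe_default_client()
print(summary["openssl"])
print(summary["minimum_version"], "to", summary["maximum_version"])
print(len(summary["ciphers"]), "cipher suites, starting with", summary["ciphers"][:3])
```

The cipher list and its order come from the local OpenSSL library rather than from a browser's TLS stack, which is one reason an HTTP client's ClientHello rarely matches a browser's.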
What Is Burp Suite?
Burp Suite is a web security testing application designed for simulating penetration attacks against web applications. It features various tools, including a proxy, crawler, browser, request interceptor, request forwarder, and more.
When you open a website through Burp Suite, it routes your request through its built-in proxy server, allowing you to gain temporary control of the traffic. Notably, the interception feature (when active) lets you intercept the Fetch/XHR request, modify it on the fly, and send it through the Burp Suite proxy using the forward feature.
You can route regular HTTP clients like Python's Requests through Burp's proxy and use the interception feature in real time. You can also extract site maps using Burp's passive crawler, which is valuable for web scraping. Burp Suite also supports external extensions, such as burp-awesome-tls, that modify its behavior. You'll see how that extension works in the next section.
What Is burp-awesome-tls and How Does It Work?
burp-awesome-tls is a Java extension for Burp Suite that hijacks the Burp Suite HTTP and TLS APIs, allowing you to spoof a real browser's TLS fingerprint. It aims to help you bypass anti-bot measures during automation activities like web scraping.
However, anti-bots such as Akamai, Cloudflare, and PerimeterX use detection techniques beyond TLS fingerprinting. Therefore, the burp-awesome-tls extension doesn't guarantee a successful anti-bot bypass.
That said, burp-awesome-tls increases your chances against milder anti-bot measures that rely solely on TLS fingerprinting to identify and block bots. Plus, it provides insight into how to spoof other fingerprints via a tool like Burp Suite to avoid getting blocked while scraping.
Bypass TLS Fingerprinting With Burp Suite and burp-awesome-tls During Scraping
In this section, you'll follow a step-by-step guide to set up Burp Suite and the burp-awesome-tls extension. We'll first compare a few TLS behaviors on BrowserLeaks before scraping the Antibot Challenge page with Python's Requests to see how powerful the TLS-spoofer extension is.
Step 1: Set Up Burp Suite and burp-awesome-tls
Download and install a system-compatible Community Edition of Burp Suite from the official download site. Then, follow the on-screen installation instructions to get Burp Suite on your computer.
Next, download the latest burp-awesome-tls extension from the release page to a suitable directory on your machine. Make sure to download the ...fat.jar file for cross-platform compatibility.
Are both tools ready? If so, let's start the Burp Suite engine!
Step 2: Obtain Burp Suite Proxy Address
The Burp Suite proxy is an essential part of the connection. Although Burp's proxy is active by default, you should verify it to ensure that all requests routed through Burp Suite utilize its built-in proxy server.
Burp Suite's proxy address defaults to:
http://127.0.0.1:8080
Launch the Burp Suite application and click Next. Then, click Start Burp to open the configuration interface.
To check the proxy connection, click Proxy in the top menu bar and go to Proxy settings. You'll see a table under Proxy listeners with a checked proxy address, indicating that Burp's default proxy is active. You can add or remove proxies from this table, but we'll use the default for this tutorial.
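If you also want to confirm from a script that the listener accepts connections, a plain TCP check is enough. This is a minimal sketch that assumes the default listener address 127.0.0.1:8080 from above; the function name is hypothetical, and a successful connection only shows that something is listening on that port, not that it is Burp.

```python
import socket


def listener_is_up(host="127.0.0.1", port=8080, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if listener_is_up():
    print("Something is listening on 127.0.0.1:8080")
else:
    print("Nothing is listening on 127.0.0.1:8080; check Burp's Proxy listeners table")
```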
You've confirmed that Burp's proxy is active. Now, let's load the burp-awesome-tls extension.
Step 3: Activate the burp-awesome-tls Extension
The first step in activating the burp-awesome-tls extension is to load it into Burp Suite.
- Go to Extensions on the menu bar and click Add.
- Under Extension details, set the Extension type as Java.
- Click the Select file button to browse the burp-awesome-tls .jar file you downloaded earlier.
- After selecting the extension from the download location, click Next and close the modal box.
The extension should now be active with a checkmark. Its name (Awesome TLS) will also appear on the menu bar, showing you've loaded the extension successfully into Burp Suite.
To deactivate the Awesome TLS extension, uncheck the activation checkbox and select Yes from the confirmation dialogue box.
You can click the extension name at the top to configure additional parameters, such as TLS handshake timeout, browser fingerprint choice, custom client hello message, and more. For instance, if you want to use a Firefox TLS fingerprint, you can select Firefox 105 under the Fingerprint option. Your request will only use the chosen browser's TLS fingerprint (not its User-Agent).
However, we'll stick to the burp-awesome-tls extension's default settings, which use the latest available browser's fingerprint, so you don't need to do anything extra at this point.
Next, you'll test the extension using Burp's browser feature.
Step 4: Test the Extension
Let's first observe the difference in TLS behavior before and after activating the extension by sending a request to BrowserLeaks via Burp's built-in browser.
To do that, deactivate burp-awesome-tls by unchecking it in the Extensions menu. This action will allow you to run Burp Suite in plain mode (without the extension).
Go to Proxy and click Open Browser to launch the Burp browser. Then, load BrowserLeaks via the browser.
The result shows that Burp's default request (without the extension) still includes TLS versions 1.0 and 1.1. These are outdated and can result in blocking.
However, reactivating the extension and refreshing the browser removes TLS 1.0 and 1.1, demonstrating that the burp-awesome-tls extension has successfully spoofed a modern TLS fingerprint.
The Burp browser might return a "400 Bad Request" error after you reactivate the burp-awesome-tls extension and refresh the web page. If this happens, restart the Burp Suite application and relaunch the browser to resolve the issue.
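The same before-and-after comparison can be scripted. The sketch below assumes BrowserLeaks exposes its TLS report as JSON at https://tls.browserleaks.com/json (an assumption, since this article only uses the web page) and that Burp listens on the default 127.0.0.1:8080. The function names are hypothetical, and the comparison deliberately avoids relying on any particular field names in the report.

```python
def fetch_tls_report(proxy=None, url="https://tls.browserleaks.com/json"):
    """Fetch the TLS report, directly or through the given proxy (for example Burp)."""
    # Imported here so changed_fields() below works without Requests installed.
    import requests  # third-party: pip3 install requests

    proxies = {"http": proxy, "https": proxy} if proxy else None
    # Burp re-signs HTTPS traffic with its own CA, so skip verification only when proxying.
    response = requests.get(url, proxies=proxies, verify=proxy is None, timeout=30)
    response.raise_for_status()
    return response.json()


def changed_fields(before, after):
    """Return the sorted keys whose values differ between two reports."""
    keys = set(before) | set(after)
    return sorted(key for key in keys if before.get(key) != after.get(key))


# Usage (needs network access and Burp running with Awesome TLS enabled):
#   direct = fetch_tls_report()
#   via_burp = fetch_tls_report(proxy="http://127.0.0.1:8080")
#   print(changed_fields(direct, via_burp))
```

If the extension is doing its job, the fingerprint-related fields should show up among the changed keys, because the direct request carries Python's handshake while the proxied one carries the spoofed browser handshake.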
Awesome! The extension works.
You're now ready to send web scraping requests with the modified TLS fingerprint.
Step 5: Send Scraping Requests Through Burp Suite
As mentioned, you can route your scraping requests through Burp Suite using its proxy server. You'll use Python's Requests library to try to scrape the full-page HTML of the Antibot Challenge page.
Using Burp's built-in proxy server in your Python request doesn't mask or change your original IP address. It only acts as a man-in-the-middle (MITM) proxy, allowing Burp Suite to intercept your request before forwarding it to the target server.
Open your preferred IDE to set up a Python project, create a virtual environment, and install Requests using pip:
pip3 install requests
You must reactivate the Awesome TLS extension if you deactivated it earlier.
Now, send your scraping request using Burp's default proxy server address. This lets your scraper use Burp's features, including the burp-awesome-tls extension. Set verify=False so that Requests accepts the certificates Burp signs with its own Certificate Authority (CA) instead of rejecting them as untrusted:
# pip3 install requests
import requests
# define the Burp proxy server address
proxies = {
    "http": "http://127.0.0.1:8080",
    "https": "http://127.0.0.1:8080",
}
# set the target URL
url = "https://www.scrapingcourse.com/antibot-challenge"
# send a GET request via Burp's proxy
try:
    # deactivate SSL verification to use Burp's certificate
    response = requests.get(
        url,
        proxies=proxies,
        verify=False,
    )
    print(response.text)
except requests.RequestException as e:
    print(f"Request error: {e}")
Unfortunately, the request couldn't bypass the anti-bot detection measure. Here's the result:
<!DOCTYPE html>
<html lang="en-US">
<head>
<title>Just a moment...</title>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<!-- ... -->
</head>
<body>
<!-- ... -->
</body>
</html>
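A side note on the verify=False setting used in the request above: if you'd rather not switch certificate verification off completely, one alternative is to make Requests trust Burp's CA. The sketch below assumes you have exported Burp's CA certificate in DER format from Burp's proxy settings and saved it under the hypothetical name burp-ca.der. It converts that file to PEM, which is the format Requests expects when verify is given a file path.

```python
import ssl
from pathlib import Path


def der_to_pem(der_path, pem_path):
    """Convert a DER-encoded certificate file to PEM and return the PEM path."""
    der_bytes = Path(der_path).read_bytes()
    pem_text = ssl.DER_cert_to_PEM_cert(der_bytes)
    Path(pem_path).write_text(pem_text, encoding="ascii")
    return str(pem_path)


# Usage, assuming the exported certificate was saved as burp-ca.der (hypothetical name):
#   ca_bundle = der_to_pem("burp-ca.der", "burp-ca.pem")
#   requests.get(url, proxies=proxies, verify=ca_bundle)
```

This keeps verification enabled for the connection between your script and Burp, although it has no effect on how the target site judges the request.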
To verify your Python request went through Burp Suite, open the application and go to HTTP history under the Proxy tab. Each request your script sent to the target website appears there as a separate entry.
The above request failed despite spoofing a real browser's TLS fingerprint with the burp-awesome-tls extension. This means the anti-bot uses other underlying detection mechanisms that TLS spoofing doesn't handle.
For instance, you can still get detected via incomplete request headers since the Awesome TLS extension only focuses on TLS spoofing and doesn't deal with request header spoofing. Other bot detection mechanisms that can block your request include browser fingerprinting, JavaScript challenges, static user behavior, rate limiting, etc.
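As an illustration of the header gap, the sketch below builds a browser-like header set that you could attach to the Step 5 request. The header values are example assumptions rather than a capture from a real browser, the function name is hypothetical, and matching headers alone is unlikely to defeat the other mechanisms listed above.

```python
def browser_like_headers(user_agent):
    """Return a header set resembling what a desktop browser sends on a page load."""
    return {
        "User-Agent": user_agent,
        "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
        "Accept-Language": "en-US,en;q=0.9",
        # "br" is left out because Requests only decodes Brotli with an extra package.
        "Accept-Encoding": "gzip, deflate",
        "Upgrade-Insecure-Requests": "1",
    }


# Example User-Agent string; pick one that matches the fingerprint selected in Awesome TLS.
headers = browser_like_headers(
    "Mozilla/5.0 (X11; Linux x86_64; rv:105.0) Gecko/20100101 Firefox/105.0"
)
print(headers["User-Agent"])

# Pass the result to the request from Step 5:
#   requests.get(url, proxies=proxies, headers=headers, verify=False)
```

Keeping the User-Agent consistent with the spoofed TLS fingerprint matters because a Firefox handshake paired with a Python or Chrome User-Agent is itself a mismatch a detector can flag.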
As mentioned earlier, TLS fingerprint spoofing may work against less sophisticated anti-bot mechanisms but not against advanced ones. You'll see how to bypass these detections effectively in the next section.
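Before moving on, note that a blocked request like the one above still returns HTML, so a scraper has to recognize the challenge page itself rather than wait for an exception. Here is a minimal sketch, assuming the "Just a moment..." title from the output above is a reliable marker for this target. Other anti-bot systems serve different pages, so the marker and the function name are illustrative only.

```python
import re

CHALLENGE_TITLE = "Just a moment..."


def is_challenge_page(html):
    """Return True if the page title matches the challenge page shown above."""
    match = re.search(r"<title>(.*?)</title>", html, re.IGNORECASE | re.DOTALL)
    return bool(match) and match.group(1).strip() == CHALLENGE_TITLE


blocked = "<html><head><title>Just a moment...</title></head><body></body></html>"
ordinary = "<html><head><title>Some product page</title></head><body></body></html>"  # hypothetical page

print(is_challenge_page(blocked))   # True
print(is_challenge_page(ordinary))  # False
```

A check like this lets the scraper log or retry blocked responses instead of silently storing the challenge HTML as if it were real content.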
The Best Scraping Technique to Avoid Detection
One way to reduce the chance of detection is to distribute traffic across web scraping proxies and send browser-like request headers. However, these methods are insufficient, as anti-bots can detect you through other advanced techniques.
To bypass anti-bots effectively during scraping, you need to deal with every detection method and set your request to mimic a human user, which can become unmanageable at scale. Thankfully, there's one surefire way to handle all these hurdles and scrape without getting blocked.
The ultimate solution is the ZenRows Scraper API, an all-in-one web scraping solution featuring all the tools required to scrape any website at scale without limitations. The ZenRows Scraper API features advanced fingerprint spoofing, request header management, premium proxy rotation, JavaScript rendering, anti-bot auto-bypass, and more.
All you need is a single API call with any programming language, and ZenRows will handle all fingerprinting issues and anti-bot bypass tasks under the hood so you can focus on your scraping logic.
Let's see how it works by requesting the previous Antibot challenge page that blocked your request.
Sign up for free to open the Request Builder. Paste the target URL in the link box, and activate Premium Proxies and JS Rendering.
Select your programming language (Python, in this case) and choose the API connection mode. Then, copy and paste the generated code into your scraper file.
Here's what the generated Python code looks like:
# pip install requests
import requests
url = "https://www.scrapingcourse.com/antibot-challenge"
apikey = "<YOUR_ZENROWS_API_KEY>"
params = {
    "url": url,
    "apikey": apikey,
    "js_render": "true",
    "premium_proxy": "true",
}
response = requests.get("https://api.zenrows.com/v1/", params=params)
print(response.text)
The above code outputs the protected site's full-page HTML, showing that it bypassed the anti-bot measure:
<html lang="en">
<head>
<!-- ... -->
<title>Antibot Challenge - ScrapingCourse.com</title>
<!-- ... -->
</head>
<body>
<!-- ... -->
<h2>
You bypassed the Antibot challenge! :D
</h2>
<!-- other content omitted for brevity -->
</body>
</html>
Congratulations 🎉! You bypassed anti-bot protection using ZenRows.
Conclusion
You've now learned to spoof a browser's TLS fingerprint using the burp-awesome-tls extension and Burp Suite. You've also seen how to send a Python scraping request through Burp's proxy so that it carries the spoofed fingerprint.
However, while the Awesome TLS extension provides a solid footing for bypassing fingerprinting, it only focuses on TLS fingerprinting detection. There are other detection techniques you need to handle for successful scraping. We recommend using ZenRows to evade all anti-bot detection techniques without hassle and scrape at scale without getting blocked.
Try ZenRows today without a credit card!