What Is TLS Fingerprint and How to Bypass It

November 24, 2022 · 6 min read

TLS fingerprint analysis is one of the anti-bot detection solutions that websites use to protect against malicious attacks. Using this method, web servers are able to identify which web client is trying to initiate a conversation and then decide whether to block or allow the request.

While ethical data extraction tools aren't the target, your web scrapers might still get blocked.

In this article, we'll be guiding you through how to bypass TLS fingerprinting for ethical web scraping purposes by discussing what it is, how it works and the different approaches to working around it.

Let's dive in!

What is TLS Fingerprinting?

TLS fingerprinting is a popular server-side fingerprinting technique. It enables web servers to determine a web client's identity with a high degree of accuracy, using only the parameters in the first packet of the connection, before any application data is exchanged. Web clients, in this case, are the applications initiating a request, which can be browsers, CLI tools, scripts (bots), and so on. Solutions like Cloudflare use TLS fingerprinting to identify and block malicious attacks.

A different type of fingerprinting is client-side fingerprinting, which involves testing the client using JavaScript. This may be a discussion for another day, but for this article, our focus is on server-side fingerprinting.

TLS stands for "Transport Layer Security". It's an encryption protocol designed to secure connections between web clients and servers. The name is often used interchangeably with SSL, but TLS evolved from SSL and is now the most widely used security protocol for web communication.

When a web scraper sends a request to an HTTPS website, it does so over TLS. That wouldn't normally matter to web scrapers, but websites that use TLS fingerprinting can identify your scraper as a bot and deny it access completely.

How does TLS Fingerprinting work?

For a web client to communicate with a web server over a secure channel, both parties must agree on the encryption algorithm and cryptographic keys for that conversation. This agreement is reached through the TLS handshake: the sequence in which the client and server exchange the information required to establish a secure connection.

Typically, the client opens the handshake with a Client Hello message, in which it declares the set of TLS parameters it supports. Some of these parameters include:
  • The max TLS version it supports (TLS 1.0 - TLS 1.3).
  • A list of cipher suites, that is, the cryptographic algorithms that can be used for encryption.
  • A list of supported extensions.

Each client is built on a different TLS library (for example, Firefox uses NSS, Chrome uses BoringSSL, Python uses OpenSSL, and Safari uses Secure Transport), so the values of these parameters differ significantly from one web client to another.
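To get a feel for these parameters from the client side, here is a minimal Node.js sketch that prints what the local runtime can offer. It only reads values exposed by Node's built-in tls module. The helper name describeClientTlsParameters is made up for this illustration, and the output depends on your Node.js build.

```javascript
const tls = require('tls');

// Hypothetical helper: collects the TLS parameters this Node.js build
// can advertise in its Client Hello.
function describeClientTlsParameters() {
  return {
    minVersion: tls.DEFAULT_MIN_VERSION, // lowest protocol version offered by default
    maxVersion: tls.DEFAULT_MAX_VERSION, // highest protocol version offered by default
    defaultCipherOrder: tls.DEFAULT_CIPHERS.split(':'), // default cipher preference string
    supportedCiphers: tls.getCiphers(), // every cipher name the build supports
  };
}

const params = describeClientTlsParameters();
console.log(`TLS versions: ${params.minVersion} to ${params.maxVersion}`);
console.log(`Default cipher entries: ${params.defaultCipherOrder.length}`);
console.log(`Supported ciphers: ${params.supportedCiphers.length}`);
```

Running the same script on two different Node.js versions is a quick way to see how these values drift between releases.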

Since the Client Hello message isn't encrypted, we can view it using network monitoring tools like Wireshark. Below is a TLS Client Hello message sent by Chrome to Wikipedia, as captured by Wireshark.

Image: TLS client hello message

The fields, their content and their order, particularly for the cipher suites, differ depending on the web client. Here's what the cipher suite content and order look like for Chrome:

Image: Cipher suite

The TLS protocol is complex and carries a lot of information, as we can see from the number of extensions in the client hello example above. Each extension contains its own set of parameters. For example, some clients send GREASE values, which are placeholder entries that don't correspond to any real extension.

Using all this information, TLS fingerprinting enables the calculation of a TLS signature, also known as the "fingerprint", of the web client. The server then uses this signature to infer the client before sending any kind of data. This is where the blocking occurs.

So, how is this signature created? If we understand what goes on behind the scenes to produce this signature, we can execute a TLS fingerprint bypass.


TLS signature calculation

TLS fingerprinting is based on the parameters in the client hello message, which is not encrypted and is therefore visible to anyone on the connection path. By taking the IDs of each parameter in order and hashing the resulting string, we can get a unique fingerprint. The de facto standard algorithm for generating this is known as JA3.

JA3 works by concatenating the decimal values of the bytes of five fields in the client hello message and then hashing them. The fields are:

TLS version 
Cipher suites 
Extensions 
Elliptic curves 
Elliptic curve point formats

Fields are separated by commas (","), and the values within a field are separated by dashes ("-"). This string is then hashed with MD5 to generate the JA3 fingerprint.

Since JA3 is integrated into Wireshark, we can see our JA3 full string and fingerprint from our previous example.

771,4867-4865-4866-52393-52392-49195-49199-49196-49200-49171-49172-156-157-47-53,0-23-65281-10-11-35-16-5-13-18-51-45-43-27-17513-21-41,29-23-24,0

MD5 Fingerprint

fedca33016b974c390faa610378b5a62
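As a rough illustration of the calculation, the sketch below rebuilds the full string from the five fields and hashes it with Node's built-in crypto module. The function names buildJa3String and ja3Fingerprint and the shape of the chromeHello object are assumptions made for this example. They aren't part of any official JA3 library.

```javascript
const crypto = require('crypto');

// Join the five Client Hello fields: "," between fields, "-" between values.
function buildJa3String({ version, ciphers, extensions, curves, pointFormats }) {
  return [[version], ciphers, extensions, curves, pointFormats]
    .map((field) => field.join('-'))
    .join(',');
}

// The JA3 fingerprint is the MD5 hex digest of the full string.
function ja3Fingerprint(ja3String) {
  return crypto.createHash('md5').update(ja3String).digest('hex');
}

// Decimal values copied from the Chrome capture above.
const chromeHello = {
  version: 771,
  ciphers: [4867, 4865, 4866, 52393, 52392, 49195, 49199, 49196, 49200,
    49171, 49172, 156, 157, 47, 53],
  extensions: [0, 23, 65281, 10, 11, 35, 16, 5, 13, 18, 51, 45, 43, 27,
    17513, 21, 41],
  curves: [29, 23, 24],
  pointFormats: [0],
};

const ja3String = buildJa3String(chromeHello);
console.log(ja3String); // the comma- and dash-separated full string
console.log(ja3Fingerprint(ja3String)); // 32-character hex digest
```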

In a nutshell, every web client has a characteristic fingerprint, and detecting what type of client is making an HTTPS request can be as simple as matching signatures against a database. Here are some web clients and their signatures:

Firefox 94: 2312b27cb2c9ea5cabb52845cafff423 
Firefox 87: bc6c386f480ee97b9d9e52d472b772d8 
Chrome 97: b32309a26951912be7dba376398abc3b 
Chrome 70: 5353c0796e25725adfdb93f35f5a18f7
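Matching a signature against a database can be as simple as a map lookup. The following is only a sketch: knownFingerprints and identifyClient are hypothetical names, and the four entries are the ones listed above, whereas a real database would hold far more.

```javascript
// Fingerprints taken from the list above; a real database would hold many more.
const knownFingerprints = new Map([
  ['2312b27cb2c9ea5cabb52845cafff423', 'Firefox 94'],
  ['bc6c386f480ee97b9d9e52d472b772d8', 'Firefox 87'],
  ['b32309a26951912be7dba376398abc3b', 'Chrome 97'],
  ['5353c0796e25725adfdb93f35f5a18f7', 'Chrome 70'],
]);

// Hypothetical helper: returns the client name, or null when the hash is unknown.
function identifyClient(ja3Hash) {
  return knownFingerprints.get(ja3Hash.toLowerCase()) ?? null;
}

console.log(identifyClient('b32309a26951912be7dba376398abc3b')); // Chrome 97
console.log(identifyClient('00000000000000000000000000000000')); // null
```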

How to bypass TLS Fingerprinting

Let's run through a quick tutorial using Node.js. You'll need Node.js and npm installed (some systems have them pre-installed). Set up the project and install the necessary library by running:

npm init -y 
npm install axios

Ideally, you'd beat TLS fingerprinting by masking your client hello message with a legitimate web client's client hello message. If your request produces the same TLS fingerprint as the latest version of Firefox, a server that relies on the fingerprint alone can't spot the difference.

The truth is, it's a lot more complicated. Most fields of a client hello message are not easily controlled or manipulated using scripts or command line tools.

In Node.js and Python, for example, you can control the cipher suite list and, in some cases, the TLS version, but not much more. The order of the TLS extensions is essentially fixed.
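As a small sketch of the knobs Node.js does expose, the snippet below builds an https.Agent with a custom cipher order and a pinned TLS version range. It assumes the default cipher list is in use (no --tls-cipher-list override), and swapping the fourth and fifth entries is an arbitrary choice made only to show that the order can be changed.

```javascript
const https = require('https');
const tls = require('tls');

// Swap the 4th and 5th entries of the default list, purely to show that
// the cipher order is under our control (assumes the default list is in use).
const list = tls.DEFAULT_CIPHERS.split(':');
[list[3], list[4]] = [list[4], list[3]];
const customCiphers = list.join(':');

// An agent carrying the custom cipher list and a pinned version range.
const agent = new https.Agent({
  ciphers: customCiphers,
  minVersion: 'TLSv1.2',
  maxVersion: 'TLSv1.3',
});

// Any request made with { agent } would use these TLS settings.
console.log(agent.options.ciphers);
```

Note that nothing here touches the extension list, which matches the limitation described above.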

When implementing TLS fingerprinting, servers can't operate based on a locked-in whitelist database of fingerprints, because new fingerprints appear whenever web clients or TLS libraries release new versions. So, they have to rely on a blocklist database instead.

We can take advantage of this: if we can generate a new fingerprint that isn't in a server's blocklist database, we can bypass its TLS fingerprint blocking.
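The difference between the two policies can be sketched in a few lines. Everything here is hypothetical: the set contents, the helper names and the placeholder hashes are for illustration only and don't describe how any particular anti-bot product works.

```javascript
// Hypothetical fingerprint sets; real deployments would use much larger lists.
const allowlist = new Set(['b32309a26951912be7dba376398abc3b']); // a known browser hash
const blocklist = new Set(['11111111111111111111111111111111']); // placeholder "known bot" hash

// Allowlist policy: anything unknown is rejected.
const allowedByAllowlist = (hash) => allowlist.has(hash);
// Blocklist policy: anything unknown is accepted.
const allowedByBlocklist = (hash) => !blocklist.has(hash);

const freshFingerprint = '22222222222222222222222222222222'; // placeholder never-seen hash
console.log(allowedByAllowlist(freshFingerprint)); // false
console.log(allowedByBlocklist(freshFingerprint)); // true
```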

You can generate a new fingerprint in Node.js by rearranging the cipher order. The default cipher list for Node.js v19 is:

node -p crypto.constants.defaultCoreCipherList | tr ':' '\n' 
TLS_AES_256_GCM_SHA384 
TLS_CHACHA20_POLY1305_SHA256 
TLS_AES_128_GCM_SHA256 
ECDHE-RSA-AES128-GCM-SHA256 
ECDHE-ECDSA-AES128-GCM-SHA256 
ECDHE-RSA-AES256-GCM-SHA384 
ECDHE-ECDSA-AES256-GCM-SHA384 
DHE-RSA-AES128-GCM-SHA256 
ECDHE-RSA-AES128-SHA256 
DHE-RSA-AES128-SHA256 
ECDHE-RSA-AES256-SHA384 
DHE-RSA-AES256-SHA384 
ECDHE-RSA-AES256-SHA256 
DHE-RSA-AES256-SHA256 
HIGH 
!aNULL 
!eNULL 
!EXPORT 
!DES 
!RC4 
!MD5 
!PSK 
!SRP 
!CAMELLIA

If you change the order of the above list in any way, you'll get a new fingerprint. But how should you go about these changes?

If you're familiar with ciphers, you'll notice that the first three are all highly recommended TLS 1.3 ciphers, so all modern clients list them as their first options, although in different orders.

From our previous Wireshark capture, we can see that Chrome uses these same ciphers as the first options, but in the following order.

Image: Chrome ciphers

It's safe to leave the first three as they are and shuffle the remaining ciphers. With this configuration, you can bypass the TLS fingerprint check using Node.js:

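const crypto = require('crypto');
const https = require('https');
const axios = require('axios');

const nodeOrderedCipherList = crypto.constants.defaultCipherList.split(':');

// keep the most important ciphers in the same order
const fixedCipherList = nodeOrderedCipherList.slice(0, 3);

// shuffle the rest
const shuffledCipherList = nodeOrderedCipherList.slice(3)
	.map(cipher => ({ cipher, sort: Math.random() }))
	.sort((a, b) => a.sort - b.sort)
	.map(({ cipher }) => cipher);

// rebuild the colon-separated cipher string that Node.js expects
const ciphers = [...fixedCipherList, ...shuffledCipherList].join(':');

// send the request through an agent that uses the reordered ciphers
// (https://example.com is a placeholder URL)
const httpsAgent = new https.Agent({ ciphers });
axios.get('https://example.com', { httpsAgent })
	.then(response => console.log(response.status))
	.catch(error => console.error(error.message));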

You can go even further and reorder the first three ciphers, but you must be careful. Some cipher list rearrangements can compromise the security of your requests, so if you're working on a security-sensitive project, make sure you do your research. Also, bypassing solutions like Akamai fingerprinting can require a bit more work. The goal here is to make sure your fingerprint isn't so rare that it gets blocklisted.
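To check the claim that any reordering yields a new fingerprint, the sketch below takes the JA3 full string from the Chrome capture shown earlier, swaps two cipher IDs after the first three, and compares the MD5 digests. The md5 helper and the choice of which two IDs to swap are assumptions made for this illustration.

```javascript
const crypto = require('crypto');

const md5 = (text) => crypto.createHash('md5').update(text).digest('hex');

// JA3 full string from the Chrome capture earlier in the article.
const original = '771,4867-4865-4866-52393-52392-49195-49199-49196-49200-49171-49172-156-157-47-53,0-23-65281-10-11-35-16-5-13-18-51-45-43-27-17513-21-41,29-23-24,0';

// Swap the 4th and 5th cipher IDs (52393 and 52392), leaving the first three alone.
const fields = original.split(',');
const cipherIds = fields[1].split('-');
[cipherIds[3], cipherIds[4]] = [cipherIds[4], cipherIds[3]];
fields[1] = cipherIds.join('-');
const reordered = fields.join(',');

// Same set of ciphers, different order, different fingerprint.
console.log(md5(original) === md5(reordered)); // false
```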

Other ways to bypass TLS Fingerprinting

There are other available methods for running a TLS fingerprint bypass. The most popular options include the following:

Headless browsers

When you run a browser in headless mode while scraping, you get the fingerprint of that browser. So the web server sees you as a browser web client.

Python

You can bypass TLS fingerprint detection in Python by spoofing the cipher suite list and TLS version using a custom HTTPAdapter with the requests library.

Java

In Java, you can reconfigure the list of enabled cipher suites through the ssl-config.enabledCipherSuites setting. You can find out more in its documentation.

Go

Go supports JA3 signature spoofing by letting you set the five fields of the client hello message that the JA3 algorithm uses to identify TLS signatures. This is possible with Golang libraries such as Refraction Networking's utls or ja3transport.

Conclusion

Although we have gone through some tips on how to scrape a web page without getting blocked, in this tutorial we focused more on bypassing TLS fingerprinting.

We know that escaping TLS fingerprinting is no mean feat. However, you can succeed by changing your fingerprint to one that isn't blocklisted.

As a recap, here's how to bypass TLS fingerprint:
  1. Get the default cipher list.
  2. Keep the order of the first three ciphers fixed.
  3. Shuffle the rest.

Remember that even after applying these tips, it might not scale to thousands of websites and you can be blocked. Don't waste time on all of that! At ZenRows, our web scraping API can handle thousands of requests per second whilst bypassing TLS fingerprinting, antibots and CAPTCHA. You can try ZenRows for free today.
