Canvas Fingerprinting: What Is It and How to Bypass It

March 20, 2023 · 7 min read

Canvas fingerprinting is among the smartest and most popular tests used to get web scrapers blocked. Let's learn how it works and how to bypass it!

What Is Canvas Fingerprinting?

It's surprising how much information you share when connecting to a website: your operating system, screen resolution, time zone, etc. Even though these seem generic and irrelevant on their own, in combination they create a unique fingerprint (or profile) that identifies you with 99.99%+ accuracy.

Canvas fingerprinting is one of the techniques used within browser fingerprinting, and about 5.5% of the most popular websites use it. It runs graphical challenges in the browser to collect many data points. Some of the reasons it's used are:

  • Better security: Identifying bots to prevent malicious attacks. If the website detects any irregularity in the client's behavior, it can act quickly and restrict access.
  • Personalized user experience: Consumers claim they're more inclined to choose a website if it offers a personalized experience. Canvas fingerprinting helps give users suggested content based on previous behavior.

So how does this process work?

How Does Canvas Fingerprinting Work?

On each webpage visit, a JavaScript snippet instructs the browser to draw graphics with random elements and backgrounds on the HTML5 canvas. The resulting image provides the information needed to create a unique user fingerprint. For example, a sample canvas fingerprinting script generates an image like the one below.

Sample Canvas Fingerprint

Different computers may render the canvas image above differently. This is because factors such as operating systems, image processing engines, and compression levels, among others, vary per device. For example, modern devices with state-of-the-art GPUs and high screen resolutions use techniques like hinting and anti-aliasing to improve the appearance of images.

Even if the generated image looks the same to you on different devices, a computer can tell the differences at the pixel level.

Here's a quick breakdown of the steps necessary for a website to generate a canvas fingerprint:

  1. A user visits a website.
  2. The site triggers its JS-based canvas fingerprinting script.
  3. The script draws a hidden-to-the-eye image on the HTML5 canvas in the browser.
  4. The script creates a Base64 representation of the image, which reflects how the client's OS, browser, and GPU rendered it.
  5. It then computes a hash of the representation.
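
To make steps 2 to 5 more concrete, here's a minimal sketch of what such a script could look like. It isn't taken from any real anti-bot product: the canvas size, the drawn text, and the function names are all assumptions, and a simple FNV-1a hash stands in for whatever hash function a real script uses.

```javascript
// Hypothetical sketch of a canvas fingerprinting script (steps 2-5 above).

// Steps 3-4: draw on a canvas and export it as a Base64 data URL.
// `doc` is the page's `document` object.
function renderCanvasDataUrl(doc) {
  // The canvas is never attached to the page, so the user doesn't see it
  const canvas = doc.createElement('canvas');
  canvas.width = 220;
  canvas.height = 30;
  const ctx = canvas.getContext('2d');
  ctx.textBaseline = 'top';
  ctx.font = '14px Arial';
  ctx.fillStyle = '#f60';
  ctx.fillRect(100, 1, 62, 20);
  ctx.fillStyle = '#069';
  ctx.fillText('Canvas fingerprint 1.0', 2, 5);
  return canvas.toDataURL('image/png');
}

// Step 5: reduce the long data URL to a short hash (32-bit FNV-1a, as hex).
function fnv1a(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return (hash >>> 0).toString(16).padStart(8, '0');
}

function canvasFingerprint(doc) {
  return fnv1a(renderCanvasDataUrl(doc));
}
```

In a browser console, you'd call canvasFingerprint(document). Since the pixels depend on how the device renders the text and shapes, the resulting hash can differ from one machine to another.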

You can use BrowserLeaks to see your own browser's canvas fingerprint:

Fingerprint Example

One crucial element of the image above is the PNG hash value, which represents the image data. More importantly, we need to understand hashing (how this value was generated) to bypass canvas fingerprinting.

Hashing converts data of any size into a fixed-length string of characters, and different inputs practically always produce different strings. Let's see why it's used and how it works next.

How Does Hashing Work in Canvas Fingerprinting?

Data generated by a canvas image is usually large and difficult to store, so that's where a hashing function comes in. It takes the long dataset and reduces it to a short, fixed-length value, known as a hash.

Hashing is used to generate canvas fingerprints because it produces the same results for the same inputs. Here's an example:

The sentence "canvas fingerprinting" in a SHA-256 hashing function will yield the following:

Output
fb2b4c2da0dfaa3bcbf89caf59389d4604739a0490137c970eb55c44c1105f89

However, if we run the same sentence but with a space before the letter "c", we'll get a new hash:

Output
620fe0d249aa4d17524ae4c3b3332a8be2913a750bb151bf225794cdcb5ba4c1

You've seen it for yourself! Canvas images might look the same to the human eye, yet they produce different results if there are any variations at the image/system levels. Remember this, as it plays a key role in the bypassing process.
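
If you'd like to try the comparison yourself, here's a small sketch using Node's built-in crypto module. The helper name sha256Hex is ours, and we assume the text is hashed as UTF-8.

```javascript
const crypto = require('crypto');

// Return the SHA-256 hash of a string as a 64-character hex string
function sha256Hex(text) {
  return crypto.createHash('sha256').update(text, 'utf8').digest('hex');
}

const original = sha256Hex('canvas fingerprinting');
const withLeadingSpace = sha256Hex(' canvas fingerprinting');

console.log(original);
console.log(withLeadingSpace);
// The same input always gives the same hash; one extra space changes it
console.log(original === sha256Hex('canvas fingerprinting')); // true
console.log(original === withLeadingSpace); // false
```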

How to Bypass Canvas Fingerprinting?

Ideally, you could scrape without getting blocked by disabling the canvas API. However, this comes with some trade-offs. For example, some websites rely on the canvas to display content. So, without the canvas API, certain features may not work.

Another approach would be to disable JavaScript. However, most websites depend on it to display content.

Our goal here is different. We want to have a fingerprint that's acceptable to the website. So, how do you go about creating the "right" one? Two popular methods are using headless browsers and enabling anti-canvassing extensions.

Remember: any slight variation results in a unique fingerprint. So by altering the canvas image data or the image directly, we generate a new one. Let's see this in action using Puppeteer:

Prerequisites

To follow this tutorial, you need:

  • Node and npm, which you can install from the official website.
  • Puppeteer.
Terminal
npm install puppeteer

Note: The same result can be achieved using other languages and headless browsers.

Method #1: Bypass Canvas Fingerprinting Using Base Puppeteer

How can you simulate fingerprints? You need to identify the script responsible for canvas fingerprinting and replace it with an adapted version.

Let's assume the website draws a 209px by 25px image, and generate a fake fingerprint for that case.

Firstly, import Puppeteer and launch a new browser instance.

program.js
// Import the Puppeteer library
const puppeteer = require('puppeteer');


(async () => {
  // Launch a new browser instance and create a new page
  const browser = await puppeteer.launch({ headless: false });
  const page = await browser.newPage();

Next, define a function that overrides the toDataURL() method of the HTMLCanvasElement prototype. It saves a reference to the original toDataURL() method, checks whether the website is exporting an image with the dimensions stated above, and returns a fake fingerprint if so. It must be registered with evaluateOnNewDocument() so that it runs whenever a new document is created and before any of the page's scripts execute.

program.js
    await page.evaluateOnNewDocument(() => {
        // keep a reference to the native method
        const mainFunction = HTMLCanvasElement.prototype.toDataURL;
        HTMLCanvasElement.prototype.toDataURL = function (type) {
            // check if this is a fingerprint attempt
            if (type === 'image/png' && this.width === 209 && this.height === 25) {
                // return fake fingerprint (an empty placeholder here)
                return '';
            }
            // otherwise, just use the main function
            return mainFunction.apply(this, arguments);
        };
    });

So, your complete code should look like this:

program.js
const puppeteer = require('puppeteer');

(async () => {
    const browser = await puppeteer.launch({ headless: false });
    const page = await browser.newPage();

    await page.evaluateOnNewDocument(() => {
        const mainFunction = HTMLCanvasElement.prototype.toDataURL;
        HTMLCanvasElement.prototype.toDataURL = function (type) {
            // check if this is a fingerprint attempt
            if (type === 'image/png' && this.width === 209 && this.height === 25) {
                // return fake fingerprint
                return '';
            }
            // otherwise, just use the main function
            return mainFunction.apply(this, arguments);
        };
    });
    await page.goto('https://browserleaks.com/canvas');
})();

Here's our result:

Fake Fingerprint

Generally, any image results in a new fingerprint. However, if yours is too rare, you might still get blocked. Thus, a safer bet would be to imitate a legitimate browser's image. You can copy one or create it yourself with an image editor of your choice.
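
As a rough sketch of that idea, the override can return a data URL recorded from a real browser instead of an empty string. The helper below is hypothetical (its name and parameters are ours, not part of Puppeteer), and it also treats a toDataURL() call without a type as PNG, since PNG is the default format.

```javascript
// Hypothetical helper: make canvases of a given size report a pre-recorded
// data URL instead of their real pixels.
// canvasProto: normally HTMLCanvasElement.prototype
// fakeDataUrl: a data URL you captured from a legitimate browser
function patchToDataURL(canvasProto, fakeDataUrl, targetWidth, targetHeight) {
  const original = canvasProto.toDataURL;
  canvasProto.toDataURL = function (type, ...rest) {
    // toDataURL() exports PNG when no type is given
    const isPng = type === undefined || type === 'image/png';
    if (isPng && this.width === targetWidth && this.height === targetHeight) {
      return fakeDataUrl;
    }
    // Any other canvas or format goes through the native method
    return original.call(this, type, ...rest);
  };
}
```

Keep in mind that Puppeteer serializes the function you pass to evaluateOnNewDocument(), so this logic has to live inside that callback rather than be referenced from your Node script. You can pass the recorded data URL as an extra argument to evaluateOnNewDocument().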

Method #2: Enable Anti-canvassing Extensions

If you can't mimic an actual browser's image, there's another approach: downloading and enabling anti-canvassing extensions using a headless browser.

Canvas Fingerprint Defender, a Chrome extension, is a good option as it generates a random fake fingerprint close to that of a real user. Let's do a quick test, first without the extension:

program.js
const puppeteer = require('puppeteer');

(async () => {
    const browser = await puppeteer.launch({ headless: false });
    const page = await browser.newPage();

    await page.goto('https://browserleaks.com/canvas');
})();

Our base Puppeteer script produces a fingerprint with 99% uniqueness, as in the image below. That sounds highly unique, but you share it with 3083 Puppeteer-generated fingerprints that may already be blocklisted as bots, so you can still get blocked.

Puppeteer-generated Fingerprint

What happens when we add the extension, though?

Install the extension in your browser, then head to your code editor and add the path to the extension folder.

program.js
const puppeteer = require('puppeteer');
 
(async () => {
    // Path to extension folder--- replace with your extension path
    const pathToExtension = 'C:\\Users\\Chesc\\AppData\\Local\\Google\\Chrome\\User Data\\Default\\Extensions\\lanfdkkpgfjfdikkncbnojekcppdebfp\\0.2.0_0';

Note: Chrome stores installed extensions on your local disk. To find one, get the extension's ID from your browser, then navigate to the folder where Chrome keeps extensions on your device and locate the subfolder with the same ID.
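
If you'd rather not hardcode the path, here's a hedged sketch that builds it from the pieces. The helper name extensionDir is ours, and the folder layouts are Chrome's usual default-profile locations as far as we know; yours may differ, so confirm the "Profile Path" shown at chrome://version before relying on it.

```javascript
const path = require('path');

// Hypothetical helper: build the folder of an unpacked extension version.
// The per-OS folders below are assumed defaults for Chrome's "Default" profile.
function extensionDir(platform, homeDir, extensionId, version) {
  const segments = {
    win32: ['AppData', 'Local', 'Google', 'Chrome', 'User Data', 'Default', 'Extensions'],
    darwin: ['Library', 'Application Support', 'Google', 'Chrome', 'Default', 'Extensions'],
    linux: ['.config', 'google-chrome', 'Default', 'Extensions'],
  }[platform];
  if (!segments) {
    throw new Error(`Unsupported platform: ${platform}`);
  }
  // Use the separator style of the target platform, not of the current one
  const joiner = platform === 'win32' ? path.win32 : path.posix;
  return joiner.join(homeDir, ...segments, extensionId, version);
}
```

For the current machine, you could call it as extensionDir(process.platform, require('os').homedir(), 'lanfdkkpgfjfdikkncbnojekcppdebfp', '0.2.0_0'), replacing the ID and version with the ones installed in your browser.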

Then enable the extension.

program.js
try {
        console.log('==>Open Browser');
        const browser = await puppeteer.launch({
            // Disable headless mode
            headless: false,
            // Pass the options to install the extension
            args: [
                `--disable-extensions-except=${pathToExtension}`,
                `--load-extension=${pathToExtension}`,
                ]
        });

Our complete code should look like this:

program.js
const puppeteer = require('puppeteer');

(async () => {
    // Path to extension folder--- replace with your extension path
    const pathToExtension = 'C:\\Users\\Chesc\\AppData\\Local\\Google\\Chrome\\User Data\\Default\\Extensions\\lanfdkkpgfjfdikkncbnojekcppdebfp\\0.2.0_0';
    try {
        console.log('==>Open Browser');
        const browser = await puppeteer.launch({
            // Disable headless mode
            headless: false,
            // Pass the options to install the extension
            args: [
                `--disable-extensions-except=${pathToExtension}`,
                `--load-extension=${pathToExtension}`,
                ]
        });

        const page = await browser.newPage();
        // Navigate to browser leaks 
        await page.goto('https://browserleaks.com/canvas');


    }
    catch (err) {
        console.error(err);
    }
})();

And below is our result:

Unique Fingerprint

Bingo! You have a 100% unique fingerprint.

This is what the Defender does behind the scenes to randomize your fingerprint:

Example
const getImageData = CanvasRenderingContext2D.prototype.getImageData;

const noisify = function (canvas, context) {
  const width = canvas.width, height = canvas.height;
  // ... noisify imageData ...
  context.putImageData(imageData, 0, 0);
};

Object.defineProperty(CanvasRenderingContext2D.prototype, "getImageData", {
  value: function () {
    noisify(this.canvas, this);
    return getImageData.apply(this, arguments);
  },
});

// Something similar for HTMLCanvasElement.toBlob and HTMLCanvasElement.toDataURL

Canvas fingerprinting retrieves image data using three functions: toBlob(), getImageData(), and toDataURL(). The extension injects JavaScript into the document to redefine these functions. So when a website calls them to create a fingerprint, it receives randomly altered data.

toBlob() and toDataURL() are redefined on the HTML canvas element (HTMLCanvasElement), while getImageData() is redefined on the 2D rendering context (CanvasRenderingContext2D).
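
The noise step itself is elided in the excerpt above. Purely as an illustration, and not the extension's actual code, here's one way such a step could work: change a handful of random color values by 1, which is invisible to the eye but enough to change the hash. The function name and its parameters are assumptions.

```javascript
// Hypothetical noise step: nudge the color channels of a few random pixels.
// data: RGBA byte array such as ImageData.data (4 bytes per pixel)
// random: function returning a number in [0, 1), Math.random by default
function addNoise(data, random = Math.random, pixelsToChange = 10) {
  const pixelCount = Math.floor(data.length / 4);
  for (let n = 0; n < pixelsToChange && pixelCount > 0; n++) {
    const pixel = Math.floor(random() * pixelCount);
    const channel = Math.floor(random() * 3); // 0 = R, 1 = G, 2 = B; alpha is left alone
    // Flipping the lowest bit changes the value by exactly 1 and stays within 0-255
    data[pixel * 4 + channel] ^= 1;
  }
  return data;
}
```

In a page, you could apply it to the result of getImageData() and write it back with putImageData(), as the noisify function above does. Passing your own random function makes the noise reproducible, which helps if you want the same fake fingerprint throughout a session.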

Conclusion

Canvas fingerprinting is easy for websites to implement, and widely used anti-bot systems like Cloudflare and DataDome include it. For scrapers, though, it's challenging to bypass.

Here's a quick recap of everything you learned today:

  • What canvas fingerprinting is and how it works.
  • How to bypass it using base Puppeteer.
  • How to create fake fingerprints by enabling anti-canvassing extensions.

While those methods could bring you success in some cases, they still pose certain risks and limitations. To spare yourself time and effort and ensure you reach your scraping goals without any obstacles, check out ZenRows, a web scraping API that works with Python, NodeJS, PHP, Java, Golang and any other language. Sign up now and get 1,000 free successful requests.
