
How to Bypass Bot Detection

October 28, 2022 · 11 min read

Many websites use anti-bot technologies. These make extracting data from them through web scraping more difficult. In this article, you'll learn the most commonly adopted bot protection techniques and how you can bypass bot detection.

Bots generate almost half of the world's Internet traffic, and many of them are malicious. This is why so many sites implement bot detection systems. Such technologies block requests that they don't recognize as executed by humans. As a result, bot detection is a problem for your scraping process.

Let's learn everything you need to know about bot mitigation and the most popular bot protection approaches. Of course, you'll see how to defeat them.

What Is Bot Detection?

Bot detection (or mitigation) is using technology to figure out whether a user is a real human being or a bot. Specifically, these technologies collect data and/or apply statistical models to identify patterns, actions, and behaviors that mark traffic as coming from an automated source.

A bot is an automated software application programmed to perform specific tasks. Bots imitate human behavior and interact with web pages and real users. Note that not all of them are bad: even Google crawls the Internet with bots.

According to the 2022 Imperva Report, bot traffic made up 42.3% of all Internet activity in 2021. This makes detection techniques a critical aspect of security. That's especially true, considering 27.7% of traffic comes from bad bots.

As you can see, malicious bots are very popular, and they indiscriminately target both small and large businesses. So, bot mitigation has become vitally important. That's why more and more sites are adopting such protection techniques.

Unsurprisingly, these systems form a large part of the anti-scraping technologies out there, meaning they can block your spiders. After all, web scrapers are software applications that automatically crawl several pages. That makes them bots.

If you want your scraper to be effective, you must know how to bypass bot detection. Generally speaking, you have to avoid triggering the anti-scraping measures in place.

To learn more, dig into our article on the seven anti-scraping techniques you need to know or try our guide on web scraping without getting blocked.


How Do You Get Past Bot Detection?

We'll start with the general tips everyone should be aware of. Always apply them, as they allow your scraper to overcome most obstacles.

Since bot detection is about collecting data, you should shield your scraper behind a web proxy. A web scraping proxy server acts as an intermediary between you and your target website's server. While doing this, it prevents your IP address and some HTTP headers from being exposed.

This allows you to protect your identity and makes fingerprinting techniques, such as browser and TLS fingerprinting, less effective. A website creates a digital fingerprint when it manages to profile you. This process works by looking at data points such as your computer specs, browser version, browser extensions, and preferences.

In other words, the idea is to uniquely identify you based on your settings and hardware. Then, a bot detection system can step in and verify whether your identity is real.

As a general solution to bot detection, you should introduce randomness into your scraper. For example, you could introduce random pauses into the crawling process; after all, no human being works 24/7 nonstop. Also, change your IP and HTTP headers as often as possible. This makes your requests more difficult to track.
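As a simple illustration, here's a minimal sketch of a crawling loop with random pauses, built with Python's requests library; the target URLs are placeholders:

scraper.py
import random
import time

import requests

# placeholder list of pages to crawl
urls = [
	"https://targetwebsite.com/page1",
	"https://targetwebsite.com/page2",
	"https://targetwebsite.com/page3",
]

for url in urls:
	response = requests.get(url)
	# process response.text here...

	# pause for a random number of seconds to mimic a human user
	time.sleep(random.uniform(1, 5))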

As you can see, all these solutions are pretty general. If you want to avoid bot detection, you may need more effective approaches. Bypassing bot detection is usually more complicated than this, but learning the most used techniques and the ways to get around them will surely come in handy.

Top Five Bot Detection Solutions and How To Bypass Them

If you want your scraping process to work relentlessly, you need to overcome several obstacles. Bot detection is one of them.

So, let's dig into the five most adopted and effective anti-bot solutions and how to bypass them.

1. IP Address Reputation

One of the most widely adopted approaches is IP tracking.

The system tracks all the requests a website receives. If too many come from the same IP in a limited time, the system blocks the IP. This happens because only software can make that many requests in such a short time.

It could also block an IP because all its requests come at regular intervals. Once again, this is something that only a bot can do. No human being can act so programmatically.

What's important to notice here is that these anti-bot systems can undermine your IP reputation forever. Reputation measures the behavioral quality of an IP. In other terms, it quantifies the number of unwanted requests sent from the same address.

If your reputation deteriorates, this could represent a serious problem for your scraper, especially if you aren't using any IP protection system. Verify with Project Honey Pot whether your IP has been compromised.

The only way to protect your IP reputation is to use a rotation system. Keep in mind that premium proxy servers offer IP rotation. You can use a proxy with the Python Requests library to bypass bot detection as follows:

scraper.py
import requests 
 
# defining the proxy servers 
proxies = { 
	"http" : "http://yourhttpproxyserver.com:8080", 
	"https" : "http://yourhttpsproxyserver.com:8090", 
} 
 
# your web scraping target URL 
url = "https://targetwebsite.com/example" 
 
# performing an HTTP request with a proxy 
response = requests.get(url, proxies=proxies)

All you have to do is define a proxies dictionary specifying the HTTP and HTTPS connections. This variable maps each protocol to a proxy URL the premium service provides you with. Then, pass it to requests.get() via the proxies parameter. Learn more about proxies in requests.
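If your provider gives you a list of proxies rather than a single rotating endpoint, you can emulate rotation yourself by picking a random proxy for each request. Here's a minimal sketch, with placeholder proxy URLs:

scraper.py
import random

import requests

# placeholder proxy URLs: replace them with the ones your provider gives you
proxy_pool = [
	"http://proxy1.yourprovider.com:8080",
	"http://proxy2.yourprovider.com:8080",
	"http://proxy3.yourprovider.com:8080",
]

# your web scraping target URL
url = "https://targetwebsite.com/example"

# picking a random proxy for this request
proxy = random.choice(proxy_pool)
response = requests.get(url, proxies={"http": proxy, "https": proxy})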

ZenRows offers an excellent premium proxy service. Try it and see for yourself.

2. HTTP Headers and User-Agent Tracking

Bot detection technologies typically analyze HTTP headers to identify malicious requests. If a request doesn't contain an expected set of values in some key headers, the system blocks it.

Most commonly, the system singles out the User-Agent header as the most important one. This contains information that identifies the browser, OS, and/or vendor version from which the request came. If the request doesn't appear to originate from a browser, the bot detection system is likely to identify it as coming from a script. In other words, your web crawlers should always set a valid User-Agent header.

The anti-bot system may look at the Referer header. This string contains an absolute or partial address of the web page the request comes from. If this is missing, the system may mark the request as malicious.

You can set your headers with requests to bypass bot detection as below:

scraper.py
import requests 
 
# defining the custom headers 
headers = { 
	"User-Agent": "Mozilla/5.0 (Windows NT 10.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/107.0.0.0 Safari/537.36", 
	"Referer": "https://targetwebsite.com/page1" 
} 
 
# your web scraping target URL 
url = "https://targetwebsite.com/example" 
 
# performing an HTTP request with custom headers 
response = requests.get(url, headers=headers)

Define a dictionary that stores your custom HTTP headers. Then, pass it to requests.get() through the headers parameter. Learn more about custom headers in requests.
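Since anti-bot systems also flag a User-Agent that never changes, it can help to rotate that header too. Here's a minimal sketch that picks a random User-Agent from a hand-curated list on each request:

scraper.py
import random

import requests

# a small pool of real-world User-Agent strings to rotate through
user_agents = [
	"Mozilla/5.0 (Windows NT 10.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/107.0.0.0 Safari/537.36",
	"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/107.0.0.0 Safari/537.36",
	"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/107.0.0.0 Safari/537.36",
]

# your web scraping target URL
url = "https://targetwebsite.com/example"

# picking a random User-Agent for this request
headers = {"User-Agent": random.choice(user_agents)}
response = requests.get(url, headers=headers)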

3. JavaScript Challenges

A JavaScript challenge is a technique bot protection systems use to prevent bots from visiting a given web page. A single page can contain hundreds of JS challenges. All users, even legitimate ones, will have to pass them to access the content.

You can think of it as any kind of challenge executed by the browser via JS. A browser that can run JavaScript will face one automatically. This means the challenges run transparently, and the user might not even be aware of them.

Some of them may take time to run, though. This results in a delay of several seconds in page loading. In this case, the bot detection system may display a waiting screen like the one below:

[Image: the waiting screen of a Cloudflare JavaScript challenge]

If you see this on your target website, you now know that it uses a bot detection system. This means that if your scraper doesn't have a JavaScript stack, it won't be able to execute and pass the challenge.

Since web crawlers usually execute server-to-server requests, no browsers are involved. This means no JavaScript and no way to bypass bot detection. In other words, if you want to pass a JavaScript challenge, you have to use a browser.

So, your spider should adopt headless browser technology, such as Selenium or Puppeteer. For example, Selenium launches a real browser with no UI to execute requests. So, when using the software, your scraper opens the target page in a browser, and this helps it bypass bot detection.
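As a starting point, here's a minimal sketch that loads a page with Selenium and headless Chrome, assuming Selenium 4 and a local Chrome/ChromeDriver installation; the URL is a placeholder:

scraper.py
from selenium import webdriver
from selenium.webdriver.chrome.options import Options

# configuring Chrome to run with no visible UI
options = Options()
options.add_argument("--headless")

driver = webdriver.Chrome(options=options)

# the real browser loads the page and executes its JavaScript,
# including any JS challenge
driver.get("https://targetwebsite.com/example")
print(driver.page_source)

driver.quit()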

Approaching a JS challenge and solving it isn't easy, but it's possible. Check out our dedicated guides to learn how.

4. Activity Analysis

Activity analysis is about collecting and analyzing data to understand whether the current user is a human or a bot. In detail, such a system continuously tracks and processes user data.

Doing so, it looks for well-known patterns of human behavior. If it doesn't find enough of them, the system recognizes the user as a bot. Then, it can block it or challenge it with a JS challenge or CAPTCHA.

You can try to defeat these systems by stopping their data collection. First, verify whether your target website collects user information. To do this, examine the XHR section in the DevTools' Network tab.

[Image: the DevTools Network tab showing the XHR requests used for data collection]

Look for suspicious POST or PATCH requests that trigger when you perform an action on the page. As in the example above, these requests generally send encoded data. Keep in mind that activity analysis collects user info via JavaScript, so check which JS file performs these requests. You can see it in the "Initiator" column.

Now, block its execution. Note that this approach might not work or even make the situation worse. Anyway, here's how you can do it with Pyppeteer (the Python port of Puppeteer):

scraper.py
import asyncio 
from pyppeteer import launch 
 
# defining the request event handler function 
async def intercept_request(request): 
	# if the request comes from the user data collection JS file, block it 
	if request.url.endswith("79y983fxwwcc.js"): 
		await request.abort() 
	else: 
		await request.continue_() 
 
async def main(): 
	browser = await launch() 
	page = await browser.newPage() 
 
	# activating the request interception on Pyppeteer to block specific requests on this page 
	await page.setRequestInterception(True) 
 
	# registering the request event handler 
	page.on("request", lambda request: asyncio.ensure_future(intercept_request(request))) 
 
	# visiting the target page 
	await page.goto("https://yourtargetwebsite.com") 
	await browser.close() 
 
asyncio.run(main())

This uses the Puppeteer request interception feature to block unwanted data collection requests. To see what else Python has to offer when it comes to web scraping, take a look at our complete guide on web scraping in Python.

This is just an example. Keep in mind that finding ways to bypass bot detection here is very difficult. That's because these systems use AI and machine learning to learn and evolve, so a workaround might not keep working for long. At the same time, advanced anti-bot bypass services such as ZenRows offer solutions to get around them.

5. CAPTCHAS

A CAPTCHA is a special kind of challenge-response authentication adopted to determine whether a user is human. It provides tests to visitors that are hard for computers to perform but easy for human beings to solve.

Google's reCAPTCHA is one of the market's most advanced and effective bot mitigation systems. Over five million sites use it, which makes CAPTCHAs one of the most popular anti-bot protection systems. Also, users have gotten used to them and generally don't mind dealing with them.

[Image: an example of a Google reCAPTCHA challenge]

One of the best ways to pass the challenge is to rely on a CAPTCHA farm. These companies offer automated services that scrapers can query to have a pool of human workers solve the tests for you. However, the fastest and cheapest option is to use a web scraping API that is smart enough to avoid blocking screens. Find out more on how to automate CAPTCHA solving.
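For illustration, calling such an API generally boils down to a single GET request. The endpoint and parameters below are hypothetical; check your provider's documentation for the real ones:

scraper.py
import requests

# hypothetical scraping API endpoint and parameters:
# refer to your provider's docs for the actual values
response = requests.get(
	"https://api.scrapingprovider.com/v1/",
	params={
		"apikey": "<YOUR_API_KEY>",
		"url": "https://targetwebsite.com/example",
	},
)
print(response.text)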

Conclusion

You've got an overview of what you need to know about bot detection, including some standard to advanced ways to bypass it. As shown here, there are many ways your scraper can be identified as a bot and blocked. At the same time, there are some precautions and additional techniques you can employ to ensure that doesn't happen.

What matters is to know how these bot detection technologies work, so you'll be better prepared to deal with them.

Since bypassing them is pretty challenging, you can opt for a professional service. ZenRows API provides advanced scraping capabilities that allow you to forget about bot detection problems. Save yourself headaches and many coding hours now. Sign up and try ZenRows for free.

Did you find the content helpful? Spread the word and share it on Twitter or LinkedIn.

