How to Bypass "Please Verify You Are a Human"

October 14, 2023 · 4 min read

What Is "Please Verify You Are a Human" from PerimeterX

The "Please verify you are a human" message means the website wants you to confirm that you're a real person, not a bot. The check is meant to keep malicious automated systems out. Unfortunately, it can also appear when you scrape a website with tools like Selenium or Playwright.

To verify, you'll be given a task that's easy for humans but hard for bots, such as solving a visual puzzle, answering a question, or performing a specific action. Bypassing PerimeterX (known as HUMAN nowadays) typically requires the "Press & Hold" action, and the screen will look similar to the one below:

PerimeterX Verification Screen

How Do I Avoid "Please Verify You Are a Human" from PerimeterX

You can avoid this challenge by relying on the following techniques:

A. Basic Solution: Press and Hold with a Headless Browser

A headless browser has no graphical user interface (GUI) but still allows automated interaction with web pages, such as submitting forms.

Since the PerimeterX challenge typically appears on the initial request to the website, a headless browser driven by a tool such as Selenium can mimic the press-and-hold action. Let's look at a generic implementation.

First, set up your Selenium driver in headless mode and navigate to your target URL. The press-and-hold button will most likely be injected programmatically into an iframe element, so we have to wait until that iframe is available before selecting the button. Then, use Selenium's Actions API to press the button, hold it for about ten seconds, and release it.

program.py
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.wait import WebDriverWait
import selenium.webdriver.support.expected_conditions as EC
from selenium.webdriver.common.action_chains import ActionChains
import time

# create ChromeOptions object
options = webdriver.ChromeOptions()
options.add_argument("--headless") # Headless mode

# create a new Chrome webdriver instance, passing in the options object
driver = webdriver.Chrome(options=options)
driver.get("your_url")

# Switch to iframe that directly houses pX "Press & Hold" button
WebDriverWait(driver, timeout=300).until(EC.frame_to_be_available_and_switch_to_it("iframe_name_or_id"))
# Get button element
btn = driver.find_element(By.XPATH, "//xpath_to_button")
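# Initialize for low-level interactions
action = ActionChains(driver)
# Press the button and keep it held down
action.click_and_hold(btn).perform()
# Keep holding for 10s
time.sleep(10)
# Release the button; a fresh chain is used, and nothing happens until perform() runs
ActionChains(driver).release(btn).perform()
# Switch back to the main document before continuing
driver.switch_to.default_content()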

# ...continue scraping

driver.quit()

Great! Did it work in your case? If not, keep reading.

In some scenarios, you may be unable to select the press-and-hold button. That happens when PerimeterX injects the button into a random iframe among several sibling iframes and renders them all in a closed shadow DOM. The contents of a closed shadow DOM can't be reached from page JavaScript, so they're also inaccessible to your Selenium driver.

Fortunately, there's a way out! Even though the button is in a closed shadow DOM, that doesn't stop it from being focused on. So, we can leverage keystrokes to trigger a browser focus on the button using the Tab key and simulate pressing it with the Enter key.

Let's test this out on Fiverr, as it implements its press-and-hold button that way. First, import the necessary dependencies and set up your Selenium driver:

program.py
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.action_chains import ActionChains
from selenium.webdriver.common.keys import Keys
import time

driver = webdriver.Chrome()
driver.get('https://www.fiverr.com')

Fiverr has a list of categories on its homepage that link to their respective category pages. Assuming we want to scrape the content of the first category page, we can select the first category link and mimic a click on it.

Category "Link" on Fiverr
program.py
# ...

# find_elements returns an empty list (instead of raising) when nothing matches
links = driver.find_elements(By.CSS_SELECTOR, 'ul.categories-list > li:first-child > a')

if links:
    links[0].click() # Triggers a navigation

# Let later element lookups retry for up to 10 seconds while the new page loads
driver.implicitly_wait(10)

Because clicking on the link triggers navigation, we set an implicit wait of 10 seconds so that later element lookups keep retrying while the new page loads. During the navigation, Fiverr will likely redirect us to the challenge page. A div with an id of px-captcha wraps the button and shadow DOM, so its presence after navigation means we've been sent to the challenge page.
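The same check can also be run against raw HTML, which is handy when you want to test the detection logic without a browser. The following is a minimal sketch, assuming the px-captcha container sits in the regular DOM as described above; the function name `is_px_challenge` is hypothetical and not part of Selenium.

```python
from html.parser import HTMLParser


class _CaptchaFinder(HTMLParser):
    """Flags whether any element in the document carries id="px-captcha"."""

    def __init__(self):
        super().__init__()
        self.found = False

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the current tag
        if dict(attrs).get("id") == "px-captcha":
            self.found = True


def is_px_challenge(html: str) -> bool:
    """Return True if the HTML contains the PerimeterX challenge container."""
    finder = _CaptchaFinder()
    finder.feed(html)
    return finder.found


# With Selenium, this would typically be called as is_px_challenge(driver.page_source)
print(is_px_challenge('<html><body><div id="px-captcha"></div></body></html>'))
print(is_px_challenge("<html><body><h1>Category page</h1></body></html>"))
```

Only the container is inspected here, so the closed shadow DOM that hides the button itself doesn't get in the way.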

If we're on the challenge page, we can send the Tab and Enter strokes like this:

program.py
# ...

# find_elements returns an empty list (instead of raising) when nothing matches
btn_container = driver.find_elements(By.ID, 'px-captcha')
if btn_container:
    # we've been blocked so run bypass logic
    print('PRESS & HOLD container found')
    # Tab moves focus to the button; Enter is then held for 10 seconds and released
    ActionChains(driver).pause(5)\
        .send_keys(Keys.TAB)\
        .pause(5)\
        .key_down(Keys.ENTER)\
        .pause(10)\
        .key_up(Keys.ENTER)\
        .perform()
else:
    print('PRESS & HOLD container not found')

# Keep the browser open for a while to observe the result
time.sleep(100)

# ...continue scraping

driver.quit()

As you can see below, it worked!

Press and Hold Simulation
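A screenshot confirms the result visually, but a script needs to confirm it programmatically. One possible approach, sketched below under the assumption that the px-captcha container disappears once the challenge is passed, is to re-check for the container after each attempt and retry a limited number of times. The helper `solve_with_retries` and its parameters are hypothetical names introduced for illustration.

```python
import time
from typing import Callable


def solve_with_retries(
    is_blocked: Callable[[], bool],
    attempt_bypass: Callable[[], None],
    max_attempts: int = 3,
    settle_seconds: float = 2.0,
) -> bool:
    """Run attempt_bypass until is_blocked() is False or attempts run out.

    Returns True if the challenge is gone, False otherwise.
    """
    for _ in range(max_attempts):
        if not is_blocked():
            return True
        attempt_bypass()
        # Give the page a moment to react before checking again
        time.sleep(settle_seconds)
    return not is_blocked()


# Possible wiring with the Selenium objects used in this article
# (hold_enter is a hypothetical function wrapping the Tab + Enter chain):
#
# passed = solve_with_retries(
#     is_blocked=lambda: bool(driver.find_elements(By.ID, "px-captcha")),
#     attempt_bypass=lambda: hold_enter(driver),
# )
```

Passing the check and the bypass as callables keeps the retry logic independent of Selenium, so it can be exercised with simple stand-in functions.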

While that does help with the press-and-hold problems, we may still be detected as bots by PerimeterX. To avoid that, we can use supplementary libraries, which leads us to the next solution.

B. Pro Solution: Anti-bot Bypass Tools

Because PerimeterX uses a combination of fingerprinting and behavior analysis, the "Please verify you are a human" challenge can come up at any time when scraping. However, we can harden our headless browser by patching the properties that are known to give automation away. For example, tools like Undetected ChromeDriver for Selenium or the Puppeteer Stealth plugin decrease the likelihood of anti-bot software identifying the browser.
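As a rough illustration, here's how a driver might be created with Undetected ChromeDriver instead of the stock Selenium one. This is a minimal sketch, assuming the `undetected-chromedriver` package is installed (`pip install undetected-chromedriver`) and a compatible Chrome is available; the wrapper name `make_stealth_driver` is hypothetical, and whether the patched driver gets past PerimeterX on a given site isn't guaranteed.

```python
def make_stealth_driver():
    """Create a Chrome driver patched by undetected-chromedriver."""
    # Imported lazily because it's an optional third-party dependency
    import undetected_chromedriver as uc

    # Extra Chrome flags can be added to this options object if needed
    options = uc.ChromeOptions()
    return uc.Chrome(options=options)


# Possible usage (the returned object supports the usual Selenium driver calls):
#
# driver = make_stealth_driver()
# try:
#     driver.get("https://www.fiverr.com")
#     print(driver.title)
# finally:
#     driver.quit()
```

The rest of the scraping code can stay the same, since the patched driver is used like a regular Selenium Chrome driver.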

Keep in mind that the security level of PerimeterX differs from site to site. For instance, the press-and-hold button may be difficult to select, or you may face additional protection from PerimeterX alternatives. As a result, keeping up with bypass strategies can take time and effort.

Therefore, your best bet is to employ a premium solution like ZenRows to handle the complexity of getting blocked while web scraping and bypass the challenge. ZenRows includes premium proxies and everything needed to get around PerimeterX, which saves you hundreds of development hours. In addition, it's cheaper than implementing all the tools for bypassing individually. Sign up to get your 1,000 free API credits now.
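For orientation, a request through such an API usually boils down to a single HTTP GET that wraps the target URL. The sketch below uses only the standard library; the endpoint and the parameter names (`apikey`, `url`, `js_render`, `premium_proxy`) are assumptions that should be checked against the current ZenRows documentation, and the API key is a placeholder.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint; verify against the official documentation
ZENROWS_ENDPOINT = "https://api.zenrows.com/v1/"


def build_zenrows_url(target_url: str, api_key: str) -> str:
    """Build the request URL that asks the API to fetch target_url."""
    params = {
        "apikey": api_key,
        "url": target_url,
        "js_render": "true",      # assumed flag: render the page in a headless browser
        "premium_proxy": "true",  # assumed flag: route through premium proxies
    }
    return f"{ZENROWS_ENDPOINT}?{urlencode(params)}"


def fetch(target_url: str, api_key: str, timeout: float = 60.0) -> str:
    """Fetch the page HTML through the API (performs a real network call)."""
    with urlopen(build_zenrows_url(target_url, api_key), timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


# Possible usage:
# html = fetch("https://www.fiverr.com", "<YOUR_ZENROWS_API_KEY>")
```

Keeping the URL construction separate from the network call makes it easy to inspect exactly what will be requested before spending API credits.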
