You'll often encounter CAPTCHAs while scraping with Scrapy. Unless you find a way to bypass them, they'll block your spider from reaching the data.
In this article, you'll learn the different ways of bypassing CAPTCHAs in Scrapy.
Can You Solve CAPTCHAs with Scrapy?
You can solve CAPTCHAs in Scrapy using three methods: a web scraping API, a CAPTCHA resolver, or rotating premium proxies.
A web scraping API helps you avoid CAPTCHAs and other anti-bot protections completely. Rotating premium proxies keeps CAPTCHA services from flagging your IP address. CAPTCHA resolvers work by passing the challenge to a dedicated CAPTCHA-solving service or a human.
All these methods help you focus on data extraction without worrying about getting blocked. The next sections show you how to implement each in full detail.
Method #1: Bypass any CAPTCHA with a Web Scraping API
As powerful as Scrapy is for web scraping, it can be blocked by CAPTCHAs or other anti-bot protection. The best solution to bypass CAPTCHAs is to use a web scraping API like ZenRows. It provides everything you need to avoid CAPTCHA challenges, including premium proxy rotation, JavaScript rendering capabilities, automatic header management, browser fingerprint randomization, and more. Let's see how ZenRows performs against a protected page like the Antibot Challenge page.
Start by signing up for a new account, and you'll get to the Request Builder.

Paste the target URL, enable JS Rendering, and activate Premium Proxies.
Next, select Python and click on the API connection mode. Then, copy the generated code and paste it into your script.
# pip3 install requests
import requests

url = "https://www.scrapingcourse.com/antibot-challenge"
apikey = "<YOUR_ZENROWS_API_KEY>"
params = {
    "url": url,
    "apikey": apikey,
    "js_render": "true",
    "premium_proxy": "true",
}
# send the request through the ZenRows API and print the returned HTML
response = requests.get("https://api.zenrows.com/v1/", params=params)
print(response.text)
The generated code uses Python's Requests library as the HTTP client. You can install this library using pip:
pip3 install requests
Run the code, and you'll successfully access the page:
<html lang="en">
<head>
<!-- ... -->
<title>Antibot Challenge - ScrapingCourse.com</title>
<!-- ... -->
</head>
<body>
<!-- ... -->
<h2>
You bypassed the Antibot challenge! :D
</h2>
<!-- other content omitted for brevity -->
</body>
</html>
Congratulations! 🎉 You’ve successfully bypassed the anti-bot challenge page using ZenRows. This works for any website.
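The generated snippet uses Requests, but the same API call can be routed through a Scrapy spider. Here's a minimal sketch of one way to do that, assuming you simply point the spider at the API endpoint instead of the target page. The helper name build_zenrows_url and the ZENROWS_ENDPOINT constant are hypothetical names chosen for illustration; the endpoint and the query parameters are the ones from the generated code above.

```python
from urllib.parse import urlencode

# endpoint taken from the generated Requests snippet
ZENROWS_ENDPOINT = "https://api.zenrows.com/v1/"


def build_zenrows_url(target_url, api_key):
    """Return the API URL that asks ZenRows to fetch target_url."""
    params = {
        "url": target_url,
        "apikey": api_key,
        "js_render": "true",
        "premium_proxy": "true",
    }
    # urlencode escapes the target URL so it survives as a query value
    return f"{ZENROWS_ENDPOINT}?{urlencode(params)}"


api_url = build_zenrows_url(
    "https://www.scrapingcourse.com/antibot-challenge",
    "<YOUR_ZENROWS_API_KEY>",
)
print(api_url)
```

In a spider, you could yield scrapy.Request(api_url, callback=self.parse) wherever the spider creates its initial requests. Keep in mind that response.url will then be the API URL rather than the target page's URL.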
Method #2: Use a CAPTCHA Resolver
You can solve CAPTCHAs in Scrapy with CAPTCHA-resolving services. Most solving services, like 2CAPTCHA, employ human solvers, so a request might take some time to complete.
The approach to solving a CAPTCHA with 2CAPTCHA depends on the CAPTCHA provider. While most CAPTCHAs require the target website's site key, others require you to download and upload the CAPTCHA image to the 2CAPTCHA server.
You'll solve the reCAPTCHA demo on the 2CAPTCHA website to see how that works. Here's the unsolved CAPTCHA:

To solve that with 2CAPTCHA, install the solver package using pip:
pip install 2captcha-python
You need two things to solve the reCAPTCHA: your 2CAPTCHA API key and the target website's site key.
Sign up on the 2CAPTCHA website and grab your API key from your dashboard.

You'll find the site key in the target website's HTML. Launch the demo website in a browser like Chrome and right-click on the CAPTCHA box. Then click "Inspect". Expand the outer element and look for the data-sitekey attribute, as shown below:

It's time to write your web scraping code.
Start your spider class by defining a function that solves the reCAPTCHA. Pass your API key to the TwoCaptcha instance and use the recaptcha method to solve the CAPTCHA based on the site key and the target URL.
# import the required libraries
import scrapy
from twocaptcha import TwoCaptcha


class TutorialSpider(scrapy.Spider):
    name = "scraper"
    start_urls = ["https://2captcha.com/demo/recaptcha-v2-callback"]

    def solve_with_2captcha(self, sitekey, url):
        # start the 2CAPTCHA instance
        captcha2_api_key = "YOUR_2CAPTCHA_API_KEY"
        solver = TwoCaptcha(captcha2_api_key)
        try:
            # resolve the CAPTCHA
            result = solver.recaptcha(sitekey=sitekey, url=url)
            if result:
                print(f"Solved: {result}")
                return result["code"]
            else:
                print("CAPTCHA solving failed")
                return None
        except Exception as e:
            print(e)
            return None
Next, write another function to use the previous solver function. This function passes the response URL and site key to the solver function. It then executes the scraping logic if successful.
class TutorialSpider(scrapy.Spider):
    # ...

    def solve_captcha(self, response):
        # specify reCAPTCHA sitekey, replace with the target site key
        captcha_sitekey = "6LfD3PIbAAAAAJs_eEHvoOl75_83eXSqpPSRFJ_u"
        # call the CAPTCHA solving function
        captcha_solved = self.solve_with_2captcha(captcha_sitekey, response.url)
        # check if CAPTCHA is solved and proceed with scraping
        if captcha_solved:
            print("CAPTCHA solved successfully")
            # extract elements after solving CAPTCHA successfully
            element = response.css("title::text").get()
            print("Scraped element:", element)
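The site key above is hard-coded. If the data-sitekey attribute is present in the HTML your spider receives, you could read it from the page instead. Here's a minimal sketch using only Python's standard library. The names SiteKeyFinder and find_sitekey and the sample HTML are made up for illustration, and the sketch assumes the attribute is in the static HTML rather than injected later by JavaScript.

```python
from html.parser import HTMLParser


class SiteKeyFinder(HTMLParser):
    """Remember the first data-sitekey attribute found in the markup."""

    def __init__(self):
        super().__init__()
        self.sitekey = None

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the current tag
        if self.sitekey is None:
            self.sitekey = dict(attrs).get("data-sitekey")


def find_sitekey(html):
    """Return the first data-sitekey value in html, or None if absent."""
    finder = SiteKeyFinder()
    finder.feed(html)
    return finder.sitekey


# made-up markup standing in for a page that embeds a reCAPTCHA widget
sample_html = '<div class="g-recaptcha" data-sitekey="EXAMPLE_SITE_KEY"></div>'
print(find_sitekey(sample_html))  # prints: EXAMPLE_SITE_KEY
```

Inside a Scrapy callback, the selector response.css("[data-sitekey]::attr(data-sitekey)").get() should give the same result without a custom parser.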
Finally, define the parse method. It sends a request that uses the above function as its callback.
class TutorialSpider(scrapy.Spider):
    # ...

    def parse(self, response):
        # send a request to solve the CAPTCHA using the solver function as a callback
        yield scrapy.Request(url=response.url, callback=self.solve_captcha)
Here's the final code:
# import the required libraries
import scrapy
from twocaptcha import TwoCaptcha


class TutorialSpider(scrapy.Spider):
    name = "scraper"
    start_urls = ["https://2captcha.com/demo/recaptcha-v2-callback"]

    def solve_with_2captcha(self, sitekey, url):
        # start the 2CAPTCHA instance
        solver = TwoCaptcha("<YOUR_2CAPTCHA_API_KEY>")
        try:
            # resolve the CAPTCHA
            result = solver.recaptcha(sitekey=sitekey, url=url)
            if result:
                print(f"Solved: {result}")
                return result["code"]
            else:
                print("CAPTCHA solving failed")
                return None
        except Exception as e:
            print(e)
            return None

    def solve_captcha(self, response):
        # specify reCAPTCHA sitekey
        captcha_sitekey = "6LfD3PIbAAAAAJs_eEHvoOl75_83eXSqpPSRFJ_u"
        # call the CAPTCHA solving function
        captcha_solved = self.solve_with_2captcha(captcha_sitekey, response.url)
        # check if CAPTCHA is solved and proceed with scraping
        if captcha_solved:
            print("CAPTCHA solved successfully")
            # extract elements after solving CAPTCHA successfully
            element = response.css("title::text").get()
            print("Scraped element:", element)

    def parse(self, response):
        # send a request to solve the CAPTCHA using the solver function as a callback
        yield scrapy.Request(url=response.url, callback=self.solve_captcha)
The code solves the reCAPTCHA successfully and returns the solution token in the code field, as shown:
Solved: {
'captchaId': '75653786097',
'code': '03AFcWeA7Ap7jFxiBmNjBbwiHSGjMCD_oP3Ae8cUxzdtqJnNkj4XnuUJOUFRfUkkjU_GPCXwqHYYFCynXdrQhAQce-F...'
}
CAPTCHA solved successfully
Scraped element: How to solve reCAPTCHA V2 Callback on PHP, Java, Python, Go, Csharp, CPP
That's it! You just solved a CAPTCHA with 2CAPTCHA. However, remember that 2CAPTCHA doesn't solve all CAPTCHAs and can be expensive for large-scale projects.
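The spider above stops once it has the token. On a real site, you'd usually send that token back to the server. Here's a minimal sketch of preparing that submission, assuming the target site accepts the token through a standard form field named g-recaptcha-response, which is the field reCAPTCHA v2 widgets populate. The helper name build_captcha_formdata and the sample email field are hypothetical.

```python
def build_captcha_formdata(token, other_fields=None):
    """Merge the solved token into the fields a form would submit."""
    # copy so the caller's dictionary is left untouched
    formdata = dict(other_fields or {})
    # reCAPTCHA v2 widgets post the solved token in this field
    formdata["g-recaptcha-response"] = token
    return formdata


# placeholder token and a made-up extra form field
formdata = build_captcha_formdata(
    "TOKEN_FROM_2CAPTCHA",
    {"email": "user@example.com"},
)
print(formdata)
```

In the spider, you could pass this dictionary to scrapy.FormRequest through its formdata argument, together with the form's action URL. That URL and any extra fields depend on the target site, so inspect its form before relying on this.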
Method #3: Rotate Premium Proxies
Proxy rotation can help bypass CAPTCHAs, but it's less effective than the two previous methods. Some websites limit the number of requests from each IP address and serve a CAPTCHA to those that exceed the limit.
Rotating proxies spreads your requests across many IP addresses, which makes it harder for the server to tie them to a single source. That lowers the chance of triggering rate-limit CAPTCHAs and of runtime interruptions due to IP bans.
However, ensure you use premium proxies when dealing with CAPTCHAs because the free ones usually don't work. There are also many CAPTCHA-compatible proxies available.
You can use proxies with Scrapy and also rotate them. Check our full tutorial on using proxies with Scrapy to learn more.
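As a rough illustration of rotation, here's a minimal sketch of a downloader middleware that assigns the next proxy from a list to each outgoing request. It relies on Scrapy reading the proxy from request.meta["proxy"]. The class name RotatingProxyMiddleware and the PROXY_LIST setting are hypothetical, and the sketch assumes the list contains at least one proxy URL.

```python
import itertools


class RotatingProxyMiddleware:
    """Assign proxies to outgoing requests in round-robin order."""

    def __init__(self, proxies):
        # cycle() repeats the list endlessly; it must not be empty
        self._proxies = itertools.cycle(proxies)

    @classmethod
    def from_crawler(cls, crawler):
        # PROXY_LIST is a hypothetical custom setting, not a built-in one
        return cls(crawler.settings.getlist("PROXY_LIST"))

    def process_request(self, request, spider):
        # Scrapy picks up the proxy to use from request.meta["proxy"]
        request.meta["proxy"] = next(self._proxies)
        # returning None lets the request continue through the chain
        return None
```

To try it, you'd register the class in the DOWNLOADER_MIDDLEWARES setting with a priority below 750 so it runs before the built-in HttpProxyMiddleware, and define PROXY_LIST in your settings with your premium proxy URLs.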
Conclusion
This article has highlighted the various techniques of bypassing CAPTCHAs in Scrapy. You've learned to achieve this with a web scraping API, a CAPTCHA-solving service, and premium proxy rotation.
As mentioned, the best of the three is a web scraping API, and ZenRows comes out on top. ZenRows is an all-in-one web scraping solution for bypassing CAPTCHAs and other anti-bot systems. Try ZenRows for free!