Struggling with failed requests in Python? The Requests library makes HTTP calls easy, but errors like timeouts or bad responses can still get in the way.
Python Requests Retry Logic Explained
Retrying failed requests means having your code automatically resend them when something goes wrong. But not all errors should trigger a retry: it's best reserved for temporary issues like timeouts or server errors.
Retrying after things like 404s or authentication failures isn’t helpful and can even get you blocked.
Exponential Backoff Strategy for Retry Delays
Exponential backoff is a technique that increases the delay between retries instead of using fixed or random intervals.
The backoff algorithm is as simple as the following:
backoff_factor * (2 ** (current_number_of_retries - 1))
The backoff factor is multiplied by 2 raised to the power of the retry count minus 1 (counting from a retry count of 0, so the first delay is half the backoff factor). For example, here are the delay sequences in seconds for backoff factors 2, 3, and 10:
```
# backoff factor 2
1, 2, 4, 8, 16, 32, 64, 128
# backoff factor 3
1.5, 3, 6, 12, 24, 48, 96, 192
# backoff factor 10
5, 10, 20, 40, 80, 160, 320, 640
```
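To see where these sequences come from, here's a quick sketch that prints the delays the formula above produces for the first eight retries:

```python
def backoff_delay(backoff_factor, retry_count):
    # backoff_factor * (2 ** (retry_count - 1))
    return backoff_factor * (2 ** (retry_count - 1))

for factor in (2, 3, 10):
    delays = [backoff_delay(factor, retry) for retry in range(8)]
    print(f"backoff factor {factor}: {delays}")
```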
So, how many times should you retry failed Python requests when using exponential backoff?
Keep the retry count reasonable: capping the number of retries prevents you from overwhelming the server with endless requests. Temporary errors like 429 Too Many Requests deserve more retries than permanent ones. A common approach is to set a maximum of 3 to 5 retries, tuned to the specific site's response patterns and your scraping objectives.
Later in this article, you'll use this knowledge to create your request retry strategy. But first, you need to understand the types of failed requests.
Types of Failed Requests
Understanding the reasons behind a failed request will allow you to develop mitigation strategies for each case. Generally, there are two categories of failed requests:
- Requests that timed out due to server problems (there's no HTTP response from the server).
- Requests that returned an HTTP error.
Let's quickly see each one.
Requests That Timed Out
A timeout occurs when the server doesn't return an HTTP response. It could be due to server overload, server response issues, or a slow network. When faced with timeout scenarios, consider checking your internet connection. A stable connection may suggest that the problem is server-related.
In Python, you can catch timeout-related exceptions, such as `requests.exceptions.Timeout`, and implement a retry mechanism conditionally or with strategies like exponential backoff.
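Here's a minimal sketch of that idea (the target URL, timeout, and retry count are just for illustration):

```python
import time

import requests

url = "https://www.scrapingcourse.com/ecommerce"

for attempt in range(3):
    try:
        # give up on the request if no response arrives within 5 seconds
        response = requests.get(url, timeout=5)
        print(response.status_code)
        break
    except requests.exceptions.Timeout:
        # simple exponential backoff before the next attempt: 1s, 2s, 4s
        time.sleep(2 ** attempt)
else:
    print("request timed out on every attempt")
```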
Requests That Returned an HTTP Error
In this case, an HTTP client establishes a connection with the server but receives an HTTP error message and status code detailing what went wrong. This scenario indicates that the server is active, but the request can't be processed successfully. For instance, this could be a 403 Forbidden error:
403 Client Error: Forbidden for url: https://www.scrapingcourse.com/cloudflare-challenge
Your first approach to addressing this issue is to review the HTTP status code and error message while ensuring the request is formed correctly. If you suspect the error results from a temporary problem or server issues, you may retry the request cautiously.
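For example, you can surface HTTP errors as exceptions with `raise_for_status()` and inspect the status code before deciding whether a retry is worthwhile. A minimal sketch, using the retryable status codes discussed below:

```python
import requests

response = requests.get("https://www.scrapingcourse.com/cloudflare-challenge")
try:
    # raises requests.exceptions.HTTPError for 4xx and 5xx responses
    response.raise_for_status()
    print("request succeeded")
except requests.exceptions.HTTPError as error:
    status = error.response.status_code
    if status in (429, 500, 502, 503, 504):
        print(f"transient error {status}: consider retrying with backoff")
    else:
        print(f"error {status}: retrying is unlikely to help")
```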
HTTP Status Error Codes for Failed Python Requests
Errors in client-server communication fall into the 4xx (client error) and 5xx (server error) status code ranges. They include:
- 400 Bad Request.
- 401 Unauthorized.
- 403 Forbidden.
- 404 Not Found.
- 405 Method Not Allowed.
- 408 Request Timeout.
- 429 Too Many Requests.
- 500 Internal Server Error.
- 501 Not Implemented.
- 502 Bad Gateway.
- 503 Service Unavailable.
- 504 Gateway Timeout.
- 505 HTTP Version Not Supported.
The most common ones you'll see while web scraping are:
| Error Code | Explanation |
| --- | --- |
| 403 Forbidden | The server understands the request but won't fulfill it because the client lacks the required permissions or access. |
| 429 Too Many Requests | The server has received too many requests from the same IP within a given time frame, a common sign of rate limiting during web scraping. |
| 500 Internal Server Error | A generic server error occurred, indicating that something went wrong on the server while processing the request. |
| 502 Bad Gateway | The server, acting as a gateway or proxy, received an invalid response from an upstream server. |
| 503 Service Unavailable | The server is too busy or undergoing maintenance and can't handle the request at the moment. |
| 504 Gateway Timeout | An upstream server didn't respond quickly enough to the gateway or proxy. |
You can check out the MDN docs for more information on HTTP response status codes.
While we've provided an overview of some retry concepts, we'll dive into more detail in the following section.
Method 1: Retry Python Requests With HTTPAdapter
Python Requests uses the urllib3 HTTP client under the hood. You can set up retries in Python with Requests' HTTP adapter class and the Retry utility class from the urllib3 package. The HTTPAdapter class lets you specify a retry strategy and change the request behavior.
To implement a simple Python Requests retry logic with the `HTTPAdapter` class, import the required libraries and define your options. In this example, we set the maximum number of retries to 4 and only reattempt if the error has a status code of 403, 429, 500, 502, 503, or 504:
```python
# pip3 install requests
import requests
from requests.adapters import HTTPAdapter
from urllib3.util import Retry

# define the retry strategy
retry_strategy = Retry(
    total=4,  # maximum number of retries
    status_forcelist=[
        403,
        429,
        500,
        502,
        503,
        504,
    ],  # the HTTP status codes to retry on
)
```
Pass the retry strategy to the `HTTPAdapter` in a new adapter object. Then, mount the adapter to a session object and use it for all requests:
```python
# ...

# create an HTTP adapter with the retry strategy and mount it to the session
adapter = HTTPAdapter(max_retries=retry_strategy)

# create a new session object
session = requests.Session()
session.mount("http://", adapter)
session.mount("https://", adapter)

# make a request using the session object
response = session.get("https://www.scrapingcourse.com/cloudflare-challenge")
if response.status_code == 200:
    print(f"SUCCESS: {response.text}")
else:
    print(f"FAILED with status {response.status_code}")
```
You've now implemented a simple retry logic using the HTTPAdapter. Let's add a backoff factor.
Sessions and HTTPAdapter With a Backoff Strategy
To set increasing delays between retries with the backoff strategy, add the `backoff_factor` parameter to the `Retry` configuration:
```python
# ...
retry_strategy = Retry(
    # ...
    backoff_factor=2,  # exponential backoff factor
    # ...
)

# ... mount the adapter and define your scraping logic
```
Combine all the snippets, and the final code looks like this:
```python
# pip3 install requests
import requests
from requests.adapters import HTTPAdapter
from urllib3.util import Retry

# define the retry strategy
retry_strategy = Retry(
    total=4,  # maximum number of retries
    backoff_factor=2,
    status_forcelist=[
        429,
        500,
        502,
        503,
        504,
    ],  # the HTTP status codes to retry on
)

# create an HTTP adapter with the retry strategy and mount it to the session
adapter = HTTPAdapter(max_retries=retry_strategy)

# create a new session object
session = requests.Session()
session.mount("http://", adapter)
session.mount("https://", adapter)

# make a request using the session object
response = session.get("https://www.scrapingcourse.com/ecommerce")
if response.status_code == 200:
    print(f"SUCCESS: {response.text}")
else:
    print(f"FAILED with status {response.status_code}")
```
You just added exponential backoff to your scraper using the HTTPAdapter.
Interested in building a custom retry strategy instead? Keep reading in the next section.
Method 2: Code Your Own Retry Wrapper
Unlike the previous option, we'll create a custom wrapper for the retry logic. Doing it yourself lets you implement custom error handlers, logs, and more.
To keep it straightforward, let's create a Python function (`retry_request`) that replicates the retry logic from method 1. The function accepts the target URL as its first argument, the maximum number of retries (`total`), and a `status_forcelist` specifying which errors should trigger a retry:
```python
# pip3 install requests
import requests

# define the retry strategy
def retry_request(
    url,
    total=4,
    status_forcelist=[
        403,
        429,
        500,
        502,
        503,
        504,
    ],
    **kwargs,
):
    # store the last response in an empty variable
    last_response = None

    # implement retry
    for _ in range(total):
        try:
            response = requests.get(url, **kwargs)
            if response.status_code in status_forcelist:
                # track the last response
                last_response = response
                # retry request
                continue
            else:
                return response
        except requests.exceptions.ConnectionError:
            pass

    # log the response after the retry
    return last_response


response = retry_request("https://www.scrapingcourse.com/ecommerce")
if response.status_code == 200:
    print(f"SUCCESS: {response.text}")
else:
    print(f"FAILED with status {response.status_code}")
```
Now, let's improve the above code with an exponential backoff.
Retry Python Requests With a Custom Backoff Strategy
To retry Python Requests with a custom backoff, take the previous code as a base. Then, create a separate function named `backoff_delay` to calculate the delay:
```python
# ...

# define a backoff function
def backoff_delay(backoff_factor, attempts):
    # backoff algorithm
    delay = backoff_factor * (2 ** (attempts - 1))
    return delay
```
Update the previous code with the backoff factor and implement the exponential delay with the time module:
```python
# ...
from time import sleep

# ...

# define the retry strategy
def retry_request(
    # ...,
    backoff_factor=2,
    # ...,
):
    # ...

    # implement retry
    for attempt in range(total):
        try:
            # ...
            if response.status_code in status_forcelist:
                # implement backoff
                delay = backoff_delay(backoff_factor, attempt)
                sleep(delay)
                print(f"retrying in {delay} seconds")
                # ...
            else:
                # ...
        except requests.exceptions.ConnectionError:
            # ...
    # ...
```
Here's the complete code after combining the snippets:
```python
# pip3 install requests
import requests
from time import sleep

# define a backoff function
def backoff_delay(backoff_factor, attempts):
    # backoff algorithm
    delay = backoff_factor * (2 ** (attempts - 1))
    return delay

# define the retry strategy
def retry_request(
    url,
    backoff_factor=2,
    total=4,
    status_forcelist=[
        403,
        429,
        500,
        502,
        503,
        504,
    ],
    **kwargs,
):
    # store the last response in an empty variable
    last_response = None

    # implement retry
    for attempt in range(total):
        try:
            response = requests.get(url, **kwargs)
            if response.status_code in status_forcelist:
                # implement backoff
                delay = backoff_delay(backoff_factor, attempt)
                sleep(delay)
                print(f"retrying in {delay} seconds")
                # track the last response
                last_response = response
                # retry request
                continue
            else:
                return response
        except requests.exceptions.ConnectionError:
            pass

    # log the response after the retry
    return last_response


response = retry_request("https://www.scrapingcourse.com/ecommerce")
if response.status_code == 200:
    print(f"SUCCESS: {response.text}")
else:
    print(f"FAILED with status {response.status_code}")
```
Are you still getting blocked despite your backoffs? Let's solve that quickly in the next section!
Avoid Getting Blocked by Error 403 With Python Requests
Getting blocked is the biggest problem in web scraping. Some websites may block your IP address or use anti-bot measures like Cloudflare to prevent you from accessing the site if you are detected as a bot.
Unfortunately, delaying requests, retrying, or using exponential backoffs are usually insufficient to stop these anti-bot mechanisms.
To prove this, let's attempt to scrape the Antibot Challenge page using the previous scraper:
```python
# pip3 install requests
import requests
from time import sleep

# define a backoff function
def backoff_delay(backoff_factor, attempts):
    # backoff algorithm
    delay = backoff_factor * (2 ** (attempts - 1))
    return delay

# define the retry strategy
def retry_request(
    url,
    backoff_factor=2,
    total=4,
    status_forcelist=[
        403,
        429,
        500,
        502,
        503,
        504,
    ],
    **kwargs,
):
    # store the last response in an empty variable
    last_response = None

    # implement retry
    for attempt in range(total):
        try:
            response = requests.get(url, **kwargs)
            if response.status_code in status_forcelist:
                # implement backoff
                delay = backoff_delay(backoff_factor, attempt)
                sleep(delay)
                print(f"retrying in {delay} seconds")
                # track the last response
                last_response = response
                # retry request
                continue
            else:
                return response
        except requests.exceptions.ConnectionError:
            pass

    # log the response after the retry
    return last_response


response = retry_request("https://www.scrapingcourse.com/antibot-challenge")
if response.status_code == 200:
    print(f"SUCCESS: {response.text}")
else:
    print(f"FAILED with status {response.status_code}")
```
The above code retried the request four times but still failed with a 403 Forbidden error despite implementing exponential backoff:
```
retrying in 1.0 seconds
retrying in 2 seconds
retrying in 4 seconds
retrying in 8 seconds
FAILED with status 403
```
You can use web scraping proxies to solve the above scenario. However, proxies alone aren't sufficient, and specific libraries and customizations are unlikely to keep up with evolving anti-bot systems.
So, how can you bypass the 403 forbidden error more efficiently?
Many developers solve the problem with web scraping solutions, such as ZenRows' Universal Scraper API. With a single API call, this tool provides the complete toolkit required to bypass any anti-bot measure. It features premium proxy rotation, anti-bot auto-bypass, JavaScript rendering support, and more.
With a 99.93% average success rate, the Universal Scraper API provides free automatic request retries following best practices. This ensures you get all the data you want at scale without limitations.
You can integrate the ZenRows' Universal Scraper API seamlessly with Python Requests. Let's see it in action!
Sign up on ZenRows to open the Request Builder. Paste your target URL in the link box and activate Premium Proxies and JS Rendering.

Then, select Python as your programming language and choose the API connection mode. Copy the generated Python code and paste it into your scraper.
The generated Python code should look like this:
```python
# pip install requests
import requests

url = "https://www.scrapingcourse.com/antibot-challenge"
apikey = "<YOUR_ZENROWS_API_KEY>"
params = {
    "url": url,
    "apikey": apikey,
    "js_render": "true",
    "premium_proxy": "true",
}
response = requests.get("https://api.zenrows.com/v1/", params=params)
print(response.text)
```
The above code bypasses the anti-bot challenge and outputs the protected site's full-page HTML:
```html
<html lang="en">
<head>
    <!-- ... -->
    <title>Antibot Challenge - ScrapingCourse.com</title>
    <!-- ... -->
</head>
<body>
    <!-- ... -->
    <h2>
        You bypassed the Antibot challenge! :D
    </h2>
    <!-- other content omitted for brevity -->
</body>
</html>
```
Nice job 🎉! You just used Python's Requests and ZenRows' scraper API to bypass anti-bot protection.
Recommended: Retry Python Requests With a Decorator
Using a decorator to implement retries offers a cleaner approach. You can easily apply the Python Requests retry logic in the decorator to multiple methods or functions.
Instead of coding the decorator yourself, you can use Tenacity, a community-maintained package that simplifies adding retry behavior to requests.
Start by installing Tenacity:
pip3 install tenacity
The `retry` decorator from Tenacity takes arguments such as `stop` for the maximum number of attempts and `wait` for the delay strategy between retries, among others. Feel free to learn more from the official Tenacity documentation.
Configure the `retry` decorator as shown:
```python
# pip3 install requests tenacity
# ...
from tenacity import retry, stop_after_attempt, wait_exponential

# define the retry decorator
@retry(
    stop=stop_after_attempt(4),  # maximum number of attempts
    wait=wait_exponential(multiplier=5, min=4, max=5),  # exponential backoff
    reraise=True,  # re-raise the original exception if all attempts fail
)
```
Now, place your scraper function directly under the decorator. Update the above snippet, and you'll get this complete code:
```python
# pip3 install requests tenacity
import requests
from tenacity import retry, stop_after_attempt, wait_exponential

# define the retry decorator
@retry(
    stop=stop_after_attempt(4),  # maximum number of attempts
    wait=wait_exponential(multiplier=5, min=4, max=5),  # exponential backoff
    reraise=True,  # re-raise the original exception if all attempts fail
)
def scraper(url):
    response = requests.get(url)
    return response

# target url
url = "https://www.scrapingcourse.com/ecommerce"

try:
    response = scraper(url)
    if response.status_code == 200:
        print(f"SUCCESS: {response.text}")
    else:
        print(f"FAILED with status code: {response.status_code}")
except requests.RequestException as e:
    print(f"FAILED: {e}")
```
The above code uses Tenacity's retry decorator to control your Python request retry logic.
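Tenacity can also limit retries to specific failure types, in line with the earlier advice to retry only temporary errors. Here's a sketch (the URL and timeout values are illustrative) that combines `retry_if_exception_type` with `reraise`:

```python
# pip3 install requests tenacity
import requests
from tenacity import (
    retry,
    retry_if_exception_type,
    stop_after_attempt,
    wait_exponential,
)

@retry(
    stop=stop_after_attempt(4),
    wait=wait_exponential(multiplier=2),
    # only retry on connection errors and timeouts, not on every exception
    retry=retry_if_exception_type(
        (requests.exceptions.ConnectionError, requests.exceptions.Timeout)
    ),
    reraise=True,  # surface the original exception if all attempts fail
)
def scraper(url):
    return requests.get(url, timeout=10)

print(scraper("https://www.scrapingcourse.com/ecommerce").status_code)
```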
Best Practices for Python Requests Retry
While implementing request retries, here are the best practices you should follow to build an efficient web scraper:
- Apply timeouts and fallbacks appropriately to prevent indefinite request retries.
- Use reasonable request retry attempts (e.g., 3 to 5) and avoid fixed delays.
- Retry failed requests for specific error types.
- Adhere to server retry instructions, such as the [Retry-After](https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Retry-After) response header (see the sketch after this list).
- Apply asynchronous support with libraries like aiohttp and asyncio to boost performance.
- Use sessions to reuse connections and improve subsequent load times.
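As a minimal sketch combining two of these points, session reuse and the `Retry-After` header, here's one way it could look (the helper name is illustrative, and it assumes `Retry-After` arrives as a number of seconds rather than an HTTP date):

```python
import time

import requests

session = requests.Session()  # reuse connections across requests

def get_respecting_retry_after(url, max_retries=3):
    response = session.get(url, timeout=10)
    for attempt in range(max_retries):
        if response.status_code != 429:
            return response
        # prefer the server's instruction; fall back to exponential backoff
        retry_after = response.headers.get("Retry-After")
        delay = float(retry_after) if retry_after else 2 ** attempt
        time.sleep(delay)
        response = session.get(url, timeout=10)
    return response

print(get_respecting_retry_after("https://www.scrapingcourse.com/ecommerce").status_code)
```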
Conclusion
Handling failed requests is critical to building a robust and reliable web scraper. In this tutorial, we showed you the importance of retrying failed requests and how to apply retry logic to your Python scraper using third-party libraries and custom methods with the Requests library.
Now you know:
- The most essential Python Requests retry logic considerations.
- The main options for implementing retries: HTTPAdapter, a custom retry wrapper, and the Tenacity decorator.
- How to apply exponential backoff to each approach.
Remember that most websites use anti-bot measures to prevent you from scraping their data. To overcome this barrier and scrape any website at scale without getting blocked, we recommend using a web scraping API like ZenRows.
Try ZenRows for free now without a credit card!