
NodeJS: Retry Failed Requests

January 10, 2024 · 6 min read

Experiencing failed requests? We've all been there. Network issues, server downtimes, or unexpected errors can disrupt client-server connections. Luckily, implementing NodeJS retry strategies offers an active solution.

In this tutorial, you'll learn how to implement NodeJS retry mechanisms that automatically attempt to recover from such failures.

What to Know to Build the NodeJS Requests' Retry Logic

Building a robust NodeJS retry logic requires an understanding of various concepts. From identifying retry-worthy scenarios to setting retry limits and delays, below are key considerations to note when implementing retry mechanisms. 

Types of Failed Requests

Request failures can be classified into timeout errors and cases where the server returns an error. Let's look at both scenarios to identify which is retry-worthy.

Timed Out

Temporary issues, such as overloaded servers, network errors, slow internet connections, or momentary downtime, can result in timeout errors as the server takes too long to respond. In Axios, these failures are typically indicated by the `ECONNABORTED` error code, which means the connection was aborted due to a timeout.

These scenarios are retry-worthy as resending your requests will often eliminate the error. However, doing that manually is inefficient when you can implement a logic to handle it for you. A retry strategy can involve setting timeouts to fail fast and retry, ensuring your application doesn't wait indefinitely for a response. 
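As a minimal sketch, a timeout could be recognized by inspecting the error's `code` property. The helper name `isTimeoutError` is hypothetical, and treating `ETIMEDOUT` as a timeout alongside `ECONNABORTED` is an assumption about the error codes your HTTP client reports.

```javascript
// Hypothetical helper: decides whether an error looks like a timeout.
// Assumes an Axios-style error object that exposes a `code` property.
const TIMEOUT_CODES = new Set(['ECONNABORTED', 'ETIMEDOUT']);

function isTimeoutError(error) {
  return Boolean(error) && TIMEOUT_CODES.has(error.code);
}

// Axios accepts a `timeout` option in milliseconds, for example:
//   axios.get(url, { timeout: 5000 })
// so a slow request fails fast and can be handed to the retry logic.
console.log(isTimeoutError({ code: 'ECONNABORTED' }));    // true
console.log(isTimeoutError({ code: 'ERR_BAD_REQUEST' })); // false
```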

Returned Error

When a server encounters an issue or is incapable of fulfilling a request, it responds with an HTTP status code and/or error message to indicate the nature of the problem. Understanding these codes enables you to identify whether they're due to client errors, such as invalid requests/authentication or temporary server issues. 

If your error code indicates an issue that's more than likely to resolve itself, you can implement a NodeJS retry mechanism to fix it. 

Let's look at some common status codes and what they mean.

Error Status Codes

HTTP error status codes provide a standard way to convey the outcome of a request. They typically fall in the 4xx and 5xx ranges. 4xx codes represent client errors, while those in the 5xx range indicate server-related problems. 

Below are some of the most common ones you're likely to encounter when web scraping:

  • 403 Forbidden: The server understands the request but won't fulfill it because the client does not have the necessary permissions to access the requested resource. See our guide about resolving 403 errors when web scraping.
  • 429 Too Many Requests: The client has exceeded the rate limit imposed by the server.
  • 500 Internal Server Error: A generic server error message indicating an unexpected condition on the server.
  • 502 Bad Gateway: While acting as a gateway or proxy, the server received an invalid response from an upstream server.
  • 503 Service Unavailable: The server can't handle the request at the time. This commonly occurs during maintenance or temporary overloads.
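The codes above could be split into retry-worthy and non-retry-worthy groups along these lines. This is only a sketch: the helper name `isRetryableStatus` is hypothetical, and which codes belong in the set is an assumption you should adjust per target site. 403 is left out because repeating the same blocked request rarely helps.

```javascript
// Hypothetical classification of the status codes listed above.
// The choice of retryable codes is an assumption, not a fixed rule.
const RETRYABLE_STATUS_CODES = new Set([429, 500, 502, 503]);

function isRetryableStatus(status) {
  return RETRYABLE_STATUS_CODES.has(status);
}

console.log(isRetryableStatus(503)); // true
console.log(isRetryableStatus(403)); // false
```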

Number of Retries

Another notable consideration is why and when to use retry limits. Setting a maximum number of retries is a common best practice. It prevents your application from entering an infinite loop of retry attempts that would cause performance and blocking issues.

However, deciding on the optimal number of retries is a complex task, as there is no universally defined 'best' number. Instead, it requires a consideration of various factors, such as the type of failed requests and response time. 

For instance, when dealing with timeout errors, the number of retries should align with the typical response time of the requests. As a general guideline, a reasonable starting point is often three to five retries.
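To illustrate what a retry limit looks like in code, here is a small sketch of a bounded retry loop. The helper `retryWithLimit` and the stand-in `flakyTask` are hypothetical names, and no delay is applied between attempts here, since the sketch only shows the limit itself.

```javascript
// Hypothetical helper: runs an async task, retrying up to `maxRetries` times.
// No delay is applied; this sketch only illustrates the retry limit.
async function retryWithLimit(task, maxRetries = 3) {
  let lastError;
  // One initial attempt plus up to `maxRetries` retries.
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return await task(attempt);
    } catch (error) {
      lastError = error;
    }
  }
  // Every attempt failed, so surface the last error.
  throw lastError;
}

// Usage with a stand-in task that fails twice before succeeding.
let calls = 0;
const flakyTask = async () => {
  calls += 1;
  if (calls < 3) throw new Error(`attempt ${calls} failed`);
  return 'ok';
};

retryWithLimit(flakyTask, 3).then((value) => console.log(value, calls)); // ok 3
```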

Delay

Setting delays between retries is one way to avoid potential issues such as rate limiting, server overload, and triggering anti-bot mechanisms. Also, delays allow the server time to fix itself when dealing with transient server errors, increasing the chance of success on a retry. 

A best practice in implementing delays involves adopting an exponential backoff strategy, where the delay gradually increases with subsequent retries. This measured approach helps prevent overwhelming the server with too many requests too soon. You'll learn how to implement this strategy in the next section.

Tutorial: How to Retry in NodeJS

How you manage retries depends on the HTTP client used, as they have varying levels of built-in support for retry logic and may require different approaches. 

For this tutorial, we'll make HTTP requests using Axios and implement retry strategies with the NodeJS Retry module, an intuitive API for exponential retry strategies with configurable retry settings.

Ready? Let's dive in.

Step 1: Initial Setup

We'll be retrying a potentially failing HTTP request, so start by installing Axios and the NodeJS retry library.

Terminal
npm install retry axios

Next, import the necessary modules (Retry and Axios). Then, define a function that contains the logic for making fault-tolerant HTTP requests.

program.js
const axios = require('axios');
const retry = require('retry');
 
function faultTolerantHttpRequest(URL, cb) {
  // retry logic will be implemented here
}

Your function should take two arguments:

  • URL (type: string): The URL to query.
  • cb (type: function): The callback function to handle the outcome.

Step 2: Initialize Retry Operation with Custom Settings

Initialize a new RetryOperation object within your function with custom settings. 

program.js
//..
    // Initialize a retry operation with custom settings
    const operation = retry.operation({
        retries: 5,            // Maximum number of retry attempts
        factor: 2,             // Exponential backoff factor
        minTimeout: 1000,      // Minimum timeout (in milliseconds)
        maxTimeout: 60 * 1000, // Maximum timeout (in milliseconds)
        randomize: true,       // Randomize the timeouts
    });

In the code snippet above, the retry options include the following settings:

  • retries specifies the maximum number of retry attempts.
  • factor sets the exponential backoff factor, determining the time delay between retry attempts. This will be further explained in subsequent steps.
  • minTimeout defines the minimum timeout between retries in milliseconds.
  • maxTimeout sets the maximum timeout between retries, preventing excessively long delays.
  • randomize adds randomness to the timeouts, preventing synchronized retries and reducing the load on the server.

The Retry API offers numerous settings to enable you to implement a logic that aligns with your application's requirements. Check out the documentation for more details. 

Tailoring these settings to the specific needs of your application is crucial for achieving an effective and efficient retry mechanism. So, adjust the values based on factors like expected response times and the nature of potential failures.

Step 3: Exponential Backoff Strategy

Exponential backoff is a retry strategy in which the delays between successive retry attempts increase exponentially. This means that each retry waits longer than the previous one. The core idea of this approach is to reduce server load and increase success probability over time.

In the code snippet in step 2, the factor parameter is set to 2, indicating that each retry attempt will wait approximately twice as long as the previous one. This strategy allows for a more resilient and fault-tolerant system.
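To make the doubling concrete, the sketch below computes the delay schedule for the step 2 settings. It follows the formula given in the Retry module's README, `Math.min(random * minTimeout * Math.pow(factor, attempt), maxTimeout)`. The helper `backoffDelays` is a hypothetical name used for illustration and is not part of the library.

```javascript
// Hypothetical helper that mirrors the formula from the retry module's README:
//   Math.min(random * minTimeout * Math.pow(factor, attempt), maxTimeout)
function backoffDelays({ retries, factor, minTimeout, maxTimeout, randomize = false }) {
  const delays = [];
  for (let attempt = 0; attempt < retries; attempt++) {
    // With randomize enabled, each delay is scaled by a random factor between 1 and 2.
    const random = randomize ? Math.random() + 1 : 1;
    const delay = Math.round(random * minTimeout * Math.pow(factor, attempt));
    // Never wait longer than maxTimeout.
    delays.push(Math.min(delay, maxTimeout));
  }
  return delays;
}

console.log(backoffDelays({ retries: 5, factor: 2, minTimeout: 1000, maxTimeout: 60 * 1000 }));
// [ 1000, 2000, 4000, 8000, 16000 ]
```

With randomize set to true, as in step 2, each delay would fall between the value shown and twice that value, capped at maxTimeout.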

Step 4: Defining the Retry Attempt

Now that we've introduced the exponential backoff strategy, it's time to define and execute the operation that will be retried in the event of a transient error. 

Start by calling Retry's operation.attempt() function. 

program.js
//..
 
    operation.attempt(function(currentAttempt) {
    // Retry attempt logic goes here
 
    });

This function runs your callback right away and runs it again each time a retry is scheduled. The callback receives currentAttempt, the number of attempts made so far.

Then, within the above function, define your operation and handle the errors. 

For example, we'll make a GET request to an HTTPBin error page. If the request succeeds, pass the response data to the callback.

Next, catch possible errors and check whether a retry is allowed using operation.retry(error). If it returns true, a retry has been scheduled, so return early.

If all retry attempts are exhausted, invoke the callback with the main error, which operation.mainError() returns. This indicates that the operation could not be completed after the specified number of retries.

program.js
//..
 
    operation.attempt(function(currentAttempt) {
        // Make an HTTP GET request to the given URL
        axios.get(URL)
            .then((response) => {
                // If the request succeeds, pass the response body to the callback
                cb(null, response.data);
            })
            .catch((error) => {
                // If there's an error, check whether a retry is allowed
                if (operation.retry(error)) {
                    return;
                }
 
                // If all retry attempts are exhausted, pass the main error to the callback
                cb(operation.mainError());
            });
    });

The operation.retry(error) method handles the retry decision based on the defined custom settings. If a retry is allowed, it schedules the next attempt and returns true, so the handler returns early. Each retry is scheduled with an increasing timeout according to the exponential backoff factor in the RetryOperation object.

Step 5: Invoke the Function

Lastly, invoke the faultTolerantHttpRequest() function with the necessary parameters. Recall that this function takes two arguments: the URL and the callback function.

program.js
faultTolerantHttpRequest('https://httpbin.io/status/500', function(error, result) {
  if (error) {
    // Handle the error
    console.error('Error:', error);
  } else {
    // Process the successful result
    console.log('Result:', result);
  }
});

The callback function takes two parameters: error and result. It checks for errors and handles them accordingly. If there's no error, it processes the successful result.
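Because the callback follows Node's error-first convention, a function with this signature can also be wrapped with `util.promisify` if you prefer Promises. The sketch below uses a stand-in named `fakeFaultTolerantRequest` (a hypothetical placeholder with the same (URL, cb) signature that performs no network request) so it runs on its own.

```javascript
const { promisify } = require('util');

// Stand-in with the same (URL, cb) signature as faultTolerantHttpRequest.
// It is a placeholder for illustration and performs no network request.
function fakeFaultTolerantRequest(URL, cb) {
  setImmediate(() => cb(null, `response from ${URL}`));
}

// promisify turns an error-first callback function into one that returns a Promise.
const requestAsync = promisify(fakeFaultTolerantRequest);

requestAsync('https://httpbin.io/status/500')
  .then((result) => console.log('Result:', result))
  .catch((error) => console.error('Error:', error));
```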

Putting all the steps together, you have the following complete code.

program.js
const axios = require('axios');
const retry = require('retry');
 
function faultTolerantHttpRequest(URL, cb) {
    // Initialize a retry operation with custom settings
    const operation = retry.operation({
        retries: 5,            // Maximum number of retry attempts
        factor: 2,             // Exponential backoff factor
        minTimeout: 1000,      // Minimum timeout (in milliseconds)
        maxTimeout: 60 * 1000, // Maximum timeout (in milliseconds)
        randomize: true,       // Randomize the timeouts
    });
 
    operation.attempt(function(currentAttempt) {
        // Make an HTTP GET request to the given URL
        axios.get(URL)
            .then((response) => {
                // If the request succeeds, pass the response body to the callback
                cb(null, response.data);
            })
            .catch((error) => {
                // If there's an error, check whether a retry is allowed
                if (operation.retry(error)) {
                    return;
                }
 
                // If all retry attempts are exhausted, pass the main error to the callback
                cb(operation.mainError());
            });
    });
}
 
faultTolerantHttpRequest('https://httpbin.io/status/500', function(error, result) {
  if (error) {
    // Handle the error
    console.error('Error:', error);
  } else {
    // Process the successful result
    console.log('Result:', result);
  }
});

Other Libraries to Retry in NodeJS

We've seen how to implement retry mechanisms using the NodeJS Retry module. Let's explore alternatives that may suit other use cases or requirements.

Retry with HTTP Client Axios 

The Axios-retry plugin provides an easy way to retry failed Axios requests with interceptors. It offers a preconfigured interceptor that handles retries automatically based on predefined configurations.

For a step-by-step tutorial on this approach, check out our guide on retrying failed requests with Axios.

Retry with HTTP Client Fetch 

The Fetch API also allows you to implement a custom retry mechanism by creating a function that validates fetch requests and incorporates a retry logic accordingly. Depending on your project needs, you can include max retries and delays after each attempt. 
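A custom Fetch retry function might look like the sketch below. The names `fetchWithRetry`, `waitFor`, and `stubFetch` are hypothetical. The `fetchImpl` parameter defaults to the global fetch available in Node 18+ and exists only so the example can run without network access. For brevity, it retries every non-2xx response with a fixed delay, which is an assumption you would likely refine.

```javascript
// Hypothetical helper: retries a fetch call with a fixed delay between attempts.
const waitFor = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function fetchWithRetry(url, { maxRetries = 3, delayMs = 1000, fetchImpl = globalThis.fetch } = {}) {
  let lastError;
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      const response = await fetchImpl(url);
      // fetch only rejects on network failures, so HTTP errors are checked explicitly.
      if (response.ok) return response;
      lastError = new Error(`HTTP ${response.status}`);
    } catch (error) {
      lastError = error;
    }
    // Wait before the next attempt, but not after the final one.
    if (attempt < maxRetries) await waitFor(delayMs);
  }
  throw lastError;
}

// Usage with a stub that returns 503 twice and then 200.
let stubCalls = 0;
const stubFetch = async () => {
  stubCalls += 1;
  return stubCalls < 3 ? { ok: false, status: 503 } : { ok: true, status: 200 };
};

fetchWithRetry('https://httpbin.io/status/500', { delayMs: 10, fetchImpl: stubFetch })
  .then((response) => console.log(response.status, stubCalls)); // 200 3
```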

Check out the Fetch API documentation for more information.

Retry with HTTP Client Got 

As with Fetch, you can implement a custom retry mechanism with Got by making requests in a loop that runs up to "number of retries" times and handling errors on each iteration. Got also ships with a built-in retry option.

Refer to its documentation to learn more.

Avoid Getting Blocked with Retry in NodeJS

While handling transient errors is essential, getting blocked by anti-bot systems remains the biggest web scraping challenge. Websites use various techniques to identify web clients and block non-browser requests. So, even with retry logic in place, your requests might still be denied access.

Let's try to scrape a G2 product review page, which uses advanced Cloudflare protection. To do this, change the URL argument of the faultTolerantHttpRequest() call.

program.js
//..
 
faultTolerantHttpRequest('https://www.g2.com/products/notion/reviews', function(error, result) {
  if (error) {
    // Handle the error
    console.error('Error:', error);
  } else {
    // Process the successful result
    console.log('Result:', result);
  }
});

If you run the code, you'll get an error message like the one below.

Output
<!-- // ... -->
   <div class="cf-main-wrapper" role="main">
      <div class="cf-header cf-section">
         <div class="cf-error-title">
            <h1>Access denied</h1>
            <span class="cf-code-label">Error code <span>1020</span></span>
         </div>
         <div class="cf-error-description">
            <p>You do not have access to www.g2.com.</p><p>The site owner may have set restrictions that prevent you from accessing the site.</p>
         </div>
      </div>
   </div>
 
<!-- // ... -->

One popular quick fix for scenarios like this is using web scraping proxies to hide your IP address. While this might work for some cases, it isn't enough for many websites, especially those using advanced anti-bot measures.

Luckily, web scraping APIs like ZenRows offer everything you need to avoid getting blocked, including premium proxies, rotating headers, and much more. Also, ZenRows easily integrates with NodeJS, and you can leverage all its functionalities from a single API call.

Let's scrape the same G2 product page using ZenRows. 

Start by signing up for free, and you'll get to the Request Builder page. 

Paste your target URL (https://www.g2.com/products/notion/reviews), turn on the AI Anti-bot boost mode, enable premium proxies, and choose Node.js, like in the image below.

ZenRows Request Builder Page

On the right of your dashboard, you'll see a ready-to-use scraper code that uses Axios, although you can use any NodeJS HTTP client of your choice. 

program.js
// npm install axios
import axios from 'axios';
 
const url = 'https://www.g2.com/products/notion/reviews';
const apikey = '<YOUR_ZENROWS_API_KEY>';
axios({
    url: 'https://api.zenrows.com/v1/',
    method: 'GET',
    params: {
        'url': url,
        'apikey': apikey,
        'js_render': 'true',
        'antibot': 'true',
        'premium_proxy': 'true',
    },
})
    .then(response => console.log(response.data))
    .catch(error => console.log(error));

Run this code, and you'll get the page's HTML content.

Output
<!DOCTYPE html>
 
#...
 
<title>Notion Reviews 2023: Details, Pricing, &amp; Features | G2</title>
 
#...

Awesome, right? That's how easy it is to bypass anti-bot systems with ZenRows.

Conclusion 

Implementing retry mechanisms is a valuable technique for handling transient errors in NodeJS. Moreover, introducing delays between retries reduces server load and increases success probability over time. That's where the exponential backoff strategy plays a key role.

However, despite the reliability of your retry logic, you might still get blocked by anti-bot systems. Consider adding ZenRows to your workflow to handle HTTP requests better and scrape without getting blocked. Sign up to try ZenRows for free.


