Handle the User Agent in NodeJS

July 20, 2023 · 4 min read

When scraping with NodeJS, a default or poorly formed User Agent is one of the most frequent reasons you get blocked. In this tutorial, you'll learn how to set a custom one and randomize it using Axios in NodeJS to keep your project running.

What Is the User-Agent in NodeJS?

The User-Agent header is a string sent with an HTTP request that identifies the client making it. In NodeJS, you can use an HTTP client library such as Axios to make requests.

Install Axios with the following command:

Terminal
npm install axios

To see your current UA, send a request to https://httpbin.org/headers and log the User-Agent header using Axios. You can copy the sample scraper below and save it in a file named index.js, for example.

index.js
const axios = require('axios');
 
// httpbin echoes the request headers back in the response body
axios.get('https://httpbin.org/headers')
  .then(response => {
    const headers = response.data.headers;
    console.log(headers['User-Agent']);
  })
  .catch(error => {
    console.log(error);
  });

To run it, use the node command followed by your file name.
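Terminal
node index.js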

Screenshot: the default Axios User Agent logged to the terminal

As you can see, the default NodeJS User Agent with Axios is axios/1.4.0 (the version number may differ depending on the Axios release you use). This default UA string is easily identifiable by website security measures and will likely get you blocked.

Let's learn how to change it!

Frustrated that your web scrapers are blocked time and again?
ZenRows API handles rotating proxies and headless browsers for you.
Try for FREE

How to Use a Custom User-Agent in NodeJS

To set a custom User Agent in NodeJS, follow this process:

The first step is to get a real sample; you can find some reliable ones in our list of UAs for web scraping. Grab a User Agent and put it in an object:

index.js
const headers = {
  'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/113.0.0.0 Safari/537.36'
};

You'll wrap this object in the request config, which is the second parameter of the Axios GET call:

index.js
axios.get('https://httpbin.org/headers', { headers })

So this is the complete code that sends a request to https://httpbin.org/headers with a custom User-Agent header and logs the User-Agent received in the response:

index.js
const axios = require('axios');
 
const headers = {
  'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/113.0.0.0 Safari/537.36'
};
 
axios.get('https://httpbin.org/headers', { headers })
  .then(response => {
    const headers = response.data.headers;
    console.log(headers['User-Agent']);
  })
  .catch(error => {
    console.log(error);
  });

After running the code, you should see an output similar to the one below:
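Terminal
Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/113.0.0.0 Safari/537.36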


Cool! You've just set a NodeJS fake User Agent.

However, a single User-Agent header isn't enough to avoid getting blocked: too many consecutive requests with the same one are a hint that you're a bot. That's why you should rotate UA headers to use a different one for each request.

Here's an example that sends a request to https://httpbin.org/headers with a randomly selected User Agent header. This code defines an array of User-Agents, chooses one randomly, and then uses a self-invoking async function to request the target website with the selected string.

index.js
const axios = require('axios');
const userAgents = [
  'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.0.0 Safari/537.36',
  'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.0.0 Safari/537.36',
  'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/108.0.0.0 Safari/537.36',
  'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/108.0.0.0 Safari/537.36',
  'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/108.0.0.0 Safari/537.36',
  'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/16.1 Safari/605.1.15',
  'Mozilla/5.0 (Macintosh; Intel Mac OS X 13_1) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/16.1 Safari/605.1.15'
];
 
(async function fetchWithRandomUserAgent() {
  // pick a random User Agent from the pool
  const randomIndex = Math.floor(Math.random() * userAgents.length);
  const randomUserAgent = userAgents[randomIndex];
 
  // attach it to the request headers
  const options = {
    headers: {
      'User-Agent': randomUserAgent
    }
  };
 
  try {
    const response = await axios.get('https://httpbin.org/headers', options);
    console.log(response.data);
  } catch (error) {
    console.error(error);
  }
})();

The script logs the full response data to the console, so you can confirm which User Agent was sent.
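Your output should look something like this (trimmed for brevity; the exact User Agent varies on each run):

Terminal
{
  headers: {
    Host: 'httpbin.org',
    'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/108.0.0.0 Safari/537.36',
    ...
  }
}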


How to Change the User-Agent in NodeJS at Scale

When web scraping, constructing a proper User Agent is crucial to avoid detection and blocking. A UA that carefully mimics a typical user keeps a low profile, making it much less likely for your scraper to be flagged as suspicious or automated.

Let's check one example of an incomplete and suspicious User Agent:

Mozilla/5.0 (X11; Linux) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/16.1 Safari/605.1.15

This User Agent has incomplete operating system information: it says the client uses the X11 window system on Linux but doesn't specify the distribution or the system architecture. Moreover, it claims to be Safari, which isn't available on Linux. Any anti-bot system could easily classify this uncommon User Agent as suspicious.
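For contrast, a well-formed Linux User Agent, like this one from the rotation list above, specifies the architecture and a browser that actually runs on the platform:

Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/108.0.0.0 Safari/537.36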

Additionally, you need to make sure that your UAs stay updated, since browser versions that are too old also stand out.

Then, besides User-Agent, you need to pay attention to other headers, such as Accept, Accept-Language, and Accept-Encoding, to simulate a real browser; see the sketch below.
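As an illustration, here's a minimal sketch of a fuller, browser-like header set for Axios. The values are examples of what a real Chrome browser might send, so adjust them to stay consistent with the User Agent you pick:

index.js
const headers = {
  'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/113.0.0.0 Safari/537.36',
  'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,*/*;q=0.8',
  'Accept-Language': 'en-US,en;q=0.9',
  'Accept-Encoding': 'gzip, deflate, br',
  'Referer': 'https://www.google.com/' // example referrer; use one that fits your crawl path
};

axios.get('https://httpbin.org/headers', { headers });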

Header configuration isn't the only hurdle you'll face when web scraping; there are other big challenges ahead. To learn more about them, check our guide on the anti-scraping techniques you need to know.

Is there an easy solution?

A web scraping API like ZenRows auto-rotates the User Agent in NodeJS, sets the whole header set for you, and deals with any anti-bot measures.

Sign up to get your free API key. You'll land on the Request Builder page:

Screenshot: the ZenRows Request Builder page

With the code on the right, you can start making requests to any webpage you want with User Agent auto-rotation. To verify it works, we used https://httpbin.io/user-agent as a target URL.

Screenshot: the auto-rotated User Agent returned by httpbin.io

As you can see, it's super easy to make requests with ZenRows and get real, working UAs. You can also enable its other anti-bot features; we usually recommend activating premium proxies, JavaScript rendering, and Anti-bot.
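For reference, a call through the API can look something like the sketch below. It assumes the endpoint and parameter names generated by the Request Builder, and YOUR_ZENROWS_API_KEY is a placeholder, so copy the exact snippet from your dashboard:

index.js
const axios = require('axios');

axios.get('https://api.zenrows.com/v1/', {
    params: {
      apikey: 'YOUR_ZENROWS_API_KEY', // placeholder: your own key from the dashboard
      url: 'https://httpbin.io/user-agent', // target page to scrape
      premium_proxy: 'true', // route through residential proxies
      js_render: 'true' // render JavaScript with a headless browser
    }
  })
  .then(response => console.log(response.data))
  .catch(error => console.error(error));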

Conclusion

Using different and well-formed User Agents in NodeJS for each request helps prevent you from being blocked by your target website. Additionally, you'll need to deal with other techniques to scrape at scale successfully, for which we recommend checking out our guide on the best methods for web scraping without getting blocked.

If you'd like to implement an easy solution that deals with all anti-bot measures, then you can sign up for free to ZenRows and try it out yourself.

Did you find the content helpful? Spread the word and share it on Twitter or LinkedIn.
