How to Set Up a Proxy in Got: Tutorial [2024]

June 12, 2024 · 6 min read

Got is a popular JavaScript library for making HTTP requests in Node.js and a great tool for web scraping tasks. But despite all its useful features, it doesn't solve one of the most significant web scraping challenges: getting blocked by websites' anti-bot protection systems.

In this article, you'll learn how to overcome this hurdle by setting up a Got proxy. You’ll go through a step-by-step process of using proxies for web scraping, and learn which proxies bring the best results.

Let’s go!

Step 1: Set up a Proxy With Got

Before setting up a proxy with Got, let's start with a basic script to which you'll add proxy configuration.

This script makes a straightforward HTTP request to https://httpbin.io/ip, a web service that returns the IP address of the requesting client.

scraper.js
import got from 'got';
 
try {
    // make HTTP request 
    const {body} = await got('https://httpbin.io/ip');
    // log the response
    console.log(body);
} catch (error) {
    console.error(error);
}

Once you have everything set up, you're ready to configure your proxy.

However, Got doesn't support proxies out of the box. You have to pair it with an agent that adds proxy support to HTTP requests in Node.js applications.

There are several Node.js proxy agents, but in this tutorial, we'll use hpagent, one of the most reliable options. hpagent accepts the same parameters as Node.js's core HTTP(S) agents, plus a proxy option that lets you specify your proxy URL.

To set up a proxy with Got, configure hpagent with the necessary parameters (proxy URL) and make your request using the configured agent.

Start by installing hpagent using the following command.

Terminal
npm install hpagent

After that, import the required libraries. Then, define your proxy configuration options using hpagent. This involves creating an HttpsProxyAgent instance and specifying your proxy URL along with any other desired settings.

You can grab a free proxy from the Free Proxy List and construct a URL using the following format: <PROXY_PROTOCOL>://<PROXY_IP_ADDRESS>:<PROXY_PORT>. You should also use HTTPS proxies since they work with both HTTPS and HTTP websites.
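For illustration, here's how such a URL can be assembled from its parts (the protocol, IP, and port below are placeholder values, not a guaranteed working proxy):

```javascript
// placeholder proxy details -- substitute your own
const protocol = 'http';
const ip = '20.219.180.149';
const port = 3129;

// build the proxy URL in the <PROXY_PROTOCOL>://<PROXY_IP_ADDRESS>:<PROXY_PORT> format
const proxyUrl = `${protocol}://${ip}:${port}`;
console.log(proxyUrl); // http://20.219.180.149:3129
```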

scraper.js
// import the required modules
import got from 'got';
import {HttpsProxyAgent} from 'hpagent';
 
const proxyOptions = {
    agent: {
        // create a new HttpsProxyAgent instance
        https: new HttpsProxyAgent({
            // add proxy settings
            keepAlive: true,
            keepAliveMsecs: 1000,
            maxSockets: 256,
            maxFreeSockets: 256,
            scheduling: 'lifo',
            // specify proxy URL.
            proxy: 'http://20.219.180.149:3129'
        })
    }
};

Lastly, make your request using the configured options.

scraper.js
//...
 
try {
    // make HTTP request
    const {body} = await got('https://httpbin.io/ip', proxyOptions);
    // log the response
    console.log(body);
} catch (error) {
    console.error(error);
}

Combine everything. Your complete code should look like this:

scraper.js
// import the required modules
import got from 'got';
import {HttpsProxyAgent} from 'hpagent';
 
const proxyOptions = {
    agent: {
        // create a new HttpsProxyAgent instance
        https: new HttpsProxyAgent({
            // add proxy settings
            keepAlive: true,
            keepAliveMsecs: 1000,
            maxSockets: 256,
            maxFreeSockets: 256,
            scheduling: 'lifo',
            // specify proxy URL.
            proxy: 'http://20.219.180.149:3129'
        })
    }
};
 
try {
    // make HTTP request
    const {body} = await got('https://httpbin.io/ip', proxyOptions);
    // log the response
    console.log(body);
} catch (error) {
    console.error(error);
}

Run it, and the result should be your proxy's IP address.

Output
{
  "origin": "20.219.180.149:45310"
}

Well done!

Proxy Authentication

To authenticate a Got proxy, specify your proxy URL using the following format: <PROXY_PROTOCOL>://<YOUR_USERNAME>:<YOUR_PASSWORD>@<PROXY_IP_ADDRESS>:<PROXY_PORT>.
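One caveat: if your username or password contains special characters (such as @ or :), percent-encode them before placing them in the URL, or the URL will be parsed incorrectly. A minimal sketch with placeholder credentials:

```javascript
// placeholder credentials -- substitute your own
const username = 'user@example.com';
const password = 'p:ss@word';

// percent-encode so '@' and ':' in credentials don't break URL parsing
const auth = `${encodeURIComponent(username)}:${encodeURIComponent(password)}`;
const proxyUrl = `http://${auth}@20.219.180.149:3129`;
console.log(proxyUrl);
// http://user%40example.com:p%3Ass%40word@20.219.180.149:3129
```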

Here's how to modify the previous code to authenticate it.

scraper.js
// import the required modules
import got from 'got';
import {HttpsProxyAgent} from 'hpagent';
 
const proxyOptions = {
    agent: {
        // create a new HttpsProxyAgent instance
        https: new HttpsProxyAgent({
            // add proxy settings
            keepAlive: true,
            keepAliveMsecs: 1000,
            maxSockets: 256,
            maxFreeSockets: 256,
            scheduling: 'lifo',
            // specify proxy URL.
            proxy: 'http://<YOUR_USERNAME>:<YOUR_PASSWORD>@20.219.180.149:3129'
        })
    }
};
 
try {
    // make HTTP request
    const {body} = await got('https://httpbin.io/ip', proxyOptions);
    // log the response
    console.log(body);
} catch (error) {
    console.error(error);
}

Step 2: Use Rotating Proxies With Got

Websites often flag "excessive requests" as suspicious activity and can block your proxy. To avoid this issue, you must rotate between multiple proxies. This way, your requests appear to originate from different users or devices, increasing your chances of avoiding detection.

Let's see how.

Rotating IPs From a Free Proxy List

To rotate proxies with Got, specify a proxy pool and randomly select a different proxy for each request. Follow the steps below to achieve this.

Change your single proxy to a proxy list, like the one below.

scraper.js
// import the required modules
import got from 'got';
import {HttpsProxyAgent} from 'hpagent';
 
// define a list of proxy URLs
const proxyList = [
  'http://20.219.180.149:3129',
  'http://198.199.70.20:31028',
  'http://8.219.97.248:80',
  // add more proxy URLs as needed
];

Next, create a function that randomly selects a proxy from the list using Math.random().

scraper.js
//...
 
// function to select a random proxy from the list
function getRandomProxy() {
  const randomIndex = Math.floor(Math.random() * proxyList.length);
  return proxyList[randomIndex];
}

In your hpagent configuration, set the proxy option to the value returned by getRandomProxy(). This assigns a random proxy from the list.

scraper.js
//...
const proxyOptions = {
    agent: {
        // create a new HttpsProxyAgent instance
        https: new HttpsProxyAgent({
            // add proxy settings
            keepAlive: true,
            keepAliveMsecs: 1000,
            maxSockets: 256,
            maxFreeSockets: 256,
            scheduling: 'lifo',
            // specify proxy URL.
            proxy: getRandomProxy()
        })
    }
};

Lastly, make your request using the configured options, like in the initial proxy script.

Your final code should look like this:

scraper.js
// import the required modules
import got from 'got';
import {HttpsProxyAgent} from 'hpagent';
 
// define a list of proxy URLs
const proxyList = [
  'http://20.219.180.149:3129',
  'http://198.199.70.20:31028',
  'http://8.219.97.248:80',
  // add more proxy URLs as needed
];
 
// function to select a random proxy from the list
function getRandomProxy() {
  const randomIndex = Math.floor(Math.random() * proxyList.length);
  return proxyList[randomIndex];
}
 
const proxyOptions = {
    agent: {
        // create a new HttpsProxyAgent instance
        https: new HttpsProxyAgent({
            // add proxy settings
            keepAlive: true,
            keepAliveMsecs: 1000,
            maxSockets: 256,
            maxFreeSockets: 256,
            scheduling: 'lifo',
            // specify proxy URL.
            proxy: getRandomProxy()
        })
    }
};
 
try {
    // make HTTP request
    const {body} = await got('https://httpbin.io/ip', proxyOptions);
    // log the response
    console.log(body);
} catch (error) {
    console.error(error);
}

To verify that it works, run the script several times. Since getRandomProxy() is called once per run, each run goes through a single randomly selected proxy, so you should see a different IP address across runs. Here are the results of two runs:

Output
{
  "origin": "8.219.64.236:1416"
}
 
{
  "origin": "20.219.180.149:45310"
}

Nice job!

However, while this approach may work in a tutorial, it will likely fail for most real-world use cases, especially if you plan to scrape at scale.

See for yourself. Try to scrape the G2 review page below using the Got proxy script.

G2 Review Page
scraper.js
// import the required modules
import got from 'got';
import {HttpsProxyAgent} from 'hpagent';
 
// define a list of proxy URLs
const proxyList = [
  'http://20.219.180.149:3129',
  'http://198.199.70.20:31028',
  'http://8.219.97.248:80',
  // add more proxy URLs as needed
];
 
// function to select a random proxy from the list
function getRandomProxy() {
  const randomIndex = Math.floor(Math.random() * proxyList.length);
  return proxyList[randomIndex];
}
 
const proxyOptions = {
    agent: {
        // create a new HttpsProxyAgent instance
        https: new HttpsProxyAgent({
            // add proxy settings
            keepAlive: true,
            keepAliveMsecs: 1000,
            maxSockets: 256,
            maxFreeSockets: 256,
            scheduling: 'lifo',
            // specify proxy URL.
            proxy: getRandomProxy()
        })
    }
};
 
try {
    // make HTTP request
    const {body} = await got('https://www.g2.com/products/visual-studio/reviews', proxyOptions);
    // log the response
    console.log(body);
} catch (error) {
    console.error(error);
}

You'll get an error message similar to the one below.

Output
HTTPError: Response code 403 (Forbidden)

As you can see, free proxies can't help you avoid being blocked by advanced anti-bot measures. To address this, let's explore premium proxies.

Using Premium Proxies

Premium proxies offer a significant advantage over free ones: they're far more reliable and typically come with automatic rotation. Additionally, premium services like ZenRows include full anti-bot bypass features, enabling you to scrape without getting blocked.

To learn more about premium proxies, check out our list of the best web scraping proxy services.

In the meantime, let's see how premium proxies work, using ZenRows as an example.

To use ZenRows, sign up for a free trial. You'll be directed to the Request Builder page.

Paste your target URL, select the JavaScript Rendering mode, and check the box for Premium Proxies to rotate proxies automatically. Select Node.js as the language, and it'll generate your request code on the right.

ZenRows Request Builder

You'll see that the Axios library is suggested, but you can switch to Got: just replace the Axios import with Got and use Got to make the request to the ZenRows API endpoint.

Your new script should look like this:

scraper.js
// import the required module
import got from 'got';
 
const url = 'https://www.g2.com/products/visual-studio/reviews';
const apikey = '<YOUR_ZENROWS_API_KEY>';
 
(async () => {
    try {
        const {body} = await got('https://api.zenrows.com/v1/', {
            searchParams: {
                url: url,
                apikey: apikey,
                js_render: 'true',
                premium_proxy: 'true',
            }
        });
        console.log(body);
    } catch (error) {
        // error.response is undefined if the request never received a response
        console.error('Error:', error.response?.body ?? error.message);
    }
})();

Run it, and you'll get the page's HTML content.

Output
<title>Visual Studio Reviews 2024: Details, Pricing, &amp; Features | G2</title>
 
//...

Awesome, right? ZenRows makes scraping easy.

Conclusion

Setting up a Got proxy can help you hide your IP address and avoid direct IP bans. However, remember that free proxies rarely work in real-world cases. You need premium proxies, such as those offered by ZenRows, for web scraping at scale. You'll save yourself the trouble of finding and configuring proxies and confidently bypass any anti-bot measures thrown your way.

Ready to get started?

Up to 1,000 URLs for free are waiting for you