
Axios Pagination: How to Scrape Multiple Pages

Idowu Omisola
January 31, 2025 · 3 min read

Do you want to leverage the power of Axios to scrape paginated websites? You're in the right place.

In this article, you'll learn how the different pagination types work and how to extract data from multiple pages using Axios in Node.js.

Let's get started!

Scrape With a Navigation Bar In Axios

Websites with a navigation bar require users to click a page number or a next button to move between pages. To scrape this pagination type, you can write your Axios web scraping logic to follow the next page link repeatedly until you reach the last page.

An example of such a site is this E-commerce Challenge page, which distributes product data over 12 pages:

Navigation bar demo

To build your Axios pagination scraper, you'll extract product names, prices, and image URLs from all the pages of the above website. Let's begin with a basic scraper that extracts the target product data from the first page.


Create a scraper function that requests the target website and extracts the product container elements using Cheerio. Iterate through the product containers with Cheerio's each() method and extract the product data:

scraper.js
// npm install axios cheerio
const axios = require('axios');
const cheerio = require('cheerio');

const url = 'https://www.scrapingcourse.com/ecommerce/';

async function scraper(url) {
    // initialize a product array to collect scraper data
    const products = [];

    try {
        // request to the target URL
        const response = await axios.get(url);

        // validate the request
        if (response.status !== 200) {
            console.error(`Request failed with status ${response.status}`);
            return products;
        }

        // parse the HTML content
        const $ = cheerio.load(response.data);

        // obtain the main product cards
        const productCards = $('.product');

        // iterate through product cards to retrieve names, prices, and image URLs
        productCards.each((index, product) => {
            const productInfo = {
                Name: $(product).find('.product-name').text().trim(),
                Price: $(product).find('.product-price').text().trim(),
                'Image URL': $(product).find('.product-image').attr('src'),
            };
            products.push(productInfo);
        });
    } catch (error) {
        console.error(`Error occurred: ${error.message}`);
    }

    // return the collected product data
    return products;
}

// execute the scraper function and log the result
(async () => {
    const allProducts = await scraper(url);
    console.log(allProducts);
})();

The above code extracts product data from the first page only:

Output
[
    {
        Name: 'Abominable Hoodie',
        Price: '$69.00',
        'Image URL': 'https://www.scrapingcourse.com/ecommerce/wp-content/uploads/2024/03/mh09-blue_main.jpg',
    },
    //... 14 products omitted for brevity,
    {
        Name: 'Artemis Running Short',
        Price: '$45.00',
        'Image URL': 'https://www.scrapingcourse.com/ecommerce/wp-content/uploads/2024/03/wsh04-black_main.jpg',
    }
]

Your scraper now extracts data from the target site. However, it only scrapes the first product page because you haven't implemented pagination yet.

To ensure you get content from all pages, you'll need to modify the code to follow the next page link and extract the target data from each page. First, let's view the next page link element in the developer console:

scrapingcourse ecommerce homepage devtools

The next page link is an a tag with the classes next and page-numbers.

Now, update the previous scraper to extract the next page URL. Add logic that checks whether the next page element exists in the DOM and, if so, recursively calls the scraper function on the returned URL. The recursion stops once the next page element is no longer in the DOM. Finally, push the new data to the products array:

scraper.js
// ...

async function scraper(url) {
    // ...
    try {
        // ...

        // get the next page link
        const nextLink = $('.next').attr('href');

        // check if the next page exists and call the scraper function recursively if so
        if (nextLink) {
            // resolve the link against the current page URL in case it's relative
            const nextPageUrl = new URL(nextLink, url).href;
            // recursively call the scraper function on the next page
            const nextProducts = await scraper(nextPageUrl);
            // merge the results from the next page
            products.push(...nextProducts);
        }
    } catch (error) {
        // ...error handling
    }

    // ...
}

// ...

Merge the above snippet with the previous scraper, and you'll get the following complete code:

scraper.js
// npm install axios cheerio
const axios = require('axios');
const cheerio = require('cheerio');

const url = 'https://www.scrapingcourse.com/ecommerce/';

async function scraper(url) {
    // initialize a product array to collect scraped data
    const products = [];

    try {
        // request to the target URL
        const response = await axios.get(url);

        // validate the request
        if (response.status !== 200) {
            console.error(`Request failed with status ${response.status}`);
            return products;
        }

        // parse the HTML content
        const $ = cheerio.load(response.data);

        // obtain the main product cards
        const productCards = $('.product');

        // iterate through product cards to retrieve names, prices, and image URLs
        productCards.each((index, product) => {
            const productInfo = {
                Name: $(product).find('.product-name').text().trim(),
                Price: $(product).find('.product-price').text().trim(),
                'Image URL': $(product).find('.product-image').attr('src'),
            };
            products.push(productInfo);
        });

        // get the next page link
        const nextLink = $('.next').attr('href');

        // check if the next page exists and call the scraper function recursively if so
        if (nextLink) {
            // resolve the link against the current page URL in case it's relative
            const nextPageUrl = new URL(nextLink, url).href;
            // recursively call the scraper function on the next page
            const nextProducts = await scraper(nextPageUrl);
            // merge the results from the next page
            products.push(...nextProducts);
        }
    } catch (error) {
        console.error(`Error occurred: ${error.message}`);
    }

    // return the collected product data
    return products;
}

// execute the scraper function and log the result
(async () => {
    const allProducts = await scraper(url);
    console.log(allProducts);
})();

The above code now extracts product data from all 12 pages:

Output
[
    {
        Name: 'Abominable Hoodie',
        Price: '$69.00',
        'Image URL': 'https://www.scrapingcourse.com/ecommerce/wp-content/uploads/2024/03/mh09-blue_main.jpg',
    },
    //... 186 products omitted for brevity,
    {
        Name: 'Zoltan Gym Tee',
        Price: '$29.00',
        'Image URL': 'https://www.scrapingcourse.com/ecommerce/wp-content/uploads/2024/03/ms06-blue_main.jpg',
    }
]

Bravo! You just built your first JavaScript pagination scraper using Axios and Cheerio.
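
Note that recursion works fine here because the site has only 12 pages. If you expect much deeper pagination, an iterative loop achieves the same result without growing the call stack. Here's a minimal sketch of the same logic rewritten with a while loop (it reuses the axios and cheerio imports from the scraper above):

scraper.js
// iterative alternative to the recursive scraper
async function scrapeAllPages(startUrl) {
    const products = [];
    let nextPageUrl = startUrl;

    // keep requesting pages until no next page link is found
    while (nextPageUrl) {
        const response = await axios.get(nextPageUrl);
        const $ = cheerio.load(response.data);

        // extract product data from the current page
        $('.product').each((index, product) => {
            products.push({
                Name: $(product).find('.product-name').text().trim(),
                Price: $(product).find('.product-price').text().trim(),
                'Image URL': $(product).find('.product-image').attr('src'),
            });
        });

        // resolve the next page link, or end the loop if there is none
        const nextLink = $('.next').attr('href');
        nextPageUrl = nextLink ? new URL(nextLink, nextPageUrl).href : null;
    }

    return products;
}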

Scrape JavaScript-based Pagination In Axios

JavaScript-based pagination loads content dynamically using JavaScript. This design is common in single-page applications (SPAs). 

A website can implement JavaScript-based pagination using infinite scrolling or a dynamic load more button.

Infinite scrolling loads data automatically as the user scrolls down the page. An example is the Infinite Scrolling Challenge page. See a demo of how it renders content below:

Infinite Scroll Demo

With load more pagination, the user must click a button to load additional content. The Load More Challenge page below demonstrates how it works:


Standard HTTP clients like Axios aren't suitable for scraping these dynamic websites because they can't execute JavaScript. The most efficient way to scrape JavaScript-rendered content is to use a headless browser such as Puppeteer, Playwright, or Selenium. These tools provide browser automation features that simulate user interactions, including scrolling, clicking, hovering, and more.
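
To give you an idea, here's a minimal Puppeteer sketch that scrolls an infinite-scrolling page until no new content loads and then extracts the product names. The target URL and the .product-name selector are assumptions based on the challenge pages above, so adjust them to your target site:

scraper.js
// npm install puppeteer
const puppeteer = require('puppeteer');

(async () => {
    const browser = await puppeteer.launch();
    const page = await browser.newPage();

    // assumed URL of the infinite scrolling challenge page
    await page.goto('https://www.scrapingcourse.com/infinite-scrolling');

    // scroll until the page height stops growing
    let previousHeight = 0;
    while (true) {
        const currentHeight = await page.evaluate(() => document.body.scrollHeight);
        if (currentHeight === previousHeight) break;
        previousHeight = currentHeight;

        // scroll to the bottom and give new items time to render
        await page.evaluate(() => window.scrollTo(0, document.body.scrollHeight));
        await new Promise((resolve) => setTimeout(resolve, 1000));
    }

    // extract product names once all items have loaded
    const names = await page.$$eval('.product-name', (elements) =>
        elements.map((element) => element.textContent.trim())
    );
    console.log(names);

    await browser.close();
})();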

Avoid Getting Blocked While Scraping Multiple Pages With Axios

You can easily get blocked when scraping multiple pages, especially if the requests are sent too quickly or in a way that doesn't mimic human behavior. To avoid getting blocked when scraping paginated websites, you need to implement measures to bypass anti-bots.
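
For instance, a quick first step is to pause between page requests so your scraper doesn't hammer the server. Here's a sketch of how you could slot a randomized delay into the recursive scraper from the previous section (the 1-2 second range is an arbitrary choice, not a fixed rule):

scraper.js
// helper: pause for a given number of milliseconds
const delay = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// inside the scraper function, wait before following the next page link
if (nextLink) {
    const nextPageUrl = new URL(nextLink, url).href;
    // wait 1-2 seconds to mimic human pacing
    await delay(1000 + Math.random() * 1000);
    const nextProducts = await scraper(nextPageUrl);
    products.push(...nextProducts);
}

Pacing alone won't defeat dedicated anti-bot systems, though.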

For example, your scraper will get blocked with a website like the Antibot Challenge page. Try it out with the following code:

scraper.js
// npm install axios
const axios = require('axios');

const url = 'https://www.scrapingcourse.com/antibot-challenge';

async function scraper() {
    try {
        // request the target URL
        const response = await axios.get(url);

        // log the full-page HTML
        console.log('Page Content:', response.data);
    } catch (error) {
        // log any errors that occur
        console.error('An error occurred:', error.message);
    }
}

// execute the function
scraper();

The above scraper gets blocked with a 403 Forbidden error:

Output
An error occurred: Request failed with status code 403

You can reduce the chances of anti-bot detection by patching your scraper with recommended custom request headers to mimic a real browser. Another solution is to use web scraping proxies to avoid IP bans or implement manual tweaks like fingerprint spoofing.
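
As an illustration, here's how you might attach browser-like headers to the Axios request in the earlier scraper. The header values below are examples only; in practice, you'd rotate real, up-to-date browser values:

scraper.js
// inside the scraper function, replace the plain GET with a headered request
const response = await axios.get(url, {
    headers: {
        // example User-Agent string; rotate current, real-browser values
        'User-Agent':
            'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/126.0.0.0 Safari/537.36',
        Accept: 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
        'Accept-Language': 'en-US,en;q=0.9',
    },
});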

However, these techniques are usually unreliable as anti-bots evolve to spot patched bots accurately.

The most reliable way to scrape any website at scale without getting blocked is to use a web scraping solution, such as ZenRows' Universal Scraper API. ZenRows is user-friendly and lets you bypass even the most complex anti-bot systems with a few lines of code. It features premium proxy rotation, JavaScript rendering support, advanced fingerprint evasion, anti-bot auto-bypass, and more.

Let's see how ZenRows' Universal Scraper API works by scraping the Antibot Challenge page that blocked you previously.

Sign up and go to the ZenRows Request Builder. Then, paste the target URL in the link box and activate Premium Proxies and JS Rendering.

building a scraper with zenrows

Choose Node.js as your programming language and select the API connection mode. Copy the generated code into your scraper file. It should look like this:

scraper.js
// npm install axios
const axios = require('axios');

const url = 'https://www.scrapingcourse.com/antibot-challenge';
const apikey = '<YOUR_ZENROWS_API_KEY>';
axios({
    url: 'https://api.zenrows.com/v1/',
    method: 'GET',
    params: {
        url: url,
        apikey: apikey,
        js_render: 'true',
        premium_proxy: 'true',
    },
})
    .then((response) => console.log(response.data))
    .catch((error) => console.log(error));

The code outputs the protected website's full-page HTML as shown:

Output
<html lang="en">
<head>
    <!-- ... -->
    <title>Antibot Challenge - ScrapingCourse.com</title>
    <!-- ... -->
</head>
<body>
    <!-- ... -->
    <h2>
        You bypassed the Antibot challenge! :D
    </h2>
    <!-- other content omitted for brevity -->
</body>
</html>

Congratulations! 🎉 Your Axios pagination scraper just got a boost with ZenRows' anti-bot bypass feature.

Conclusion

You've seen the different pagination types and how to scrape them in Node.js using Axios and Cheerio.

However, keep in mind that you'll often encounter anti-bot systems while scraping multiple pages. To scrape any website without getting blocked, we recommend integrating ZenRows, a complete web scraping solution. 

Try ZenRows for free now!
