Web Crawling Webinar for Tech Teams

How to Scrape Amazon Reviews in 2025

Favour Kelvin
October 10, 2024 · 6 min read

Are you looking to scrape Amazon reviews? We've got you covered! 

Product reviews are essential for qualitative analysis as they provide real user feedback and insight into how products perform in real-world scenarios. In this article, you'll learn two methods for scraping Amazon reviews: using the ZenRows web scraping API, and using Python with the Requests library and BeautifulSoup.

For learning purposes, you'll scrape reviews from this Amazon product page and extract the following details:

  • Reviewer Name.
  • Review Title.
  • Review Text.
  • Rating.
  • Date of the Review.
  • Review Image URL.
Amazon Logitech Mouse Reviews

Before we begin, let's quickly understand Amazon's anti-scraping measures.

Understanding Amazon's Anti-Scraping Measures

Amazon employs several anti-scraping measures that can lead to challenges like blocked access and disrupted data collection. Here are some of its most common defenses.

CAPTCHAs

Amazon uses CAPTCHA to verify if a visitor is human or an automated script. It typically appears after repeated requests from the same IP address or when suspicious behavior is detected. CAPTCHAs pose a major challenge for scraping, as they completely halt the process unless bypassed.

Rate Limiting

Rate limiting restricts the number of requests that can be made to Amazon's servers within a specific time frame. If this limit is exceeded while scraping, Amazon may temporarily block the IP address or trigger a CAPTCHA for verification.
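One common way to work within rate limits is to retry throttled requests with an increasing delay. Here's a minimal sketch of exponential backoff using the Requests library; the status codes checked (429 and 503) and the delay values are illustrative assumptions, not documented Amazon thresholds:

```python
import random
import time

import requests


def fetch_with_backoff(url, max_retries=3, base_delay=2.0):
    """Fetch a URL, retrying with exponential backoff on throttling responses."""
    for attempt in range(max_retries):
        response = requests.get(url)
        # 429 (Too Many Requests) and 503 are typical throttling responses
        if response.status_code not in (429, 503):
            return response
        # wait 2s, 4s, 8s, ... plus random jitter before retrying
        delay = base_delay * (2 ** attempt) + random.uniform(0, 1)
        time.sleep(delay)
    return response
```

The jitter spreads retries out so that many clients throttled at once don't all retry at the same moment.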

IP Blocking

Amazon monitors incoming traffic and can block or throttle IP addresses that show suspicious patterns, such as sending numerous requests at the same time. Once an IP is flagged, it can prevent further access, making data extraction more challenging.

Due to these measures, relying solely on basic scraping methods may not be enough. It's important to consider using advanced web scraping tools that incorporate features like rotating proxies, headless browsers, and CAPTCHA-solving capabilities.

Next, we'll explore such solutions, with ZenRows being our top pick for scraping Amazon reviews.

Method #1: Scraping Amazon Reviews Using ZenRows Web Scraping API

The ZenRows web scraping API lets you scrape any website without getting blocked, automatically adapting to changes in anti-bot measures. With rotating proxies, headless browsers, automatic CAPTCHA-solving, and more, ZenRows covers all the features you need for effective scraping.

Here's what you can do with the ZenRows Amazon scraper:

  • Automatically bypass CAPTCHAs and other anti-bot mechanisms.
  • Extract accurate data in JSON format without any hassle.
  • Parse data from various Amazon pages, including product listings, search results, reviews, and more.
  • Auto-rotate proxies to avoid rate limiting and IP bans.
  • Access localized products in 185+ countries.

With just a single API call, ZenRows handles the scraping process for you. Let's try it on the target page to see how it can scrape Amazon reviews.

Step 1: Set up ZenRows

Sign up for ZenRows and access the Request Builder.

Paste the product URL into the link box. Activate Premium Proxies and JS Rendering to ensure the scraper works smoothly. Select Python as your programming language and choose API connection mode.

Step 2: Add the Review CSS Selectors

To scrape reviews, you'll first need to inspect the webpage and find the CSS selectors for the specific review details you want to extract (e.g., reviewer name, review title, and review text).

Open the target page in your browser. Right-click on the review details and select Inspect. The browser's Developer Tools will open, highlighting the HTML code for that element. You then need to locate the CSS selector that identifies the element you want to extract.

Amazon Reviews Name

We've already identified the CSS selectors for extracting Amazon review details. Below is how you can organize them in a JSON format:

Example
{
  "reviewer_name": "span.a-profile-name",
  "review_title": "a.review-title",
  "review_date": "span.review-date",
  "review_text": "span.review-text",
  "review_rating": "i.review-rating",
  "review_image": "img.review-image-tile @src"
}

Copy and paste this JSON in the CSS Selector tab as shown below:

building a scraper with zenrows

ZenRows automatically generates the Python code for you. Copy and paste the code into your Python file:

Example
# pip install requests
import requests

# set up the URL and API key
url = 'https://www.amazon.com/Logitech-G502-Performance-Gaming-Mouse/dp/B07GBZ4Q68/?th=1'
apikey = "<YOUR_ZENROWS_API_KEY>"

# parameters for the API request
params = {
    'url': url,
    'apikey': apikey,
    'css_extractor': """{
        "reviewer_name": "span.a-profile-name",
        "review_title": "a.review-title",
        "review_date": "span.review-date",
        "review_text": "span.review-text",
        "review_rating": "i.review-rating",
        "review_image": "img.review-image-tile @src"
    }"""
}

# make the API request
response = requests.get('https://api.zenrows.com/v1/', params=params)

# print the response
print(response.text)

The above script parses the review data and returns the extracted review details as JSON:

Output
{
    "review_date": [
        "Reviewed in the United States on March 14, 2021",
        "Reviewed in the United States on January 31, 2024",
        "Reviewed in the United States on July 24, 2024",
        // ... omitted for brevity
    ],
    "review_image": "https://images-na.ssl-images-amazon.com/images/G/01/x-locale/common/grey-pixel.gif",
    "review_rating": [
        "5.0 out of 5 stars",
        "5.0 out of 5 stars",
        "5.0 out of 5 stars",
        // ... omitted for brevity
    ],
    "review_text": [
        "There is a ton to like about this mouse from how it fits into your hand...",
        "This mouse has a great feel to it and it's very nice and smooth...",
        // ... omitted for brevity
    ],
    "review_title": [
        "5.0 out of 5 stars\n\nBest mouse hands down!",
        "5.0 out of 5 stars\n\nGreat productivity and flexible mouse that can be programmed...",
        // ... omitted for brevity
    ],
    "reviewer_name": [
        "Logitech G502 Hero Gaming Mouse Video",
        "Amazon Customer",
        // ... omitted for brevity
    ]
}
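Each key in this response holds a parallel list, with one entry per review. If you'd rather work with one record per review, you can zip the fields together. Here's a minimal sketch using an abbreviated sample of the response above (only the standard library is needed):

```python
import json

# abbreviated sample of the API response shown above
raw = json.loads("""{
    "reviewer_name": ["Amazon Customer", "Subnet"],
    "review_title": ["Best mouse hands down!", "Great mouse"],
    "review_rating": ["5.0 out of 5 stars", "5.0 out of 5 stars"]
}""")

# zip the parallel lists into one dictionary per review
fields = list(raw.keys())
records = [dict(zip(fields, values)) for values in zip(*raw.values())]

print(records[0])
```

Each record now groups one review's fields together, which is more convenient for filtering or exporting row by row.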

Congrats! You just parsed Amazon reviews automatically with ZenRows.

Now, let's look at the second method for scraping Amazon reviews.

Method #2: Scraping Amazon Reviews Using Python and BeautifulSoup

In this section, we'll use Python's Requests library to send HTTP requests and BeautifulSoup to extract the review details from the target Amazon product page. Before diving into the code, let's make sure everything is set up.

Step 1: Prerequisites

Ensure you have Python installed. If not, head over to the official Python download page and get the latest version.

Next, install BeautifulSoup and Requests using pip:

Terminal
pip3 install requests beautifulsoup4

We're using VS Code for this tutorial, but feel free to use any IDE you prefer.

Once your environment is ready, you're good to go!

Step 2: Access the Amazon Page

We'll start by making a basic request to fetch the full HTML of the target page. This ensures that your HTTP client can retrieve the website's content successfully.

Create a scraper.py file in your project directory and add the following code:

scraper.py
import requests
from bs4 import BeautifulSoup

# specify the target URL
target_url = "https://www.amazon.com/Logitech-G502-Performance-Gaming-Mouse/dp/B07GBZ4Q68/?th=1"

# send a get request to the target url
response = requests.get(target_url)

# check if the response status code is not 200 (OK)
if response.status_code != 200:
   # print an error message with the status code
   print(f"An error occurred with status {response.status_code}")
else:
   # get the page html content
   html_content = response.text
  
   # parse the html content using BeautifulSoup
   soup = BeautifulSoup(html_content, "html.parser")
  
   # print the parsed HTML in a readable format
   print(soup.prettify())

Running the above code outputs the full HTML, as shown below (some content omitted for brevity):

Output
<!DOCTYPE html>
<html lang="en-us" class="a-no-js" data-19ax5a9jf="dingo">
<head>
    <!-- ... -->
    <title>Amazon.com: Logitech G502 Performance Gaming Mouse</title>
    <!-- ... -->
</head>
<body>
    <!-- ... -->
</body>
</html>
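Note that Amazon often rejects requests that arrive with the default Requests User-Agent, as covered in the anti-scraping section above. If the request fails, a common first step is to send browser-like headers. This is a sketch only; the header values below are illustrative examples, and headers alone won't bypass every block:

```python
import requests

# browser-like headers; these values are examples only
HEADERS = {
    "User-Agent": (
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"
    ),
    "Accept-Language": "en-US,en;q=0.9",
}


def fetch_page(url):
    """Request a page with browser-like headers instead of the defaults."""
    return requests.get(url, headers=HEADERS)
```

You can drop `fetch_page(target_url)` in place of the plain `requests.get(target_url)` call.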

All set! Now, let's dive into the real task—scraping Amazon reviews!

Locate and Scrape Reviewer Names

To start, open the Amazon product page in your browser, right-click on a reviewer's name, and select Inspect to reveal its CSS selector.

You will see the reviewer's name inside a <span> tag with the class name a-profile-name.

Amazon Reviews Name

Next, use BeautifulSoup's find_all method to locate all elements with the class name. Loop through the found elements and, for each one, use strip to remove any whitespace. Create a reviews dictionary to store the cleaned reviewer names:

scraper.py
# ...
else:
   # find all elements with class name "a-profile-name"
   reviewer_names = soup.find_all("span", class_="a-profile-name")
   names_list = [name.text.strip() for name in reviewer_names]

   # create a dictionary to store the review details
   reviews = {
       "Reviewer Names": names_list
   }

   # print the dictionary
   print(reviews)

The above code outputs the reviewer names as shown:

Output
{'Reviewer Names': ['Logitech G502 Hero Gaming Mouse Video', 'Amazon Customer', 'Amazon Customer', 'Subnet', 'Craigc', 'Craigc', 'Justin Frattallone', 'Jules simard', 'Jules simard', 'Pedro Gavidia', 'Valceir', 'Valceir', 'Sami AL-Shammari', 'Tanuj Singh']}
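Notice that some names appear twice in a row: the page can render a reviewer's name in more than one element. If that's a problem for your data, one option is to collapse consecutive duplicates. This is a sketch based on the assumption that duplicates are always adjacent; be aware it would also merge two different reviewers who happen to share a display name and appear back to back:

```python
def collapse_consecutive(items):
    """Remove items that repeat the immediately preceding item."""
    result = []
    for item in items:
        if not result or result[-1] != item:
            result.append(item)
    return result


names = ["Amazon Customer", "Craigc", "Craigc", "Jules simard", "Jules simard"]
print(collapse_consecutive(names))
# → ['Amazon Customer', 'Craigc', 'Jules simard']
```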

Great job! Let's move on.

Locate and Scrape Review Title

Similarly, to extract the review title, right-click on the review title element and select Inspect to find its CSS selector.

The review title is in an <a> tag with the class name review-title.

Amazon Reviews Title

Use the find_all method to locate and return a list of elements matching the class name. Then, iterate through them, using replace to remove unwanted text and strip to clean up whitespace. Finally, update the reviews dictionary, as shown below:

scraper.py
# ...
else:
   # find all elements with class name "review-title"
   review_titles = soup.find_all("a", class_="review-title")
   titles_list = [title.text.replace("5.0 out of 5 stars\n", "").strip() for title in review_titles]

   # create a dictionary to store the review details
   reviews = {
       # ...,
       "Review Titles": titles_list
   }

   # print the dictionary
   print(reviews)

The code will output all the review titles like this:

Output
{'Review Titles': ['Best mouse hands down!', 'Great productivity and flexible mouse that can be programmed however you want', 'The stories are true: all around, great gaming mouse', ...]}
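One caveat: the replace call above only strips the literal "5.0 out of 5 stars" prefix, so titles of reviews with other ratings would keep theirs. A more general cleanup uses a regular expression; this sketch assumes the prefix always follows the "X.Y out of 5 stars" pattern:

```python
import re

# matches a leading rating prefix such as "4.0 out of 5 stars"
RATING_PREFIX = re.compile(r"^\d\.\d out of 5 stars\s*")


def clean_title(raw_title):
    """Strip a leading 'X.Y out of 5 stars' rating from a review title."""
    return RATING_PREFIX.sub("", raw_title.strip())


print(clean_title("4.0 out of 5 stars\n\nSolid mouse for the price"))
# → Solid mouse for the price
```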

Nice work! You've successfully scraped the review titles. 

Locate and Scrape Review Text

Next, right-click and inspect the review text to see its element and CSS selector. The review text is in a <span> tag with the class name review-text. This selector targets all reviews, whether collapsed or expanded.

Amazon Reviews Body

The find_all method locates all the <span> elements with that class name. The code then loops through them, extracts the text, and updates the dictionary with the cleaned review text:

scraper.py
# ...
else:
   # find all elements with class name "review-text"
   review_texts = soup.find_all("span", class_="review-text")
   review_texts_list = [text.get_text(separator="\n").strip() for text in review_texts]

   # create a dictionary to store the review details
   reviews = {
       "Review Texts": review_texts_list
   }

   # print the dictionary
   print(reviews)

The above code will add the review text to the output as shown:

Output
{'Review Texts': ["There is a ton to like about this mouse from how it fits into your hand, the grip, and the competitive edge for gaming you will begin to notice immediately!1) Daily useNot everything is about gaming, as this only takes up maybe 2-3 hours max of my time a day ...]}
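Because get_text(separator="\n") joins nested spans with newlines, the extracted text can contain stray line breaks. If you'd rather store each review as a single line, a small normalizer helps (a minimal sketch):

```python
import re


def normalize_text(text):
    """Collapse runs of whitespace, including newlines, into single spaces."""
    return re.sub(r"\s+", " ", text).strip()


print(normalize_text("Great feel\n\nand very   smooth"))
# → Great feel and very smooth
```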

Great work so far! Let's keep going!

Locate and Scrape Review Date

Now that we've got the review text, let's get the dates for when these reviews were posted. 

Right-click the date and click Inspect. The review date is in the <span> tag with the class name review-date.

Amazon Reviews Date

BeautifulSoup's find_all method uses the review-date class name to find all matching elements. For each date element, strip removes surrounding whitespace, and the cleaned date is added to the reviews dictionary:

scraper.py
# ...
else:
   # find all elements with class name "review-date"
   review_dates = soup.find_all("span", class_="review-date")
   review_dates_list = [date.text.strip() for date in review_dates]

   # create a dictionary to store the review details
   reviews = {
       "Review Dates": review_dates_list
   }

   # print the dictionary
   print(reviews)

The code will extract the review dates as shown below:

Output
{'Review Dates': ['Reviewed in the United States on March 14, 2021', 'Reviewed in the United States on January 31, 2024', 'Reviewed in the United States on July 24, 2024', '... omitted for brevity', 'Reviewed in India on September 15, 2023']}
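These date strings also embed the reviewer's country. If you need structured values, you can split the two apart and parse the date with the standard library. This sketch assumes the "Reviewed in <country> on <date>" phrasing is stable, which may not hold for every locale:

```python
import re
from datetime import datetime

# e.g. "Reviewed in the United States on March 14, 2021"
DATE_PATTERN = re.compile(r"^Reviewed in (?:the )?(.+) on (.+)$")


def parse_review_date(raw):
    """Split 'Reviewed in <country> on <date>' into (country, datetime)."""
    match = DATE_PATTERN.match(raw)
    if not match:
        return None, None
    country, date_str = match.groups()
    return country, datetime.strptime(date_str, "%B %d, %Y")


country, date = parse_review_date("Reviewed in the United States on March 14, 2021")
print(country, date.year)
# → United States 2021
```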

You've successfully scraped the review dates. 

Locate and Scrape Rating

Now, let's extract the rating from the reviews. Right-click on the star rating displayed in the review and select Inspect. The rating is in an <i> tag with the class name review-rating.

Amazon Reviews Rating

The find_all method searches for all <i> elements with the rating class name. For each one, the text content (the rating value) is extracted and cleaned, and the reviews dictionary is updated with the review ratings:

scraper.py
# ...
else:
   # find all elements with class name "review-rating"
   review_ratings = soup.find_all("i", class_="review-rating")
   review_ratings_list = [rating.text.strip() for rating in review_ratings]

   # create a dictionary to store the review details
   reviews = {
       "Review Star Ratings": review_ratings_list
   }

   # print the dictionary
   print(reviews)

Once you run the code, here's how the output will look:

Output
{'Review Star Ratings': ['5.0 out of 5 stars', '5.0 out of 5 stars', '5.0 out of 5 stars', ...]}
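For numeric analysis, the "X.Y out of 5 stars" strings can be converted to floats by taking the leading number. A minimal sketch, assuming Amazon keeps this phrasing:

```python
def rating_to_float(raw_rating):
    """Convert a string like '5.0 out of 5 stars' to the number 5.0."""
    return float(raw_rating.split()[0])


ratings = ["5.0 out of 5 stars", "4.0 out of 5 stars"]
values = [rating_to_float(r) for r in ratings]
print(sum(values) / len(values))
# → 4.5
```

With numeric values, computing an average rating or filtering low-star reviews becomes trivial.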

Great! Now, let's move on to the last item.

Locate and Scrape Review Image URL 

If a reviewer has uploaded images, they can be extracted as well. To find the image URL, right-click on any review image and select Inspect. The image is in an <img> tag with the class name review-image-tile.

Amazon Reviews Image

Use BeautifulSoup's find_all method to locate the review image elements using the CSS selector. Extract the image URLs by accessing the src attribute and then update the reviews dictionary with the image URLs:

scraper.py
# ...
else:
   # find all img elements with class name "review-image-tile"
   review_images = soup.find_all("img", class_="review-image-tile")
   image_urls = [img["src"] for img in review_images]

   # create a dictionary to store the review details
   reviews = {
       "Review Image URLs": image_urls
   }

   # print the dictionary
   print(reviews)

Here's what the output will look like if the review contains an image:

Output
{'Review Image URLs': ['https://images-na.ssl-images-amazon.com/images/G/01/x-locale/common/grey-pixel.gif']}
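The grey-pixel.gif URL above is likely a lazy-loading placeholder rather than a real review photo; Amazon often swaps the real URL in via JavaScript after the page loads. One defensive option is to prefer a lazy-load attribute when present and skip known placeholders. This is a sketch; the data-src attribute name and the placeholder check are assumptions that may need adjusting to the live markup:

```python
def extract_image_urls(image_tags):
    """Collect review image URLs, preferring lazy-load attributes
    and skipping placeholder pixels."""
    urls = []
    for img in image_tags:
        # fall back to src when no lazy-load attribute is present
        url = img.get("data-src") or img.get("src")
        if url and "grey-pixel" not in url:
            urls.append(url)
    return urls
```

BeautifulSoup Tag objects support .get just like dictionaries, so this drops in where img["src"] was used.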

With this final step, you've successfully captured all key review details, including any attached images!

Let's combine all the snippets to see what the complete code looks like:

scraper.py
import requests
from bs4 import BeautifulSoup

# specify the target URL
target_url = "https://www.amazon.com/Logitech-G502-Performance-Gaming-Mouse/dp/B07GBZ4Q68/?th=1"

# send a get request to the target url
response = requests.get(target_url)

# check if the response status code is not 200 (OK)
if response.status_code != 200:
   # print an error message with the status code
   print(f"An error occurred with status {response.status_code}")
else:
   # get the page html content
   html_content = response.text
   # parse the html content using BeautifulSoup
   soup = BeautifulSoup(html_content, "html.parser")

   # find all elements with class name "a-profile-name"
   reviewer_names = soup.find_all("span", class_="a-profile-name")
   names_list = [name.text.strip() for name in reviewer_names]

   # find all elements with class name "review-title"
   review_titles = soup.find_all("a", class_="review-title")
   titles_list = [title.text.replace("5.0 out of 5 stars\n", "").strip() for title in review_titles]
  
   # find all elements with class name "review-text"
   review_texts = soup.find_all("span", class_="review-text")
   review_texts_list = [text.get_text(separator="\n").strip() for text in review_texts]

   # find all elements with class name "review-date"
   review_dates = soup.find_all("span", class_="review-date")
   review_dates_list = [date.text.strip() for date in review_dates]

   # find all elements with class name "review-rating"
   review_ratings = soup.find_all("i", class_="review-rating")
   review_ratings_list = [rating.text.strip() for rating in review_ratings]

   # find all img elements with class name "review-image-tile"
   review_images = soup.find_all("img", class_="review-image-tile")
   image_urls = [img["src"] for img in review_images]

   # create a dictionary to store the review details
   reviews = {
       "Reviewer Names": names_list,
       "Review Titles": titles_list,
       "Review Texts": review_texts_list,
       "Review Dates": review_dates_list,
       "Review Star Ratings": review_ratings_list,
       "Review Image URLs": image_urls
   }

   # print the dictionary
   print(reviews)

Congrats! You've successfully extracted the Amazon review details! But we're not done yet; let's take it a step further and export all this data to a CSV file. 

Step 3: Export to CSV

To export the scraped review data to a CSV file, we'll use Python's built-in csv module. The CSV format organizes the review data into a structure that's easy to analyze and share.

Let's modify our previous code to export the reviews to a CSV file.

First, import Python's csv module. Specify a CSV file name, open a new file in write mode, and write the header row to define the columns for the review data. If any list is shorter than the others, the missing values are filled with "N/A":

scraper.py
# ...
import csv

# ...

# specify the CSV file name
csv_file = "amazon_reviews.csv"

# open the file in write mode
with open(csv_file, mode="w", newline="", encoding="utf-8") as file:
   # create a CSV writer object
   writer = csv.writer(file)

   # write the header row
   writer.writerow(["Reviewer Name", "Review Title", "Review Text", "Review Date", "Star Rating", "Image URL"])

   # write the review data
   for i in range(len(names_list)):
       writer.writerow([
           names_list[i],
           titles_list[i] if i < len(titles_list) else "N/A",
           review_texts_list[i] if i < len(review_texts_list) else "N/A",
           review_dates_list[i] if i < len(review_dates_list) else "N/A",
           review_ratings_list[i] if i < len(review_ratings_list) else "N/A",
           image_urls[i] if i < len(image_urls) else "N/A"
       ])

print(f"Data successfully exported to {csv_file}")
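The index guards above handle lists of different lengths; itertools.zip_longest expresses the same idea more compactly. A nearly equivalent sketch, trimmed to two columns for brevity (one difference: zip_longest also keeps rows when names_list is the shorter list):

```python
import csv
from itertools import zip_longest

# sample data standing in for the lists produced by the scraper
names_list = ["Amazon Customer", "Subnet"]
titles_list = ["Best mouse hands down!"]

with open("amazon_reviews.csv", mode="w", newline="", encoding="utf-8") as file:
    writer = csv.writer(file)
    writer.writerow(["Reviewer Name", "Review Title"])
    # pad the shorter lists with "N/A" instead of indexing manually
    for row in zip_longest(names_list, titles_list, fillvalue="N/A"):
        writer.writerow(row)
```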

Here is the final updated scraper code:

scraper.py
import requests
from bs4 import BeautifulSoup
import csv

# specify the target URL
target_url = "https://www.amazon.com/Logitech-G502-Performance-Gaming-Mouse/dp/B07GBZ4Q68/?th=1"

# send a get request to the target url
response = requests.get(target_url)

# check if the response status code is not 200 (OK)
if response.status_code != 200:
    # print an error message with the status code
    print(f"An error occurred with status {response.status_code}")
else:
    # get the page html content
    html_content = response.text

    # parse the html content using BeautifulSoup
    soup = BeautifulSoup(html_content, "html.parser")

    # find all elements with class name "a-profile-name"
    reviewer_names = soup.find_all("span", class_="a-profile-name")
    names_list = [name.text.strip() for name in reviewer_names]

    # find all elements with class name "review-title"
    review_titles = soup.find_all("a", class_="review-title")
    titles_list = [title.text.replace("5.0 out of 5 stars\n", "").strip() for title in review_titles]
  
    # find all elements with class name "review-text"
    review_texts = soup.find_all("span", class_="review-text")
    review_texts_list = [text.get_text(separator="\n").strip() for text in review_texts]

    # find all elements with class name "review-date"
    review_dates = soup.find_all("span", class_="review-date")
    review_dates_list = [date.text.strip() for date in review_dates]

    # find all elements with class name "review-rating"
    review_ratings = soup.find_all("i", class_="review-rating")
    review_ratings_list = [rating.text.strip() for rating in review_ratings]

    # find all img elements with class name "review-image-tile"
    review_images = soup.find_all("img", class_="review-image-tile")
    image_urls = [img["src"] for img in review_images]

    # create a dictionary to store the review details
    reviews = {
        "Reviewer Names": names_list,
        "Review Titles": titles_list,
        "Review Texts": review_texts_list,
        "Review Dates": review_dates_list,
        "Review Star Ratings": review_ratings_list,
        "Review Image URLs": image_urls
    }

    # print the dictionary
    print(reviews)

    # specify the CSV file name
    csv_file = "amazon_reviews.csv"
    # open the file in write mode
    with open(csv_file, mode="w", newline="", encoding="utf-8") as file:
        # create a CSV writer object
        writer = csv.writer(file)

        # write the header row
        writer.writerow(["Reviewer Name", "Review Title", "Review Text", "Review Date", "Star Rating", "Image URL"])

        # write the review data
        for i in range(len(names_list)):
            writer.writerow([
                names_list[i],
                titles_list[i] if i < len(titles_list) else "N/A",
                review_texts_list[i] if i < len(review_texts_list) else "N/A",
                review_dates_list[i] if i < len(review_dates_list) else "N/A",
                review_ratings_list[i] if i < len(review_ratings_list) else "N/A",
                image_urls[i] if i < len(image_urls) else "N/A"
            ])

    print(f"Data successfully exported to {csv_file}")

Once you run this code, all the extracted Amazon review details will be stored in a CSV file with the following columns:

Amazon Reviews CSV

With this final step, your scraper is now complete! You've not only gathered all the necessary review data but also exported it to a CSV file, making it easy to analyze and share.

Conclusion

In this tutorial, you've seen how to scrape Amazon product reviews using two different methods—ZenRows web scraping API and BeautifulSoup in Python. Here's a quick recap of what you've learned:

  • Extracted reviews from Amazon using ZenRows web scraping API.
  • Scraped reviews using Python's Requests library and BeautifulSoup.
  • Gathered key review details such as reviewer names, review titles, review texts, ratings, dates, and images.
  • Exported the scraped data to a CSV file for easy analysis and sharing.

If you're looking for another approach, check out our guide on scraping Amazon with Scrapy.

Scraping Amazon reviews can be tricky, but with the right tools and techniques, you can gather the data you need. We recommend using ZenRows for a smoother, stress-free scraping experience.

Try ZenRows for free—no credit card required!
