Top List of User Agents for Web Scraping & Tips

January 18, 2023 · 4 min read

Using an incorrect user agent when web scraping, or ignoring the related best practices, is a recipe for getting blocked by antibots. To help you avoid that, you'll find here a list of the best user agents for scraping and some tips on using them.

Ready? Let's go!

What Is a User Agent?

A User Agent (UA) is a string sent by the user's web browser to a web server in the HTTP headers to identify the browser type, its version, and the operating system. It can also be accessed on the client side with JavaScript via the navigator.userAgent property. The remote web server uses this information to identify the client and render content in a way that's compatible with the device and browser in use.

While the exact structure and information vary, most web browsers tend to follow the same format:

Mozilla/5.0 (<system-information>) <platform> (<platform-details>) <extensions>

For example, a user agent string for Chrome (Chromium) might be Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.0.0 Safari/537.36. Note that it contains the name of the browser (Chrome), the version number (109.0.0.0) and the operating system that the browser is running on (Windows NT 10.0, 64-bit processor).
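To make the structure concrete, here's a small sketch that pulls those components out of the example string with regular expressions. The parse_ua helper is hypothetical and only illustrative, not something a library provides:

```python
import re

# The Chrome UA string from the example above.
UA = ("Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
      "AppleWebKit/537.36 (KHTML, like Gecko) "
      "Chrome/109.0.0.0 Safari/537.36")

def parse_ua(ua: str) -> dict:
    # First parenthesized group holds the system information.
    system = re.search(r"\((.*?)\)", ua)
    # "Chrome/<version>" token holds the browser version.
    chrome = re.search(r"Chrome/([\d.]+)", ua)
    return {
        "system": system.group(1) if system else None,
        "chrome_version": chrome.group(1) if chrome else None,
    }

print(parse_ua(UA))
# {'system': 'Windows NT 10.0; Win64; x64', 'chrome_version': '109.0.0.0'}
```

Antibot systems run similar checks on the strings you send, which is why a malformed or inconsistent UA stands out.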

Why Is a User Agent Important for Web Scraping?

Since UA strings help web servers identify the type of client making the request (browser or bot), using a list of user agents for scraping can help mask your scraper as a regular web browser.

Beware that a malformed user agent will get your data extraction script blocked.

What Are the Best User Agents for Scraping?

We compiled a list of the best user agents for web scraping to emulate a browser and avoid getting blocked:
  • Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.0.0 Safari/537.36
  • Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.0.0 Safari/537.36
  • Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/108.0.0.0 Safari/537.36
  • Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/108.0.0.0 Safari/537.36
  • Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/108.0.0.0 Safari/537.36
  • Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/16.1 Safari/605.1.15
  • Mozilla/5.0 (Macintosh; Intel Mac OS X 13_1) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/16.1 Safari/605.1.15

How to Check User Agents and Understand Them

The easiest way to check your current scraping user agent is to visit UserAgentString.com. It automatically displays the user agent for your web browsing environment, and you can get comprehensive information on other user agents. To do this, copy/paste any string in the input field and click on 'Analyze'.

User Agent String explained

How to Set a New User Agent Header in Python

Let's run a quick example of changing a scraper user agent using Python requests. We'll use a string associated with Chrome:

Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.0.0 Safari/537.36

Use the following code snippet to set the User-Agent header while sending the request using Python:

import requests 
 
headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.0.0 Safari/537.36"} 
 
# You can test if your web scraper is sending the correct header by sending a request to HTTPBin 
r = requests.get("https://httpbin.org/headers", headers=headers) 
print(r.text)

The output of the request will look like this:

{ 
	"headers": { 
		"Accept": "*/*", 
		"Accept-Encoding": "gzip, deflate", 
		"Host": "httpbin.org", 
		"User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.0.0 Safari/537.36", 
		"X-Amzn-Trace-Id": "Root=1-63c42540-1a63b1f8420b952f1f0219f1" 
	} 
}

And that's it! You now have a new user agent for scraping.

How to Avoid Getting Your UA Banned

Although using a user agent for web scraping can reduce your chances of getting blocked, sending too many requests from the same UA can trigger the antibot system, eventually blocking your scraper. The best way to avoid this is to use browser user agents, rotate through a list of user agents for scraping and keep them up to date.

1. Rotate User Agents

Rotating scraping user agents simply means changing the UA between web requests, letting you access more data and increasing your scraper's efficiency. This method can help protect your IP address from getting blocked and blacklisted.

How to Rotate User Agents

To rotate them, start by obtaining a list of user agents for scraping. You can get some real ones from WhatIsMyBrowser.

We'll use these three browser UA strings and put them in a Python list:

user_agent_list = [ 
	'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.0.0 Safari/537.36', 
	'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/108.0.0.0 Safari/537.36', 
	'Mozilla/5.0 (Macintosh; Intel Mac OS X 13_1) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/16.1 Safari/605.1.15', 
]

Use a for loop and random.choice() to select a random scraper user agent from the list (remember to import random):

for i in range(1,4): 
	user_agent = random.choice(user_agent_list)

Set the UA header and then send the request:

headers = {'User-Agent': user_agent} 
response = requests.get(url, headers=headers)

Here's what the full code should look like:

import requests 
import random 
 
user_agent_list = [ 
	'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.0.0 Safari/537.36', 
	'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/108.0.0.0 Safari/537.36', 
	'Mozilla/5.0 (Macintosh; Intel Mac OS X 13_1) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/16.1 Safari/605.1.15', 
] 
 
url = 'https://httpbin.org/headers' 
for i in range(1, 4): 
	user_agent = random.choice(user_agent_list) 
	headers = {'User-Agent': user_agent} 
	response = requests.get(url, headers=headers) 
	received_ua = response.json()['headers']['User-Agent'] 
 
	print(f"Request #{i}\nUser-Agent Sent: {user_agent}\nUser-Agent Received: {received_ua}\n") 

This is your output after running the request:

Request #1 
User-Agent Sent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/108.0.0.0 Safari/537.36 
User-Agent Received: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/108.0.0.0 Safari/537.36 
 
Request #2 
User-Agent Sent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.0.0 Safari/537.36 
User-Agent Received: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.0.0 Safari/537.36 
 
Request #3 
User-Agent Sent: Mozilla/5.0 (Macintosh; Intel Mac OS X 13_1) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/16.1 Safari/605.1.15 
User-Agent Received: Mozilla/5.0 (Macintosh; Intel Mac OS X 13_1) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/16.1 Safari/605.1.15

2. Keep Random Intervals between Requests

Maintain random intervals between requests to prevent your web scraper from getting detected and blocked.
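A minimal sketch of this idea: pick a random pause length between requests instead of hammering the server at a fixed rate. The polite_delay helper is hypothetical, and the delay bounds here are kept tiny just to make the demo fast; real scrapers typically wait several seconds:

```python
import random
import time

def polite_delay(min_s: float, max_s: float) -> float:
    # Pick a random pause length between requests.
    return random.uniform(min_s, max_s)

# Demo loop: imagine each iteration sends one scraping request.
for i in range(3):
    # ... send your request here ...
    delay = polite_delay(0.1, 0.3)  # use larger bounds in practice
    print(f"request #{i + 1}: sleeping {delay:.2f}s before the next one")
    time.sleep(delay)
```

Randomized intervals make your traffic pattern look less machine-like than requests fired at exact, regular intervals.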

You might be interested in reading our guide on how to bypass rate limit while web scraping.

3. Use Up-to-date User Agents

To keep your web scraping smooth and uninterrupted, ensure that your scraper user agents are regularly updated because outdated ones can get your IP blocked.
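One lightweight way to do that, assuming you mainly impersonate Chrome, is to store the UA as a template and bump only the major version number as new Chrome releases ship. This is a sketch of the approach, not a guarantee of freshness; the template and helper names are our own:

```python
# Chrome UA template: only the major version changes between releases.
CHROME_UA_TEMPLATE = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
    "AppleWebKit/537.36 (KHTML, like Gecko) "
    "Chrome/{major}.0.0.0 Safari/537.36"
)

def chrome_ua(major: int) -> str:
    # Build a UA string for the given Chrome major version.
    return CHROME_UA_TEMPLATE.format(major=major)

print(chrome_ua(109))
# Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.0.0 Safari/537.36
```

When a new Chrome version is released, you only need to update the number you pass in, rather than maintaining a hardcoded list that silently goes stale.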

Conclusion

A user agent in web scraping lets you mimic the behavior of a web browser, helping you access a website as a regular user and scrape it without getting blocked.

In this article, we shared some of the best UA strings and tips. Here's a recap on how to use these user agents in web scraping:
  • Rotate your user agent to avoid bot detection.
  • Keep random intervals between requests.
  • Keep your user agents updated.

Modern websites use different anti-scraping techniques to detect web scraping bots. Using the best scraping user agent lowers the risks of getting blocked, but it may not always work. To avoid uncertainties and headaches, many people use a web scraping API, like ZenRows.

ZenRows is capable of bypassing antibots and CAPTCHAs while scraping. It has rotating premium proxies and up to a 99.9% uptime guarantee. Oh, and you can get started for free.



Want to keep learning?

We will be sharing all the insights we have learned through the years in upcoming blog posts. If you don't want to miss a piece and keep learning, we'd be thrilled to have you in our newsletter.

No spam guaranteed. You can unsubscribe at any time.