
Set Wget User Agent: How-to & Best Practices

Rubén del Campo
Updated: January 28, 2025 · 5 min read

Getting blocked while web scraping can be frustrating, and one of the first things to fix is the default User Agent that Wget sends. So, let's learn how to change it.

What Is the Wget User Agent?

The User Agent in Wget is a crucial component of the HTTP headers sent along with every request. These HTTP request headers are metadata that give the web server additional information, such as caching behavior, session management, and web client capabilities. Web Application Firewalls (WAFs), such as Cloudflare, analyze this information to detect and block automated traffic.

Most importantly, the User Agent (UA) provides details about the web client, such as its name, version, and operating system. Here's a sample Google Chrome browser UA string:

Example
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.159 Safari/537.36

It tells the web server that the request comes from a Chrome browser with version 92.0.4515.159, running on Windows 10, among other details.
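
To make the structure of that string concrete, here's a minimal Python sketch that pulls the platform tokens and the Chrome version out of the sample UA. The function name `describe_user_agent` and the two regular expressions are illustrative assumptions, not a complete UA parser.

```python
import re

# Sample Chrome UA string from the example above.
CHROME_UA = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/92.0.4515.159 Safari/537.36"
)


def describe_user_agent(ua: str) -> dict:
    """Extract the platform tokens and Chrome version from a UA string.

    A rough illustration only; real-world UA strings have many more shapes.
    """
    # The first parenthesized group holds the platform details.
    platform = re.search(r"\(([^)]*)\)", ua)
    # The Chrome token carries the browser version.
    chrome = re.search(r"Chrome/([\d.]+)", ua)
    return {
        "platform": platform.group(1).split("; ") if platform else [],
        "chrome_version": chrome.group(1) if chrome else None,
    }


info = describe_user_agent(CHROME_UA)
print(info)
```

A server-side system can read the same fields just as easily, which is why the UA is such a common first check.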

However, your default Wget User Agent typically looks like this:

Example
Wget/1.21.4

You can check which Wget version you have, and therefore what your default UA looks like, using the following command:

Terminal
wget --version

From the above UAs, you can understand how easily websites can distinguish between Wget requests and an actual browser. That's why you need to set a custom Wget User Agent.
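
As a rough illustration of how little effort that distinction takes, the sketch below flags UAs that start with the name of a command-line client. The prefix list and the function name `looks_like_cli_client` are hypothetical; real anti-bot systems combine many more signals.

```python
# Hypothetical list of client-name prefixes a server might treat as non-browser.
CLI_PREFIXES = ("wget/", "curl/", "python-requests/")


def looks_like_cli_client(ua: str) -> bool:
    """Return True if the UA starts with a well-known command-line client name."""
    return ua.strip().lower().startswith(CLI_PREFIXES)


default_wget_ua = "Wget/1.21.4"
chrome_ua = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/92.0.4515.159 Safari/537.36"
)

print(looks_like_cli_client(default_wget_ua))
print(looks_like_cli_client(chrome_ua))
```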

How Do I Set a Custom User Agent in Wget?

Follow the steps below to change your User Agent in Wget.

Step 1: Customize UA

To override Wget's default UA, add the --user-agent option (or its short form, -U) to your request, followed by the new UA string.

To see it in action, let's use the sample Chrome UA shown above and target HTTPbin, which echoes back the User-Agent string it receives.

Example
$ wget --user-agent="Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.159 Safari/537.36" "https://httpbin.io/user-agent"

Press Enter, and Wget will send the request with the custom User Agent and save the response as a file named user-agent in your current directory.


The user-agent file now contains your custom UA in JSON format.

Output
{
  "user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.159 Safari/537.36"
}

Congrats, you've successfully changed your Wget User Agent to appear like a Chrome browser.
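
If you'd rather run the same check from a script, here's a minimal Python sketch. The helper names (`build_wget_command`, `parse_reported_user_agent`, `verify_user_agent`) and the output file name `user-agent.json` are assumptions made for illustration; the Wget options used are --quiet, --user-agent, and -O (which sets the output file name).

```python
import json
import subprocess


def build_wget_command(url: str, user_agent: str, output_path: str) -> list:
    """Build the wget argument list; -O sets the output file name."""
    return ["wget", "--quiet", f"--user-agent={user_agent}", "-O", output_path, url]


def parse_reported_user_agent(body: str) -> str:
    """Return the UA echoed back in HTTPbin's JSON response."""
    return json.loads(body)["user-agent"]


def verify_user_agent(url: str, user_agent: str, output_path: str = "user-agent.json") -> bool:
    """Run wget, then confirm the server saw the UA we sent (needs network access)."""
    subprocess.run(build_wget_command(url, user_agent, output_path), check=True)
    with open(output_path, "r", encoding="utf-8") as fh:
        return parse_reported_user_agent(fh.read()) == user_agent
```

Calling `verify_user_agent("https://httpbin.io/user-agent", your_ua)` should return True when the custom UA was applied.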

However, using a single custom UA isn't enough. Keep reading to fix that.

Step 2: Use a Random User Agent in Wget

Randomizing your Wget User-Agent is critical to avoid getting blocked, especially when making many requests. Websites often flag "too many" requests as suspicious activity and can deny you access.

To reduce that risk, you can send a random UA with each request, so that to the web server the requests appear to come from different browsers (users).

To get started, create a text file in your project folder containing a list of UAs. We've taken a few from our list of web scraping User Agents.

  • Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.0.0 Safari/537.36
  • Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.0.0 Safari/537.36
  • Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/108.0.0.0 Safari/537.36

Next, use a scripting language, like Python, to select a random UA from the file and pass it to Wget.

For that, ensure you have Python installed and create a .py file.

Then, in your favorite editor, import the subprocess and random modules. The subprocess module runs external commands from within a Python script, while random lets you pick a UA from the list at random. Next, set the path to your User Agent file and read its contents into a list variable (we've named this user_agents).

program.py
import subprocess
import random
 
# List of User Agents in a text file (one per line)
user_agent_file = "user_agents.txt"
 
# Read User Agents from the file
with open(user_agent_file, "r") as file:
    user_agents = file.read().splitlines()

Lastly, select a random UA from the list and pass it into the Wget command.

program.py
#...
 
# Choose a random User Agent
random_user_agent = random.choice(user_agents)
 
# Use wget with the random User Agent
url = "https://httpbin.io/user-agent"  # Replace with your URL
# Pass the arguments as a list so the UA needs no shell quoting
command = ["wget", f"--user-agent={random_user_agent}", url]
subprocess.run(command)

Putting it all together, here's the complete code:

program.py
import subprocess
import random
 
# List of User Agents in a text file (one per line)
user_agent_file = "user_agents.txt"
 
# Read User Agents from the file
with open(user_agent_file, "r") as file:
    user_agents = file.read().splitlines()
 
# Choose a random User Agent
random_user_agent = random.choice(user_agents)
 
# Use wget with the random User Agent
url = "https://httpbin.io/user-agent"  # Replace with your URL
# Pass the arguments as a list so the UA needs no shell quoting
command = ["wget", f"--user-agent={random_user_agent}", url]
subprocess.run(command)

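Run the Python script, and it will make a Wget request using a random UA from your list. Rerun it several times, and you'll see the UA change between requests (with a short list, the same UA can occasionally come up twice in a row). Because the output file already exists after the first run, Wget saves later responses as user-agent.1, user-agent.2, and so on.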

Here's our result for three requests:

Output
{
  "user-agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.0.0 Safari/537.36"
}
 
{
  "user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.0.0 Safari/537.36"
}
 
{
  "user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/108.0.0.0 Safari/537.36"
}

Bingo! You've rotated your first UAs with Wget.
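
Instead of rerunning the script by hand, you could also rotate UAs inside a single run. The sketch below is one possible way to do that; `plan_requests` and `fetch_all` are hypothetical helper names, and the `-q -O -` options tell Wget to stay quiet and write each response to standard output rather than to a file.

```python
import random
import subprocess


def plan_requests(urls, user_agents, rng=random):
    """Pair each URL with a randomly chosen UA (repeats are possible)."""
    return [(url, rng.choice(user_agents)) for url in urls]


def fetch_all(urls, user_agents):
    """Run one wget call per URL, each with its own randomly chosen UA."""
    for url, ua in plan_requests(urls, user_agents):
        # "-q -O -" keeps wget quiet and prints the response body to stdout.
        subprocess.run(["wget", "-q", "-O", "-", f"--user-agent={ua}", url])
```

For example, `fetch_all(["https://httpbin.io/user-agent"] * 3, user_agents)` would make three requests, reusing the `user_agents` list read from the file earlier.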

When you add more UAs to the list, make sure they're properly constructed to avoid detection. For example, if your UA claims to be a specific browser (e.g., Mozilla Firefox) but carries a version number that browser never shipped, it can raise suspicion. Keeping your User Agents up to date is also critical, since very old browser versions stand out.
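
One way to sanity-check a list before using it is sketched below. The regular expression (desktop Chrome only), the accepted platform prefixes, and the `min_major` version threshold are all illustrative assumptions; adjust them to the browsers you actually want to imitate.

```python
import re

# Illustrative pattern for desktop Chrome UAs only (an assumption, not a standard).
CHROME_UA_RE = re.compile(
    r"^Mozilla/5\.0 \(([^)]+)\) AppleWebKit/537\.36 \(KHTML, like Gecko\) "
    r"Chrome/(\d+)\.[\d.]+ Safari/537\.36$"
)
# Hypothetical set of platform prefixes accepted by this sketch.
KNOWN_PLATFORMS = ("Windows NT", "Macintosh", "X11")


def is_plausible_chrome_ua(ua: str, min_major: int = 100) -> bool:
    """Check the overall shape, the platform prefix, and a minimum Chrome version."""
    match = CHROME_UA_RE.match(ua.strip())
    if match is None:
        return False
    platform, major = match.group(1), int(match.group(2))
    return platform.startswith(KNOWN_PLATFORMS) and major >= min_major


def clean_user_agents(lines, min_major=100):
    """Drop blank lines, duplicates, and entries that fail the check, keeping order."""
    seen = set()
    kept = []
    for line in lines:
        ua = line.strip()
        if ua and ua not in seen and is_plausible_chrome_ua(ua, min_major):
            seen.add(ua)
            kept.append(ua)
    return kept
```

You could pass the lines read from user_agents.txt through `clean_user_agents` before calling `random.choice`.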

That said, building and maintaining a list of properly constructed UAs takes time and effort. The next section shows an easier approach.

How To Rotate Wget User Agents at Scale

Creating a reliable User Agent rotation system is harder than just making a list. You need to keep updating browser versions, check if they match operating systems correctly, and remove old combinations.

Plus, websites don't just look at User Agents anymore to spot bots. They check things like your request behavior, network patterns, connection details, IP reputation, and more. Even with perfect User Agents, your Wget requests might still get blocked.

A better way is to use ZenRows' Universal Scraper API. It automatically handles User Agents for you, rotates IP addresses, auto-bypasses all CAPTCHAs, and provides you with everything you need to avoid getting blocked.

Let's try ZenRows with a page that blocks Wget, like the Antibot Challenge page.

First, sign up for a ZenRows account to get to the Request Builder.


Paste the target URL, enable JS Rendering, and activate Premium Proxies.

Next, select cURL and click on the API connection mode. Then, copy the generated code and paste it into your script.

Terminal
curl "https://api.zenrows.com/v1/?apikey=<YOUR_ZENROWS_API_KEY>&url=https%3A%2F%2Fwww.scrapingcourse.com%2Fantibot-challenge&js_render=true&premium_proxy=true"

When you run this code, you'll successfully access the page:

Output
<html lang="en">
<head>
    <!-- ... -->
    <title>Antibot Challenge - ScrapingCourse.com</title>
    <!-- ... -->
</head>
<body>
    <!-- ... -->
    <h2>
        You bypassed the Antibot challenge! :D
    </h2>
    <!-- other content omitted for brevity -->
</body>
</html>

Congratulations! 🎉 You've successfully accessed a protected page without any complex Wget setup.
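
If you want to make the same call from the Python script used earlier rather than from cURL, a minimal standard-library sketch could look like this. The endpoint and the four query parameters are taken from the cURL command above; the helper names `build_zenrows_url` and `fetch_page` and the 60-second timeout are assumptions.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

ZENROWS_ENDPOINT = "https://api.zenrows.com/v1/"


def build_zenrows_url(api_key: str, target_url: str) -> str:
    """Build the request URL with the same parameters as the cURL command."""
    params = {
        "apikey": api_key,
        "url": target_url,  # urlencode percent-encodes the target URL
        "js_render": "true",
        "premium_proxy": "true",
    }
    return f"{ZENROWS_ENDPOINT}?{urlencode(params)}"


def fetch_page(api_key: str, target_url: str, timeout: float = 60.0) -> str:
    """Fetch the page through the API and return the HTML (needs network access)."""
    with urlopen(build_zenrows_url(api_key, target_url), timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")
```

Calling `fetch_page("<YOUR_ZENROWS_API_KEY>", "https://www.scrapingcourse.com/antibot-challenge")` with a real key should return HTML like the output shown above.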

Conclusion

In this guide, you've learned key things about User Agents in Wget:

  • What User Agents are and how they work with Wget.
  • How to set custom User Agents in your commands.
  • Ways to rotate between different User Agents.
  • Why User Agent management alone isn't enough.

Remember that websites use many ways to detect bots. Instead of managing everything yourself, use ZenRows to scrape any page without getting blocked. Try ZenRows for free!
