Set a Urllib3 Proxy: Tutorial 2025 

November 6, 2023 · 4 min read

Let me guess, you're facing this frustrating error: Access denied: 403 Forbidden.

Don't worry. In this guide, I'll show you how to fix this by using a proxy with urllib3 when web scraping in Python.

What Is a Urllib3 Proxy?

A Urllib3 proxy routes your HTTP requests through an intermediary server, which acts as a bridge between you and your target web page. Your requests go to the proxy server first; the proxy then forwards them to the target page and returns the response to you.

How to Set a Proxy with Urllib3

Here are the steps to set up a proxy with Urllib3, including additional tweaks that'll increase your chances of avoiding detection.

Step 1: Get Started with Urllib3

Let's begin by creating a basic Urllib3 scraper that makes a normal HTTP request to a target URL. 

Here's a basic Urllib3 script that makes a GET request to HTTPBin, an API that returns the client's IP address. It uses Urllib3's PoolManager, which serves as a centralized manager for connections to web servers and handles the complexities of connection pooling, so you don't have to.

scraper.py
import urllib3
 
# Create a PoolManager instance for sending requests.
http = urllib3.PoolManager()
 
# Send a GET request
resp = http.request("GET", "http://httpbin.io/ip")
 
# Print the returned data.
print(resp.data)

The result of the request above should be your IP address.

Output
b'{\n  "origin": "107.10.55.0"\n}\n'
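
Note that resp.data is raw bytes. If you'd rather work with the parsed JSON, you can decode it yourself. Here's a small optional tweak using Python's standard json module:

scraper.py
import json
import urllib3
 
http = urllib3.PoolManager()
resp = http.request("GET", "http://httpbin.io/ip")
 
# Decode the raw bytes and parse the JSON body
data = json.loads(resp.data.decode("utf-8"))
print(data["origin"])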

Step 2: Set a Urllib3 Proxy

For this step, you need a proxy. You can grab a free one from FreeProxyList. We recommend using HTTPS proxies because they work for both HTTPS and HTTP requests.

Urllib3 provides a ProxyManager object for tunneling requests through a proxy. To configure it, create a ProxyManager instance and pass your proxy URL as an argument. Then, make your request through it and log the response.

ProxyManager is a subclass of PoolManager, so it handles all the connection details the same way. That means you don't need both: simply use ProxyManager in place of PoolManager.

scraper.py
import urllib3
 
# Create a Proxy Manager for managing proxy servers
proxy = urllib3.ProxyManager("http://75.89.101.60:80")
 
# Make GET request through the proxy
response = proxy.request("GET", "http://httpbin.io/ip")
 
# Print the returned data
print(response.data)

Run the script, and your response should be your proxy's IP address.

Output
b'{\n  "origin": "75.89.101.60:38706"\n}\n'

Congrats! You've set up your first Urllib3 proxy.
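
One caveat before moving on: free proxies frequently hang or refuse connections. To keep your script from stalling, you can set a timeout and catch Urllib3's MaxRetryError. Here's a minimal sketch, reusing the same placeholder proxy:

scraper.py
import urllib3
 
proxy = urllib3.ProxyManager("http://75.89.101.60:80")
 
try:
    # Fail fast instead of hanging on a dead proxy
    response = proxy.request("GET", "http://httpbin.io/ip", timeout=5.0, retries=2)
    print(response.data)
except urllib3.exceptions.MaxRetryError as e:
    print(f"Proxy failed: {e.reason}")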

However, we used a free proxy in the above example, which is an unreliable option. In a real-world scenario, you'll need premium proxies for web scraping, and those often require additional configuration. Let's see how to use them with Urllib3.

Step 3: Proxy Authentication with Urllib3: Username & Password

Most premium proxies require authentication to verify the user's legitimacy. Proxy providers use this as a crucial security measure to control who can access their servers.

To authenticate a proxy with Urllib3, you must provide credentials (username and password) as part of the request headers when sending a request through a proxy server.

However, to ensure your credentials are properly interpreted, they must be encoded into the right header. For that, Urllib3 provides the urllib3.util.make_headers function, which takes your username and password as arguments and returns a dictionary containing the correct headers for HTTP requests.

So, if the proxy in step 2 were premium, you'd authenticate it by encoding your credentials into request headers and using the new headers in your HTTP request, like in the code below.

scraper.py
import urllib3
 
# Build headers for the basic_auth component
auth_creds = urllib3.util.make_headers(proxy_basic_auth="<YOUR_USERNAME>:<YOUR_PASSWORD>")
 
# Create a Proxy Manager for managing proxy servers
proxy = urllib3.ProxyManager("http://75.89.101.60:80", proxy_headers=auth_creds)
 
# Make GET request through the proxy
response = proxy.request("GET", "http://httpbin.io/ip")
 
# Print the returned data
print(response.data)
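
If you're curious, make_headers simply Base64-encodes your credentials into a proxy-authorization header. Printing the result makes that visible (illustrative credentials):

scraper.py
import urllib3
 
# Inspect the headers produced by make_headers
headers = urllib3.util.make_headers(proxy_basic_auth="user:pass")
print(headers)
# {'proxy-authorization': 'Basic dXNlcjpwYXNz'}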

Step 4: Rotate Proxies (You Need to!)

Websites flag too many requests as suspicious activity and can block your proxy. Fortunately, you can avoid that by rotating through multiple proxies. This way, you distribute traffic across multiple IP addresses, making your requests appear to come from different users.

To rotate proxies with Urllib3, create a proxy list and randomly select one for each request. 

Here's a step-by-step example.

Import random and define your proxy list. You can grab a few proxies from FreeProxyList to create your list (we'll see how to do the same with premium ones later on).

scraper.py
import urllib3 
import random
 
# Define a list of proxy URLs
proxy_list = [
    "http://8.219.97.248:80",
    "http://50.168.49.109:80",
    # Add more proxy URLs as needed
]
 
# ...

Randomly select a proxy from the list using the random.choice() method. Then, create a ProxyManager instance using the random proxy, make your request, and log the response like in step 2.

scraper.py
# Randomly select a proxy from the list
proxy_url = random.choice(proxy_list)
 
# Create a Proxy Manager instance using the random proxy
proxy = urllib3.ProxyManager(proxy_url)
 
# Make GET request through the proxy
response = proxy.request("GET", "http://httpbin.io/ip")
 
# Print the returned data
print(response.data)

Putting everything together, your complete code should look like this:

scraper.py
import urllib3 
import random
 
# Define a list of proxy URLs
proxy_list = [
    "http://8.219.97.248:80",
    "http://50.168.49.109:80",
    # Add more proxy URLs as needed
]
 
# Randomly select a proxy from the list
proxy_url = random.choice(proxy_list)
 
# Create a Proxy Manager instance using the random proxy
proxy = urllib3.ProxyManager(proxy_url)
 
# Make GET request through the proxy
response = proxy.request("GET", "http://httpbin.io/ip")
 
# Print the returned data
print(response.data)

To verify it works, run the script several times. Each run randomly selects a proxy, so the returned IP address should vary between runs.
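
You can also verify rotation in a single run by looping a few requests and picking a fresh proxy for each one. This is a small variation on the script above:

scraper.py
import urllib3
import random
 
# Define a list of proxy URLs (placeholders from FreeProxyList)
proxy_list = [
    "http://8.219.97.248:80",
    "http://50.168.49.109:80",
]
 
# Make three requests, each through a randomly chosen proxy
for _ in range(3):
    proxy = urllib3.ProxyManager(random.choice(proxy_list))
    response = proxy.request("GET", "http://httpbin.io/ip")
    print(response.data)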

Here are our results for two requests:

Output
b'{\n  "origin": "8.219.97.248"\n}\n'
b'{\n  "origin": "50.168.49.109"\n}\n'

Bingo! 

Note that we only used free proxies with Urllib3 in this example to explain the concept. As mentioned before, you'll need premium proxies when making requests to real websites because free ones are prone to failure and easily detected.
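
If you must rely on free proxies for a while longer, you can soften their flakiness by combining rotation with retries, drawing a new proxy whenever one fails. Here's a minimal sketch with placeholder proxy URLs:

scraper.py
import urllib3
import random
 
proxy_list = [
    "http://8.219.97.248:80",
    "http://50.168.49.109:80",
]
 
def fetch_via_rotating_proxy(url, max_attempts=3):
    # Try up to max_attempts randomly chosen proxies before giving up
    for _ in range(max_attempts):
        proxy_url = random.choice(proxy_list)
        proxy = urllib3.ProxyManager(proxy_url)
        try:
            return proxy.request("GET", url, timeout=5.0, retries=1)
        except urllib3.exceptions.MaxRetryError:
            print(f"{proxy_url} failed, trying another...")
    raise RuntimeError("All proxy attempts failed")
 
response = fetch_via_rotating_proxy("http://httpbin.io/ip")
print(response.data)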

Premium Proxy to Avoid Getting Blocked

Free proxies present serious challenges for web scraping. Their unreliable connections, potential security issues, and low-quality IP addresses make them unsuitable for production use. Websites frequently identify and block these free proxies, making them ineffective for sustained operations.

Premium proxies provide a more dependable solution for avoiding blocks. By leveraging residential IPs, premium proxies can naturally simulate real user traffic. With features like automated IP rotation and geographic targeting options, they significantly enhance the success rate of your requests.

ZenRows' Residential Proxies is a leading premium proxy service that gives you access to 55M+ residential IPs across 185+ countries. The service includes essential features like dynamic IP rotation, smart proxy selection, and configurable geo-targeting, all backed by 99.9% network uptime. This makes it an ideal choice for reliable HTTP requests using urllib3.

Let's see how to implement ZenRows' Residential Proxies with Urllib3.

First, sign up and you'll get to the Proxy Generator dashboard. Your proxy credentials will be generated automatically.

[Image: generating residential proxies in the ZenRows dashboard]

Take your proxy credentials (username and password) and use them in the following code:

scraper.py
import urllib3

# build headers for the basic_auth component
auth_creds = urllib3.util.make_headers(
    proxy_basic_auth="<ZENROWS_PROXY_USERNAME>:<ZENROWS_PROXY_PASSWORD>"
)

# create a Proxy Manager for managing proxy servers
proxy = urllib3.ProxyManager(
    "http://superproxy.zenrows.com:1337", proxy_headers=auth_creds
)

# make GET request through the proxy
response = proxy.request("GET", "https://httpbin.io/ip")

# print the returned data
print(response.data)

Running this code multiple times will show output similar to this:

Output
# request 1
{
  "origin": "185.123.101.84:51432"
}
# request 2
{
    "origin": "79.110.52.96:36721"
}

Congratulations! The different IP addresses in the output confirm that your urllib3 requests are successfully routed through ZenRows' residential proxy network. Your HTTP client is now using premium proxies that substantially reduce the likelihood of being blocked during web scraping.

Best Practice: Environment Variables with Urllib3

Environment variables are system-level variables that store configuration information and are accessible to the operating system and the applications running on it. They offer a secure and efficient way to manage sensitive data because they're set outside the application code, keeping confidential information separate from the codebase.

To set an environment variable on Windows, open Command Prompt and run a command with this structure:

Terminal
setx VARIABLE_NAME VARIABLE_VALUE

On Linux and macOS, use the export command instead (note that export only applies to the current shell session):

Terminal
export VARIABLE_NAME=VARIABLE_VALUE

For example, to set your API key, you can use ZENROWS_API_KEY as the variable name and replace <YOUR_ZENROWS_API_KEY> with your actual ZenRows API key.

Terminal
setx ZENROWS_API_KEY <YOUR_ZENROWS_API_KEY>

You can now read the API key from the environment using the os module.

scraper.py
import urllib3
import os
 
url = "https://www.g2.com/"
apikey = os.environ.get("ZENROWS_API_KEY")
params = {
    "url": url,
    "apikey": apikey,
    "js_render": "true",
    "premium_proxy": "true",
}
 
# Create a urllib3 PoolManager
http = urllib3.PoolManager()
 
# Encode the parameters and make a GET request through the ZenRows API
request_url = "https://api.zenrows.com/v1/"
response = http.request_encode_url("GET", request_url, fields=params)

# Print the response content
print(response.data)
 

Conclusion

A Urllib3 proxy enables you to route your requests through different IP addresses, reducing your chances of getting blocked while web scraping. We saw that free proxies are unreliable and that you should use premium residential proxies for better results.

In any case, consider ZenRows, the all-in-one solution for reliably bypassing all anti-bot measures at any scale. Sign up now to try it for free.
