How to Use a Proxy With Python Requests in 2024

August 9, 2024 · 7 min read

Are you getting blocked while web scraping? There's one easy solution you can try: use proxies with Python Requests to hide your IP and bypass restrictions. A proxy server can increase your chances of accessing the data you need.

In this tutorial, you'll learn how to do it step by step. Let's roll!

How to Use Proxies With Python Requests

In this section, you'll learn how to perform basic Python requests with a proxy, where to get proxies from, how to authenticate, and some other day-to-day mechanisms.

1. Perform Requests

Before we begin, make sure you have Python and the Requests library installed. And since following this tutorial will be easier if you know the fundamentals of web scraping with Python, feel free to check our guide.

To use proxies with Python Requests, start by importing the HTTP client library:

program.py
import requests

Then, get some valid proxies from Free Proxy List and define a dictionary with the proxy URLs associated with the HTTP and HTTPS protocols:

program.py
proxies = {
   'http': 'http://103.167.135.111:80',
   'https': 'http://116.98.229.237:10003'
}

requests will perform HTTP requests over the http proxy and handle HTTPS traffic over the https one.

As you can see above, we used this syntax:

Example
<PROXY_PROTOCOL>://<PROXY_IP_ADDRESS>:<PROXY_PORT>

Now, perform an HTTP request with Python requests through the proxy server:

program.py
# target website
url = 'https://httpbin.io/ip'

# making an HTTP GET request through a proxy
response = requests.get(url, proxies=proxies)

Here's what your basic Python Requests proxy script will look like:

program.py
import requests

proxies = {
   'http': 'http://103.167.135.111:80',
   'https': 'http://116.98.229.237:10003'
}

url = 'https://httpbin.io/ip'
response = requests.get(url, proxies=proxies)
print(response)

Verify it works. You'll get the following response:

Output
<Response [200]>

That means the target server responded with an HTTP 200 status code. In other words, the HTTP request was successful! 🥳
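
If you'd rather check the outcome explicitly instead of printing the response object, you can inspect the status code. Here's a minimal sketch that reuses the response from the snippet above:

Example
# response comes from the previous snippet
if response.ok:
    print('Request succeeded:', response.status_code)
else:
    # raises requests.exceptions.HTTPError for 4xx/5xx responses
    response.raise_for_status()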

Please note that requests natively supports only HTTP and HTTPS proxies. If you need to route FTP or other kinds of traffic, you'll need a SOCKS proxy. The library doesn't support SOCKS out of the box, but you can add it by installing the socks extra:

Terminal
pip3 install requests[socks]

Then, you can specify SOCKS proxies like this:

program.py
import requests

proxies = {
    'http': 'socks5://<PROXY_IP_ADDRESS>:<PROXY_PORT>',
    'https': 'socks5://<PROXY_IP_ADDRESS>:<PROXY_PORT>'
}

url = 'https://httpbin.io/ip'
response = requests.get(url, proxies=proxies)
print(response)

2. Check That the Proxy Works

HTTPBin, our target page, returns the caller's IP in JSON format, so you can parse the response body with the json() method.

Example
print(response.json())

If the response isn't JSON, read the text attribute instead:

Example
print(response.text)

Your Python Requests proxy script should look as follows:

program.py
import requests

proxies = {
   'http': 'http://103.167.135.111:80',
   'https': 'http://116.98.229.237:10003'
}

url = 'https://httpbin.io/ip'
response = requests.get(url, proxies=proxies)
print(response.json())

Run it, and you'll get an output similar to this:

Output
{'origin': '116.98.229.237'}

The origin field contains the IP of the proxy, not yours. That confirms requests made the HTTP request over a proxy.
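
To make the check more convincing, you can compare the proxied result against a request made without a proxy. The sketch below assumes the same free example proxies used earlier, so swap in working proxies before running it:

Example
import requests

proxies = {
   'http': 'http://103.167.135.111:80',
   'https': 'http://116.98.229.237:10003'
}

url = 'https://httpbin.io/ip'

# IP the server sees without a proxy
direct_ip = requests.get(url).json()['origin']
# IP the server sees through the proxy
proxied_ip = requests.get(url, proxies=proxies).json()['origin']

print('Direct IP:  ', direct_ip)
print('Proxied IP: ', proxied_ip)
print('Proxy in use:', direct_ip != proxied_ip)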

3. Requests Methods

The Python Requests library provides methods that correspond to different HTTP methods. Let's explore the most common ones and their uses.

GET

The GET method is used to retrieve data from a server. It sends a request to a specified URL and returns the server's response. Here's how to use it:

Example
response = requests.get('https://httpbin.io/ip')

POST

The POST method is used to send data to a server to create or update a resource. Here's an example:

Example
response = requests.post('https://httpbin.io/anything', data={"key1": "a", "key2": "b"})

Other Methods

While GET and POST are the most commonly used, Requests supports several other HTTP methods. Here's a table of these additional methods:

Method | Syntax | Used to
PUT | requests.put(url, data=update_data) | Update an existing resource on the server
PATCH | requests.patch(url, data=partial_update_data) | Partially update a resource on the server
DELETE | requests.delete(url) | Delete a resource on the server
HEAD | requests.head(url) | Retrieve the headers of a resource
OPTIONS | requests.options(url) | Retrieve the supported HTTP methods for a URL
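
For completeness, here's a short sketch showing that these methods accept the same proxies argument as GET and POST. It targets https://httpbin.io/anything, which echoes back the request it receives, and assumes the example proxies dictionary defined earlier:

Example
# ...

url = 'https://httpbin.io/anything'

# every method accepts the same proxies dictionary
response = requests.put(url, data={'name': 'new value'}, proxies=proxies)
response = requests.patch(url, data={'name': 'patched value'}, proxies=proxies)
response = requests.delete(url, proxies=proxies)
response = requests.head(url, proxies=proxies)
response = requests.options(url, proxies=proxies)
print(response.status_code)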

4. Proxy Authentication With Python Requests: Username & Password

Some proxy servers are protected by authentication for security reasons so that only users with credentials can access them. That usually happens with premium proxies or commercial solutions.

Follow this syntax to specify a username and password in the URL of an authenticated proxy:

Example
<PROXY_PROTOCOL>://<YOUR_USERNAME>:<YOUR_PASSWORD>@<PROXY_IP_ADDRESS>:<PROXY_PORT>

See an example:

program.py
# ...

proxies = {
  'http': 'http://fgrlkbxt:<YOUR_PASSWORD>@<PROXY_IP_ADDRESS>:7492',
  'https': 'https://fgrlkbxt:<YOUR_PASSWORD>@<PROXY_IP_ADDRESS>:6286'
}

# ...

Error 407: Proxy Authentication Required

The HTTP status error code 407: Proxy Authentication Required occurs when making a request through a proxy server that requires authentication. This error indicates that the user didn't provide valid credentials.

To fix it, ensure the proxy URL contains the correct username and password. Find out more about the different types of authentication supported.
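
If your password contains special characters such as @ or :, the raw URL can break. Here's a minimal sketch that URL-encodes the credentials with urllib.parse.quote; the username, password, address, and port below are placeholders, not real values:

Example
import requests
from urllib.parse import quote

# placeholder credentials: replace with your own
username = quote('<YOUR_USERNAME>', safe='')
password = quote('<YOUR_PASSWORD>', safe='')

proxy_url = f'http://{username}:{password}@<PROXY_IP_ADDRESS>:<PROXY_PORT>'
proxies = {'http': proxy_url, 'https': proxy_url}

response = requests.get('https://httpbin.io/ip', proxies=proxies)
# a 407 status here means the credentials were rejected
print(response.status_code)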

5. Proxy Session Using Python Requests

When making many requests through a proxy server, you may need a session. A Session object can reuse the same TCP connection for several requests, which saves time and improves performance compared to making single requests.

A session retains information across requests, including cookies, authentication, and connection pooling. This persistence can be particularly useful when you need to maintain a logged-in state, handle complex authentication flows, or optimize performance for multiple requests to the same host.

For instance, when scraping a website that requires login, a session can help you stay authenticated throughout your scraping process without repeatedly sending login credentials.

Use a proxy session in Python Requests as shown here:

program.py
import requests

# initialize a session
session = requests.Session()

# set the proxies in the session object
session.proxies = {
   'http': 'http://103.167.135.111:80',
   'https': 'http://116.98.229.237:10003'
}

url = 'https://httpbin.io/ip'

# perform an HTTP GET request over the session
response = session.get(url)
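
To see the cookie persistence in action, you can hit HTTPBin's cookie endpoints over the same session. This quick sketch assumes httpbin.io exposes the usual /cookies/set and /cookies routes:

Example
# ...

# the cookie set by the first response is sent automatically on the next request
session.get('https://httpbin.io/cookies/set?session_id=123')
response = session.get('https://httpbin.io/cookies')
print(response.json())  # should include 'session_id': '123'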

6. Environment Variable for a Python Requests Proxy

You can DRY up some code if your Python script uses the same proxies for each request. By default, Requests relies on the HTTP proxy configuration defined by these environment variables:

  • HTTP_PROXY: It corresponds to the http key of the proxies dictionary.
  • HTTPS_PROXY: It corresponds to the https key of the proxies dictionary.

Open the terminal and set the two environment variables this way:

Terminal
export HTTP_PROXY="http://103.167.135.111:80"
export HTTPS_PROXY="http://116.98.229.237:10003"

Then, remove the proxy logic from your script, and you'll get to this:

program.py
import requests

url = 'https://httpbin.io/ip'
response = requests.get(url)
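
If you prefer to keep everything in Python, you can set the same variables with os.environ before making any request. This is a small sketch using the example proxies from earlier; Requests picks up proxy settings from the environment as long as trust_env is enabled (the default):

Example
import os
import requests

# equivalent to the export commands above
os.environ['HTTP_PROXY'] = 'http://103.167.135.111:80'
os.environ['HTTPS_PROXY'] = 'http://116.98.229.237:10003'

url = 'https://httpbin.io/ip'
response = requests.get(url)  # proxies come from the environment
print(response.json())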

Great! You now know the basics of proxies in Python with Requests! Let's see them in action in some more advanced scenarios.

Use a Rotating Proxy With Python Requests

When your script sends numerous requests in rapid succession, websites may flag this behavior as suspicious and block your IP address. A rotating proxy strategy can help prevent this. The concept is simple: you switch to a new proxy after a set time interval or number of requests, making each request appear to come from a different user.

Let's explore how to implement an effective proxy rotator in Python using the Requests library.

How to Rotate IPs With Requests? Free Solution

Websites can detect and block requests from specific IP addresses, potentially halting your scraping operation. By rotating IPs, your scraper becomes much more resilient, as each request appears to come from a different user, making detection and blocking more difficult.

Let's implement a simple proxy rotator using free proxies. We'll start by importing the necessary libraries: requests for making HTTP requests and random for selecting proxies randomly.

Example
import requests
import random

Next, define a list of free proxies using the Free Proxy List website. In a real-world scenario, you'd want to replace these with actual working proxies.

Example
# ...

PROXY_LIST = [
    "http://203.24.108.161:80",
    "http://80.48.119.28:8080",
    "http://203.30.189.47:80",
]

Create a function called get_random_proxy(). This function uses Python's random.choice() to randomly select and return a proxy from the list.

Example
# ...

def get_random_proxy():
    return random.choice(PROXY_LIST)

proxy = get_random_proxy()

Now comes the crucial part: making a request using the selected proxy. Use requests.get() to send a GET request to the target URL, passing the selected proxy in the proxies parameter. This tells the Requests library to route the request through the chosen proxy.

Example
# ...

url = "https://httpbin.io/ip"
response = requests.get(url, proxies={"http": proxy, "https": proxy})
print(response.text)

Merge the above snippets. Here's what your final proxy rotator code should look like:

program.py
import requests
import random

# list of free proxies (replace with actual working proxies)
PROXY_LIST = [
    "http://203.24.108.161:80",
    "http://80.48.119.28:8080",
    "http://203.30.189.47:80",
]

# randomly select a proxy from the list
def get_random_proxy():
    return random.choice(PROXY_LIST)

# get a random proxy
proxy = get_random_proxy()

# target url to check our IP
url = "https://httpbin.io/ip"

# make a request using the selected proxy
response = requests.get(url, proxies={"http": proxy, "https": proxy})

# print the response, which should show the proxy's IP
print(response.text)

Here's the result of running the above code three times:

Output
# request 1
{
  "origin": "203.30.189.47"
}

# request 2
{
  "origin": "80.48.119.28"
}

# request 3
{
  "origin": "203.24.108.161"
}

By running this script multiple times, you'll observe different IP addresses being used, demonstrating the IP rotation in action.

However, it's important to note that free proxies often have limitations. They can be unreliable, slow, or may not work with more sophisticated websites. For serious scraping tasks, consider using premium proxy services that offer better reliability, speed, and features like automatic rotation and geolocation targeting.
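
Because free proxies fail often, it's worth wrapping the rotation in basic error handling. Here's a hedged sketch that tries each proxy from the list in random order and moves on when one times out or refuses the connection:

Example
import random
import requests

PROXY_LIST = [
    "http://203.24.108.161:80",
    "http://80.48.119.28:8080",
    "http://203.30.189.47:80",
]

url = "https://httpbin.io/ip"

# try the proxies in random order until one responds
for proxy in random.sample(PROXY_LIST, k=len(PROXY_LIST)):
    try:
        response = requests.get(
            url,
            proxies={"http": proxy, "https": proxy},
            timeout=5,
        )
        print(response.json())
        break
    except requests.exceptions.RequestException as error:
        print(f"Proxy {proxy} failed: {error}")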

Ignore SSL Certificate

By default, Requests verifies SSL certificates on HTTPS requests. Certificate verification can lead to SSLError exceptions when dealing with proxies.

To avoid those errors, disable SSL verification with verify=False:

program.py
# ...
response = requests.request(
    http_method,
    url,
    proxies=proxies,
    timeout=5,
    # disable SSL certificate verification
    verify=False
)
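
With verification disabled, Requests logs an InsecureRequestWarning on every call. If the noise bothers you, you can silence it through urllib3, the HTTP library Requests is built on; this is optional and only hides the warning, it doesn't make the connection any safer:

Example
import urllib3

# suppress the InsecureRequestWarning emitted when verify=False
urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)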

Premium Proxy to Avoid Getting Blocked

While free proxies can be useful for basic tasks, they often fail with serious web scraping projects. Using premium proxies that offer auto-rotation, residential IPs, and geolocation features significantly increases your scraping success rate.

One example of a premium proxy service that offers these features is ZenRows. ZenRows provides a proxy rotator solution that automatically handles IP rotation, offers residential IPs, and includes advanced anti-bot bypassing techniques.

To get started, sign up for ZenRows.

Go to the ZenRows Proxy Generator page. Your premium residential proxy will be generated automatically. Customize it according to your requirements.

[Image: generating residential proxies with the ZenRows Proxy Generator]

Once you're done, copy the generated proxy and modify your code as follows:

Example
import requests

proxy = 'http://<ZENROWS_PROXY_USERNAME>:<ZENROWS_PROXY_PASSWORD>@superproxy.zenrows.com:1337'
proxies = { 
    'http': proxy, 
    'https': proxy
}

url = 'https://httpbin.io/ip'
response = requests.get(url, proxies=proxies, verify=False)
print(response.text)

Note that verify=False is required when using ZenRows premium proxies.

Run the code. Here's an example of what the output would look like:

Output
{
  "origin": "185.220.101.34"
}

This output shows that your request was successfully routed through one of ZenRows' premium proxies.

Congrats! Your premium proxy with Python Requests script is ready!

Conclusion

This step-by-step tutorial covered the most important lessons about proxies with Requests in Python. You started from the basic setup and have become a proxy master!

Now you know:

  • What a web proxy is and why free proxies aren't reliable.
  • The basics of using a proxy with Requests in Python.
  • How to implement a rotating proxy.
  • How to use a premium proxy.

Proxies help you bypass anti-bot systems. Yet, some proxies are more reliable than others, and ZenRows offers the best on the market. With it, you'll get access to a reliable rotating proxy system.

Frequent Questions

What Proxy Types Are There? Which Are the Best?

There are several types of proxies available, each with different levels of effectiveness for web scraping and protection against anti-bot solutions. Not all proxies are created equal, and choosing the right type can significantly impact your scraping success. Here's a breakdown of the main types:

  • Residential proxies: These use IP addresses assigned by Internet Service Providers to homeowners, making them appear as genuine user traffic and less likely to be blocked.
  • Datacenter proxies: These are IP addresses from secondary corporations and datacenters, offering high speeds but are more easily detected as non-residential traffic.
  • Mobile proxies: These use IP addresses from mobile devices and cellular networks, providing a high level of legitimacy and are excellent for accessing mobile-specific content.
  • Public proxies: These are free, publicly available proxies that anyone can use, but they're often slow, unreliable, and pose security risks.
  • Premium proxies: These are high-quality, paid proxy services that offer a mix of proxy types, often with additional features like automatic rotation and anti-detection mechanisms.

For the best results in web scraping, it's crucial to choose proxies from trusted providers. Learn more about selecting the right proxy for your needs in our guide on web scraping proxies.

What Are the Benefits of Using a Proxy for Web Scraping?

Using proxies for web scraping offers several significant advantages:

  • Avoiding anti-bot systems: Proxies help you rotate IP addresses, making it harder for websites to detect and block your scraping activities. This is crucial for maintaining consistent access to your target sites.
  • Geolocation targeting: Proxies allow you to appear to be accessing a website from different geographic locations. This is essential for gathering location-specific data or accessing geo-restricted content.
  • Anonymity: Proxies mask your real IP address, providing a layer of privacy and making it difficult for websites to trace requests back to your actual location or identity.
  • Better performance: By distributing requests across multiple IP addresses, proxies can help you scrape at a higher volume without overloading any single IP. This can lead to faster data collection and reduced risk of rate limiting.
