Using Proxies With Cloudscraper: 2024 Guide

Idowu Omisola
September 20, 2024 · 3 min read

As web security evolves, scraping with open-source tools like Cloudscraper becomes nearly impossible without extra fortifications. You'll be denied access to the data you want, and your IP address may get banned.

To avoid this, you must boost your Cloudscraper-based scraper with extra features, such as proxies.

This guide will walk you through configuring a proxy in Cloudscraper, then show you how to rotate proxies and use premium ones.

Let's go!

1. Set Up a Proxy With Cloudscraper

Setting up a Cloudscraper proxy is straightforward. You only need to pass your proxy details via the proxies parameter of the request method.

First, import the required library and call the create_scraper() function. This function returns a CloudScraper instance, which works like the requests.Session object of the popular Requests library.

Example
# pip3 install cloudscraper
import cloudscraper

# create CloudScraper instance
scraper = cloudscraper.create_scraper()

Next, define your proxy settings and use the proxies attribute to pass your proxy details as parameters in your request.

Example
#...

# define your proxy
proxy = {
   'http': 'http://43.133.59.220:3128',
   'https': 'http://43.133.59.220:3128'
}

# make a request using the proxy
response = scraper.get('https://httpbin.io/ip', proxies=proxy)

For illustrative purposes, this example makes a request to HTTPBin using a proxy from the Free Proxy List.

To verify your configuration works, combine the code snippets above and add a print statement to get the final code.

Example
# pip3 install cloudscraper
import cloudscraper

# create CloudScraper instance
scraper = cloudscraper.create_scraper()

# define your proxy
proxy = {
   'http': 'http://43.133.59.220:3128',
   'https': 'http://43.133.59.220:3128'
}

# make a request using the proxy
response = scraper.get('https://httpbin.io/ip', proxies=proxy)
print(response.text)
If the free proxy triggers SSL certificate errors, you can disable certificate verification by passing verify=False to the request. Only do this for testing, since it turns off a security check.

Example
# make a request using the proxy, skipping SSL certificate verification
response = scraper.get('https://httpbin.io/ip', proxies=proxy, verify=False)

If done correctly, your result will be the proxy's IP address.

Output
{
  "origin": "43.133.59.220:13152"
}
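
If you want to double-check that the request really went through the proxy, here's a quick sanity check (assuming the free proxy above is still alive): compare the origin IP that HTTPBin reports with and without the proxy.

Example
import cloudscraper

# create CloudScraper instance
scraper = cloudscraper.create_scraper()

# define your proxy
proxy = {
   'http': 'http://43.133.59.220:3128',
   'https': 'http://43.133.59.220:3128'
}

# origin IP without a proxy (your real IP)
direct_ip = scraper.get('https://httpbin.io/ip').json()['origin']

# origin IP through the proxy
proxied_ip = scraper.get('https://httpbin.io/ip', proxies=proxy).json()['origin']

print('direct: ', direct_ip)
print('proxied:', proxied_ip)
print('proxy in use:', direct_ip != proxied_ip)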

Congratulations! You've successfully added a proxy to your Cloudscraper scraper.

2. Authenticate Proxies

Some proxy providers require authentication details, such as username and password, to regulate access to their proxy servers.

To authenticate a Cloudscraper proxy, include the required credentials in the proxy URL. Here's the format for username and password authentication.

Example
<PROXY_PROTOCOL>://<YOUR_USERNAME>:<YOUR_PASSWORD>@<PROXY_IP_ADDRESS>:<PROXY_PORT>

Here's a code example:

Example
import cloudscraper

# create CloudScraper instance
scraper = cloudscraper.create_scraper()  

# define your proxy
proxy = {
   'http': 'http://<PROXY_USERNAME>:<PROXY_PASSWORD>@<PROXY_IP_ADDRESS>:<PROXY_PORT>',
   'https': 'http://<PROXY_USERNAME>:<PROXY_PASSWORD>@<PROXY_IP_ADDRESS>:<PROXY_PORT>'
}

# make a request using the proxy
response = scraper.get('https://httpbin.io/ip', proxies=proxy)
print(response.text)
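
To avoid hardcoding credentials in your script, a common pattern is to read them from environment variables. The variable names below (PROXY_USERNAME, PROXY_PASSWORD, PROXY_HOST, and PROXY_PORT) are just examples, not values defined by any particular provider:

Example
import os
import cloudscraper

# create CloudScraper instance
scraper = cloudscraper.create_scraper()

# build the proxy URL from environment variables
# (URL-encode the username and password if they contain special characters)
proxy_url = (
    f"http://{os.environ['PROXY_USERNAME']}:{os.environ['PROXY_PASSWORD']}"
    f"@{os.environ['PROXY_HOST']}:{os.environ['PROXY_PORT']}"
)

proxy = {'http': proxy_url, 'https': proxy_url}

# make a request using the authenticated proxy
response = scraper.get('https://httpbin.io/ip', proxies=proxy)
print(response.text)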

3. Rotate Proxies

While configuring a Cloudscraper proxy helps you hide your IP address, websites can still block or ban your proxy. This can occur due to excessive requests originating from the same IP address. Anti-bot systems often flag such activities as suspicious, leading to 403 Forbidden errors with Cloudscraper or IP bans.

That's why it's essential to rotate proxies. Rotation distributes your traffic across multiple IP addresses, making your requests appear to originate from different users.

Follow the steps below to rotate proxies in Cloudscraper.

First, define your proxy list.

Example
#...

# define a proxy list
proxies_list = [
    {'http': 'http://43.133.59.220:3128', 'https': 'http://43.133.59.220:3128'},
    {'http': 'http://73.117.183.115:80', 'https': 'http://73.117.183.115:80'},
    {'http': 'http://50.174.7.154:80', 'https': 'http://50.174.7.154:80'}      
]

We've grabbed a few IPs from the Free Proxy List. If they don't work at the time of reading, grab new ones from the same page.

After that, select a proxy at random with random.choice(). You'll need to import Python's built-in random module for that.

Example
# ...
import random

# ...  

# select a proxy from the list at random
random_proxy = random.choice(proxies_list)

# ...

Lastly, route your request through the selected proxy.

Example
# ...

# make your request using the random proxy
response = scraper.get('https://httpbin.io/ip', proxies=random_proxy)
print(response.text)

Put everything together to test your script.

Example
import cloudscraper
import random

# create CloudScraper instance
scraper = cloudscraper.create_scraper()  

# define a proxy list
proxies_list = [
    {'http': 'http://43.133.59.220:3128', 'https': 'http://43.133.59.220:3128'},
    {'http': 'http://73.117.183.115:80', 'https': 'http://73.117.183.115:80'},
    {'http': 'http://50.174.7.154:80', 'https': 'http://50.174.7.154:80'}      
]

# select a proxy from the list at random
random_proxy = random.choice(proxies_list)

# make your request using the random proxy
response = scraper.get('https://httpbin.io/ip', proxies=random_proxy)
print(response.text)

If everything works, you'll get a different IP address each time you run the script. Here are the results from three runs:

Output
# request 1
{
  "origin": "50.174.7.154::38289"
}

# request 2
{
  "origin": "73.117.183.115:29097"
}

# request 3
{
  "origin": "43.133.59.220:34932"
}

Well done!
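
In a real scraper, you'll typically make many requests in a single run. Here's a minimal sketch of how you could pick a fresh proxy for every request and skip dead ones; the loop, timeout, and error handling are illustrative additions, not part of the original example:

Example
import random
import cloudscraper

# create CloudScraper instance
scraper = cloudscraper.create_scraper()

# same free proxies as above (replace them if they've expired)
proxies_list = [
    {'http': 'http://43.133.59.220:3128', 'https': 'http://43.133.59.220:3128'},
    {'http': 'http://73.117.183.115:80', 'https': 'http://73.117.183.115:80'},
    {'http': 'http://50.174.7.154:80', 'https': 'http://50.174.7.154:80'}
]

# make several requests, rotating the proxy on each one
for _ in range(5):
    random_proxy = random.choice(proxies_list)
    try:
        response = scraper.get(
            'https://httpbin.io/ip',
            proxies=random_proxy,
            timeout=10  # avoid hanging on dead free proxies
        )
        print(response.text)
    except Exception as error:
        # free proxies fail often; log the error and move on
        print(f"proxy {random_proxy['http']} failed: {error}")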

4. Use Premium Proxies

Although we used free proxies to demonstrate the basic configuration, they're generally unreliable and unsuitable for real-world use cases. They're usually slow, have a short lifespan, and are easy for websites to detect and block.

You need premium proxies for consistent performance and to increase your chances of avoiding detection. The best ones, such as ZenRows residential proxies, will significantly increase your success rate and grant you access to heavily protected websites. ZenRows automatically rotates residential proxies under the hood, making it easy for you to disguise your web activity and fly under the radar.

To help you get started, here's a quick guide on how to use ZenRows residential proxies.

Sign up to access your dashboard. Select Residential Proxies in the left menu section and create a new proxy user. You'll be directed to the Proxy Generator page.

Proxy Generator

Copy your proxy URL for use in your Cloudscraper script. ZenRows allows you to choose between the auto-rotate option and sticky sessions.

Here's the final code using ZenRows residential proxies:

Example
import cloudscraper

# create CloudScraper instance
scraper = cloudscraper.create_scraper()  

# define your proxy
proxy = {
   'http': 'http://<ZENROWS_PROXY_USERNAME>:<ZENROWS_PROXY_PASSWORD>@superproxy.zenrows.com:1337',
   'https': 'http://<ZENROWS_PROXY_USERNAME>:<ZENROWS_PROXY_PASSWORD>@superproxy.zenrows.com:1338'
}

# make a request using the proxy
response = scraper.get('https://httpbin.io/ip', proxies=proxy)
print(response.text)

Since ZenRows automatically rotates proxies under the hood, you'll get a different IP address for each request.

Here's the result for three runs:

Output
# request 1
{
  "origin": "72.27.89.127:49528"
}

# request 2
{
  "origin": "5.14.169.145:59636"
}

# request 3
{
  "origin": "73.234.147.78:41632"
}

However, you should know that proxies alone are not always enough. Advanced anti-bot systems employ evolving techniques that can still flag your scraper and block your requests.

In such cases, use the ZenRows web scraping API. It's included in the same subscription plan as the premium proxy service.

This web scraping API provides everything you need to bypass any anti-bot system, regardless of complexity. Therefore, if the previous approach fails, the ZenRows web scraping API will definitely bypass Cloudflare and can completely replace Cloudscraper.
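
As a rough sketch of what that switch looks like, the snippet below calls the ZenRows API with plain Requests instead of Cloudscraper. The apikey and url query parameters follow the pattern shown in the ZenRows dashboard; the target URL and API key are placeholders, and you should check your dashboard for the exact parameters (such as JS rendering or premium proxies) your target site needs.

Example
# pip3 install requests
import requests

# placeholder: the protected page you want to scrape
target_url = 'https://www.example.com/'

# route the request through the ZenRows web scraping API
response = requests.get(
    'https://api.zenrows.com/v1/',
    params={
        'apikey': '<YOUR_ZENROWS_API_KEY>',  # placeholder for your API key
        'url': target_url                    # page to scrape
    }
)

print(response.status_code)
print(response.text[:500])  # first 500 characters of the returned HTML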

Conclusion

Configuring premium proxies can increase your chances of avoiding detection and IP-based restrictions. To get started, you can choose the right fit for your project from this list of the best residential proxies.

However, proxies are not foolproof and can fail against sophisticated anti-bot protection. The only surefire way to never get blocked by Cloudflare is to use a web scraping API, such as ZenRows.
