How to Use a Proxy with Scrapy in 2025

Rubén del Campo
Updated: February 26, 2025 · 4 min read

Have you ever been blocked from scraping your desired content due to an IP ban or geo-restrictions? There's a solution. Proxies let you mask your original IP, rotate addresses to avoid detection, and access geo-restricted data.

In this step-by-step tutorial, you'll learn everything you need to know about setting up a Scrapy proxy to improve your web scraping success.

How to Set Up a Proxy With Scrapy

You can set up a proxy in Scrapy by adding a meta parameter to your request on the fly or using a custom middleware. Let's explore both options.

To see how each method works, you'll grab free proxies from the Free Proxy List website and confirm they work by requesting HTTPBin's IP endpoint. Keep in mind that free proxies are short-lived, so grab fresh ones if the examples below fail.

Now, let's begin with the meta parameter option.

Method 1: Add a Meta Parameter

This method involves passing your proxy address as a meta parameter in the scrapy.Request() method.

Once you have your proxy address and port number, pass them into your Scrapy request, as shown in the following code block.

Example
yield scrapy.Request(
    url=url,
    callback=self.parse,
    meta={"proxy": "http://66.191.31.158:80"},
)

Here's what the full spider looks like after the update:

scraper.py
# pip3 install scrapy
import scrapy

class ScraperSpider(scrapy.Spider):
    name = "scraper"
    allowed_domains = ["httpbin.io"]
    start_urls = ["https://httpbin.io/ip"]

    def start_requests(self):
        for url in self.start_urls:
            yield scrapy.Request(
                url=url,
                callback=self.parse,
                meta={"proxy": "http://66.191.31.158:80"},
            )

    def parse(self, response):
        print(response.text)

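To test it, run the spider from your project's root folder (this assumes a standard project created with scrapy startproject, with the spider saved inside it):

Terminal
scrapy crawl scraper
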
The spider returns the proxy's IP address, as shown:

Output
{
  "origin": "66.191.31.158:38710"
}

Nice! You've successfully integrated a proxy in Scrapy. Next, you'll see how the custom middleware option works.


Method 2: Create a Custom Middleware

Scrapy middleware is an intermediary layer that processes, modifies, and filters requests before they reach the downloader and responses before they reach the spider. The middleware option is an excellent way to manage proxies across multiple spiders, as you can change proxy settings in one place without modifying your spider code.

Open the middlewares.py file and create a new CustomProxyMiddleware class.

The class uses the process_request method to check whether a request already has a proxy in its meta attribute. If not, it assigns one via the get_proxy method, which reads the proxy address (PROXY_ADDRESS) from settings.py. If PROXY_ADDRESS isn't defined there, settings.get() returns None, and Scrapy simply sends the request without a proxy:

middlewares.py
class CustomProxyMiddleware(object):
    def __init__(self):
        self.proxy = None

    def process_request(self, request, spider):
        # only assign a proxy if the request doesn't already have one
        if "proxy" not in request.meta:
            request.meta["proxy"] = self.get_proxy(spider.crawler)

    def get_proxy(self, crawler):
        # read the proxy address from settings.py (None if not defined)
        self.proxy = crawler.settings.get("PROXY_ADDRESS")
        return self.proxy

Go to settings.py and add your middleware to the DOWNLOADER_MIDDLEWARES dictionary. The priority of 350 runs it before Scrapy's built-in HttpProxyMiddleware (priority 750), which then processes the proxy URL you set. Then, include a PROXY_ADDRESS variable and assign your proxy address to it:

settings.py
# ...

DOWNLOADER_MIDDLEWARES = {
    "myproject.middlewares.CustomProxyMiddleware": 350,
    # ...,
}

# include your proxy address
PROXY_ADDRESS = "http://66.191.31.158:80"

You can also add the middleware at the spider level using custom settings, which override the project-level values in settings.py:

scraper.py
class ScraperSpider(scrapy.Spider):
    # ...

    custom_settings = {
        "DOWNLOADER_MIDDLEWARES": {
            "myproject.middlewares.CustomProxyMiddleware": 350,
        },
        # include your proxy address
        "PROXY_ADDRESS": "http://66.191.31.158:80",
    }

You can now run the spider without the meta parameter. The modified spider looks like this:

scraper.py
# pip3 install scrapy
import scrapy

class ScraperSpider(scrapy.Spider):
    name = "scraper"
    allowed_domains = ["httpbin.io"]
    start_urls = ["https://httpbin.io/ip"]

    def start_requests(self):
        for url in self.start_urls:
            yield scrapy.Request(
                url=url,
                callback=self.parse,
            )

    def parse(self, response):
        print(response.text)

You'll see that the code returns the IP address of the proxy you set inside settings.py:

Output
{
  "origin": "66.191.31.158:38710"
}

That works! You've mastered how to create a custom proxy middleware in Scrapy.

Both the meta and middleware approaches yield the same results. The rest of this tutorial uses the meta parameter technique, which is more straightforward.

Proxy Authentication in Scrapy

Paid proxies often require authentication credentials, such as a username and password, which are usually part of the proxy address.

An authenticated proxy address takes the following format:

Example
<PROXY_PROTOCOL>://<YOUR_USERNAME>:<YOUR_PASSWORD>@<PROXY_ADDRESS>:<PROXY_PORT>

Adding an authenticated proxy follows the same approach as an unauthenticated one. You only need to pass the proxy address containing your credentials as a meta parameter, as shown:

scraper.py
# pip3 install scrapy
import scrapy

class ScraperSpider(scrapy.Spider):
    name = "scraper"
    allowed_domains = ["httpbin.io"]
    start_urls = ["https://httpbin.io/ip"]

    def start_requests(self):
        for url in self.start_urls:
            yield scrapy.Request(
                url=url,
                callback=self.parse,
                meta={
                    "proxy": "http://<YOUR_USERNAME>:<YOUR_PASSWORD>@<PROXY_IP_ADDRESS>:<PROXY_PORT>"
                },
            )

    def parse(self, response):
        print(response.text)

That was easy! Your Scrapy spider can now use an authenticated proxy.
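
Alternatively, you can keep credentials out of the proxy URL and send them as a Proxy-Authorization header instead, using the basic_auth_header() helper from w3lib (a Scrapy dependency). Here's a minimal sketch assuming hypothetical PROXY_USERNAME and PROXY_PASSWORD environment variables:

scraper.py
# pip3 install scrapy w3lib
import os

import scrapy
from w3lib.http import basic_auth_header

class AuthProxySpider(scrapy.Spider):
    name = "auth_proxy"
    start_urls = ["https://httpbin.io/ip"]

    def start_requests(self):
        # hypothetical environment variables; set them to your provider's credentials
        username = os.environ["PROXY_USERNAME"]
        password = os.environ["PROXY_PASSWORD"]
        for url in self.start_urls:
            yield scrapy.Request(
                url=url,
                callback=self.parse,
                # proxy URL without embedded credentials
                meta={"proxy": "http://<PROXY_IP_ADDRESS>:<PROXY_PORT>"},
                # send the credentials as a Basic auth header instead
                headers={"Proxy-Authorization": basic_auth_header(username, password)},
            )

    def parse(self, response):
        print(response.text)

In recent Scrapy versions (2.6+), the built-in HttpProxyMiddleware keeps a manually set Proxy-Authorization header as long as the proxy in meta stays the same between requests.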

How to Use Rotating Proxies With Scrapy

Websites often flag excessive requests from a single IP address as suspicious and may block or ban it. So, you can still get blocked if you stick to a single proxy, especially when sending many requests.

You can avoid IP bans by distributing traffic over several IPs using proxy rotation. Proxy rotation lets you change your IP address per request, so the website treats each request as a different user.

This technique is handy for bypassing detection methods like Cloudflare's rate limiting during large-scale scraping.
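
For context, you could rotate proxies yourself with a small custom middleware. The following minimal sketch assumes a hypothetical PROXY_LIST variable in settings.py and picks a random proxy per request; it doesn't detect dead proxies or bans, which is why a dedicated package is usually the better choice:

middlewares.py
import random

class SimpleRotatingProxyMiddleware(object):
    # DIY sketch: assign a random proxy from PROXY_LIST to each request
    def process_request(self, request, spider):
        # PROXY_LIST is a hypothetical settings.py variable holding proxy URLs
        proxies = spider.crawler.settings.getlist("PROXY_LIST")
        if proxies and "proxy" not in request.meta:
            request.meta["proxy"] = random.choice(proxies)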

In practice, though, you'll use the scrapy-rotating-proxies third-party middleware, which handles rotation for you. First, install the package using pip:

Terminal
pip3 install scrapy-rotating-proxies 

Grab more free proxies from the previous website (Free Proxy List). Create a new rotating_proxy_list.txt file in your project root folder (at the same level as scrapy.cfg) and list your proxy addresses in that file:

rotating_proxy_list.txt
http://23.247.137.142:80
http://91.92.155.207:3128
http://8.215.108.194:7777
http://34.199.10.221:8081
# ...

Enable the middleware by adding it to the DOWNLOADER_MIDDLEWARES setting in the settings.py file. Since this middleware handles both single and multiple proxies, you can replace the previous custom middleware with it. Finally, specify the path to rotating_proxy_list.txt:

settings.py
# ...

DOWNLOADER_MIDDLEWARES = {
    "rotating_proxies.middlewares.RotatingProxyMiddleware": 350,
    # ...
}

# specify the rotating proxy list path
ROTATING_PROXY_LIST_PATH = "<PATH_TO>/rotating_proxy_list.txt"
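
If you'd rather not manage a separate file, the package also accepts the list inline via the ROTATING_PROXY_LIST setting. Its documentation additionally recommends enabling the companion BanDetectionMiddleware, which retires dead or banned proxies automatically. A sketch of that setup:

settings.py
# alternative: define the proxy list inline instead of via a file
ROTATING_PROXY_LIST = [
    "http://23.247.137.142:80",
    "http://91.92.155.207:3128",
    # ...
]

# priorities recommended by the package's documentation
DOWNLOADER_MIDDLEWARES = {
    "rotating_proxies.middlewares.RotatingProxyMiddleware": 610,
    "rotating_proxies.middlewares.BanDetectionMiddleware": 620,
    # ...
}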

Now, send a request to HTTPBin without the meta parameter:

scraper.py
# pip3 install scrapy scrapy-rotating-proxies
import scrapy

class ScraperSpider(scrapy.Spider):
    name = "scraper"
    allowed_domains = ["httpbin.io"]
    start_urls = ["https://httpbin.io/ip"]

    def start_requests(self):
        for url in self.start_urls:
            yield scrapy.Request(
                url=url,
                callback=self.parse,
            )

    def parse(self, response):
        print(response.text)

The spider will now use random proxies from rotating_proxy_list.txt. Here's a sample result from three consecutive runs:

Output
# request 1
{
  "origin": "91.92.155.207:77628"
}

# request 2
{
  "origin": "34.199.10.221:36563"
}

# request 3
{
  "origin": "23.247.137.142:45731"
}

Congratulations! You now know how to rotate proxies in Scrapy to avoid getting blocked while scraping.
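
scrapy-rotating-proxies also lets you customize how bans are detected via the ROTATING_PROXY_BAN_POLICY setting. Here's a minimal sketch of a custom policy (MyBanPolicy is a hypothetical name) that additionally treats HTTP 403 responses as bans:

middlewares.py
from rotating_proxies.policy import BanDetectionPolicy

class MyBanPolicy(BanDetectionPolicy):
    # extend the default ban detection: also treat HTTP 403 responses as bans
    def response_is_ban(self, request, response):
        return super().response_is_ban(request, response) or response.status == 403

Then point the setting at it in settings.py: ROTATING_PROXY_BAN_POLICY = "myproject.middlewares.MyBanPolicy".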

However, free proxies are only suitable for testing and not for real-life projects. Premium proxies, on the other hand, offer high uptime, and most services have an automatic proxy rotation feature.

Premium Proxy to Avoid Getting Blocked

Free proxies come with significant drawbacks, such as rapid blocking, unstable performance, low IP reputation, and security concerns, making them unsuitable for professional web scraping operations.

A premium proxy solution delivers reliable protection against blocking and detection. Using services with automated IP rotation and geo-targeting capabilities can dramatically improve your scraping effectiveness.

ZenRows' Residential Proxies is a premium proxy service offering a residential network of 55M+ IPs spanning 185+ countries. It's an excellent solution to power reliable scraping operations, with features like dynamic IP rotation, intelligent proxy selection, geo-targeting, enterprise-grade uptime, and more.

Let's integrate ZenRows' Residential Proxies with Scrapy.

First, sign up for an account and access the Proxy Generator dashboard.

[Image: generating residential proxies in the ZenRows dashboard]

This will provide your essential credentials (username, password) and proxy server details (proxy domain and proxy port). Replace the placeholders with your generated credentials:

scraper.py
import scrapy

class ScraperSpider(scrapy.Spider):
    name = "scraper"

    start_urls = ["https://httpbin.io/ip"]

    def start_requests(self):
        for url in self.start_urls:
            yield scrapy.Request(
                url=url,
                callback=self.parse,
                meta={"proxy": "http://<ZENROWS_PROXY_USERNAME>:<ZENROWS_PROXY_PASSWORD>@superproxy.zenrows.com:1337"},
            )

    def parse(self, response):
        self.logger.info(f"{response.text}")

When you run this spider multiple times, you'll see output similar to this:

Output
# request 1
{
  "origin": "69.244.164.205:37774"
}

# request 2
{
    "origin": "77.242.86.124:53890"
}

Excellent! The results show your Scrapy requests are being routed through the ZenRows proxy network.

Each request shows a different IP address, confirming that automatic IP rotation is working. Your spider is now backed by high-quality proxies that significantly reduce the risk of detection.

Conclusion

For any data extraction project, you'll need to get around detection mechanisms, and a Scrapy proxy plays a key role. By routing your requests through it, you can hide your IP address and avoid getting blocked.

Now you know how to set proxies in Scrapy. However, since free proxies are often unreliable, consider a premium solution like ZenRows. Try ZenRows for free!
