Do you want to mask your requests with a proxy to avoid detection and IP bans while scraping with Splash?
This tutorial covers the three main methods of configuring a proxy in Splash, whether you're using Splash independently or pairing it with Scrapy:
- Option 1: Set a Splash request argument.
- Option 2: Set a proxy using a Lua script with Splash.
- Option 3: Use proxy profiles.
How to Set Your Proxy With Splash
Splash is a JavaScript rendering service with a dedicated HTTP API for executing Lua scripts during web scraping. You can call it from any programming language via an HTTP request and run JavaScript from inside Lua to extract dynamic content.
Proxy setup in Splash depends on the use case and can be divided into the following categories:
- For Scrapy integration.
- For independent use with Lua (two methods).
In this section, you'll learn three ways of setting up a proxy in Splash. In each case, you'll request https://httpbin.io/ip, a website that returns your current IP address.
You'll use free proxies from the Free Proxy List. These free proxies are only suitable for learning and may not work at the time of reading due to their short lifespan. Feel free to exchange them for new ones from the list.
Prerequisites: If You Don't Have a Running Splash Server
If you don't have a running Splash server, set one up before proceeding to the proxy setup.
Ensure you've installed the latest version of Docker on your machine. Then, pull the Splash image with the following command:
docker pull scrapinghub/splash
Include the sudo command if you're on Linux:
sudo docker pull scrapinghub/splash
Once the image is pulled, run the Docker image on a specific port:
docker run -it -p 8050:8050 --rm scrapinghub/splash
If you're on Linux:
sudo docker run -it -p 8050:8050 --rm scrapinghub/splash
The command above will start the Splash server at `http://localhost:8050`. You're now ready to set up your proxy.
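Optionally, you can confirm the server responds before configuring a proxy. Here's a minimal check using Python's Requests library; it assumes the default port 8050 and uses Splash's render.html endpoint:
# pip install requests
import requests

# ask the local Splash server to render a simple page
response = requests.get(
    "http://localhost:8050/render.html",
    params={"url": "https://httpbin.io/ip", "wait": 0.5},
)

# a 200 status code means Splash is up and rendering pages
print(response.status_code)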
Option 1: Set a Splash Request Argument
The Splash request argument is the best option if you're using Scrapy with Splash. It involves passing the proxy address as an argument to the SplashRequest instance.
First, ensure you install Scrapy Splash using pip:
pip install scrapy-splash
Initialize a Scrapy project if you've not done so already:
scrapy startproject scraper
Then, configure your Scrapy project to use the Splash server if you haven't already. To do that, paste the following code into your Scrapy settings file:
# set the Splash local server endpoint
SPLASH_URL = "http://localhost:8050"

# enable the Splash downloader middleware and
# give it a higher priority than HttpCompressionMiddleware
DOWNLOADER_MIDDLEWARES = {
    "scrapy_splash.SplashCookiesMiddleware": 723,
    "scrapy_splash.SplashMiddleware": 725,
    "scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware": 810,
}

# enable the Splash deduplication argument filter to
# make Scrapy Splash save disk space on cached requests
SPIDER_MIDDLEWARES = {
    "scrapy_splash.SplashDeduplicateArgsMiddleware": 100,
}

# set the Splash deduplication class
DUPEFILTER_CLASS = "scrapy_splash.SplashAwareDupeFilter"
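Optionally, if your project also enables Scrapy's HTTP cache, scrapy-splash provides a Splash-aware cache storage backend you can add to the same settings file:
# use a Splash-aware cache storage backend with Scrapy's HTTP cache (optional)
HTTPCACHE_STORAGE = "scrapy_splash.SplashAwareFSCacheStorage"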
Next, import Scrapy Splash into your spider file and point the scraper class to the target URL.
# import the required libraries
import scrapy
from scrapy_splash import SplashRequest
class Scraper(scrapy.Spider):
    name = "scraper"

    # point to the target URL
    allowed_domains = ["httpbin.io"]
    start_urls = ["https://httpbin.io/ip"]
Extend that class with a Lua script inside a multi-line string. The script visits the target page and returns its HTML:
class Scraper(scrapy.Spider):
    # ...

    # add a Lua script to access the website and print its HTML
    lua_script = """
    function main(splash, args)
        assert(splash:go(args.url))
        assert(splash:wait(0.5))
        return {
            html = splash:html()
        }
    end
    """
Now, initiate a request to the target URL with the SplashRequest object. Point to the Lua script and include the proxy address in the args dictionary:
class Scraper(scrapy.Spider):
    # ...

    def start_requests(self):
        # launch a Splash request and specify the proxy address inside the args
        for url in self.start_urls:
            yield SplashRequest(
                url,
                self.parse,
                endpoint="execute",
                args={
                    "wait": 0.5,
                    "lua_source": self.lua_script,
                    "proxy": "http://189.240.60.171:9090"
                },
                cache_args=["lua_source"]
            )
Finally, decode and log the HTML result from the Lua script inside the parse method:
class Scraper(scrapy.Spider):
    # ...

    def parse(self, response):
        # get the HTML result from Lua
        splash_result = response.body
        # log the HTML result to view the current IP address
        self.logger.info("Splash Result: %s", splash_result.decode("utf-8"))
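If you'd rather log just the IP address than the whole body, you could also pull it out with a regular expression. This is an optional sketch, not part of the original spider; the regex simply matches the origin field that https://httpbin.io/ip embeds in the rendered page:
import re  # add this import at the top of the spider file

class Scraper(scrapy.Spider):
    # ...

    def parse(self, response):
        splash_result = response.body.decode("utf-8")
        # look for the "origin" field returned by https://httpbin.io/ip
        match = re.search(r'"origin":\s*"([^"]+)"', splash_result)
        if match:
            self.logger.info("Proxy IP: %s", match.group(1))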
Here's what your full code looks like after combining the snippets:
# import the required libraries
import scrapy
from scrapy_splash import SplashRequest

class Scraper(scrapy.Spider):
    name = "scraper"

    # point to the target URL
    allowed_domains = ["httpbin.io"]
    start_urls = ["https://httpbin.io/ip"]

    # add a Lua script to access the website and print its HTML
    lua_script = """
    function main(splash, args)
        assert(splash:go(args.url))
        assert(splash:wait(0.5))
        return {
            html = splash:html()
        }
    end
    """

    def start_requests(self):
        # launch a Splash request and specify the proxy address inside the args
        for url in self.start_urls:
            yield SplashRequest(
                url,
                self.parse,
                endpoint="execute",
                args={
                    "wait": 0.5,
                    "lua_source": self.lua_script,
                    "proxy": "http://189.240.60.171:9090"
                },
                cache_args=["lua_source"]
            )

    def parse(self, response):
        # get the HTML result from Lua
        splash_result = response.body
        # log the HTML result to view the current IP address
        self.logger.info("Splash Result: %s", splash_result.decode("utf-8"))
Run your spider with the crawl command:
scrapy crawl scraper
Running the code twice outputs similar IP addresses (with different ports) from the specified proxy:
{
"origin": "189.240.60.168:9882"
}
{
"origin": "189.240.60.168:3718"
}
You've just implemented a proxy using Splash in your Scrapy web scraper. High five!
Let's go through the other proxy setup options.
Option 2: Set a Proxy Using a Lua Script With Splash
Setting up a proxy inside the Lua script is a good method if you're using Splash independently and want to integrate its API into other programming languages like Python and JavaScript.
Splash operates a dedicated server that can execute the Lua script for advanced proxy configuration and JavaScript rendering tasks.
For this tutorial, you'll run the Splash server locally on your machine and use it to execute Lua with Python's Requests library.
First, install the Requests library if you haven't already:
pip install requests
Next, import the Requests library and write your Lua script inside a multi-line string. The script contains a function that first sets up the proxy address, then visits the target URL (https://httpbin.io/ip) and returns its HTML content:
# import the required library
import requests

# develop your Lua script
lua_script = """
function main(splash, args)
    -- set up proxy
    splash:on_request(function(request)
        request:set_proxy{
            type = "HTTP",
            host = "189.240.60.171",
            port = 9090,
        }
    end)

    -- visit the target URL
    assert(splash:go(args.url))
    assert(splash:wait(0.5))

    -- print the HTML content
    return {
        html = splash:html(),
    }
end
"""
Send a POST request to the Splash execute endpoint, specifying the target URL and the Lua script variable in the request body. Then, print the JSON response to view the website's HTML:
# ...
response = requests.post(
    # specify the Splash server API endpoint
    "http://localhost:8050/execute",
    # define the request body
    json={
        "lua_source": lua_script,
        "url": "https://httpbin.io/ip",
        "timeout": 60,
    }
)

# get the response
print(response.json())
Combine both snippets. Your final code should look like this:
# import the required library
import requests

# develop your Lua script
lua_script = """
function main(splash, args)
    -- set up proxy
    splash:on_request(function(request)
        request:set_proxy{
            type = "HTTP",
            host = "189.240.60.171",
            port = 9090,
        }
    end)

    -- visit the target URL
    assert(splash:go(args.url))
    assert(splash:wait(0.5))

    -- print the HTML content
    return {
        html = splash:html(),
    }
end
"""

response = requests.post(
    # specify the Splash server API endpoint
    "http://localhost:8050/execute",
    # define the request body
    json={
        "lua_source": lua_script,
        "url": "https://httpbin.io/ip",
        "timeout": 60,
    }
)

# get the response
print(response.json())
The specified proxy returns the following IP addresses (different ports) for two manual requests:
"origin": "189.240.60.168:24900"
"origin": "189.240.60.168:14962"
You now know how to add a proxy directly in Splash using the Lua script. But there's one more scalable way to achieve this. Let's take a look.
Option 3: Use Proxy Profiles
The proxy profiles option works best if you need to share one proxy between several scraping scripts or projects. To use it, you'll have to expose the folder containing your proxy profiles to the Splash API.
First, create a new folder inside your project directory and give it a descriptive name (let's name it "proxy-profile"). Create a profile.ini file inside this folder and configure it with your proxy details, as shown:
[proxy]
host=189.240.60.171
port=9090
type=HTTP
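Proxy profiles can also carry credentials. If your proxy requires authentication, add username and password keys to the same [proxy] section (the values below are placeholders):
[proxy]
host=189.240.60.171
port=9090
type=HTTP
username=<YOUR_PROXY_USERNAME>
password=<YOUR_PROXY_PASSWORD>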
The next step is to mount this local proxy profile folder into the default directory the Splash server reads proxy profiles from:
/etc/splash/proxy-profiles
Stop the running Splash container in Docker. Then, restart it with the following command, replacing <path_to_your_proxy_profile> with the full path to your proxy profile folder:
docker run -p 8050:8050 -v <path_to_your_proxy_profile>:/etc/splash/proxy-profiles scrapinghub/splash
For example, assume you've saved your profile.ini file inside D:/scraper/proxy-profile. Include that path in your Docker run command like this:
docker run -p 8050:8050 -v D:/scraper/proxy-profile:/etc/splash/proxy-profiles scrapinghub/splash
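On Linux or macOS, use a Unix-style path instead (the path below is just an example), and add sudo on Linux if needed:
sudo docker run -p 8050:8050 -v /home/user/scraper/proxy-profile:/etc/splash/proxy-profiles scrapinghub/splash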
You've now started the Splash server with your proxy profile mounted. Awesome!
Now, let's create the Lua script to test this integration. Open your Python file and write Lua code to visit the target website (https://httpbin.io/ip) and get its HTML content:
# import the required libraries
import requests

# develop your Lua script
lua_script = """
function main(splash, args)
    -- visit the target URL
    assert(splash:go(args.url))
    assert(splash:wait(1.0))

    -- print the HTML content
    return {
        html = splash:html(),
    }
end
"""
Send a POST request to the Splash server API and reference the proxy profile name inside the request body. Ensure you use profile without the .ini extension. Then, print the JSON result to show the extracted HTML content:
# ...
response = requests.post(
    # specify the Splash server API endpoint
    "http://localhost:8050/execute",
    # define the request body and specify the proxy profile
    json={
        "lua_source": lua_script,
        "url": "https://httpbin.io/ip",
        "timeout": 60,
        "proxy": "profile",
    }
)

# get the response
print(response.json())
Your full code should look like this after combining the snippets:
# import the required libraries
import requests

# develop your Lua script
lua_script = """
function main(splash, args)
    -- visit the target URL
    assert(splash:go(args.url))
    assert(splash:wait(1.0))

    -- print the HTML content
    return {
        html = splash:html(),
    }
end
"""

response = requests.post(
    # specify the Splash server API endpoint
    "http://localhost:8050/execute",
    # define the request body and specify the proxy profile
    json={
        "lua_source": lua_script,
        "url": "https://httpbin.io/ip",
        "timeout": 60,
        "proxy": "profile",
    }
)

# get the response
print(response.json())
Running the code twice returns the following IP addresses from the proxy:
"origin": "189.240.60.168:4769"
"origin": "189.240.60.168:17136"
That's it! Your Splash web scraper now uses the specified proxy profiles.
In the examples above, you've used free proxies. But for real projects, you'll need a premium web scraping proxy with authentication credentials.
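Premium proxies typically require credentials, and Splash's proxy argument accepts them in the standard user:password@host:port format. As a rough sketch, the SplashRequest args from Option 1 would then point to something like this (username, password, host, and port are placeholders):
# an authenticated premium proxy passed through the Splash "proxy" argument
# (username, password, host, and port are placeholders)
proxy_url = "http://<YOUR_PROXY_USERNAME>:<YOUR_PROXY_PASSWORD>@<PROXY_HOST>:<PROXY_PORT>"
# in the SplashRequest args from Option 1: "proxy": proxy_url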
Get the Best Premium Proxies to Scrape
Free proxies have a low success rate due to frequent downtime. What's more, most anti-bot measures can detect them during web scraping, which means you'll get blocked even after setting up a proxy with Splash.
For example, a protected website like the G2 Reviews page will block your request even if you set up a proxy with Scrapy Splash. Try accessing it by replacing the target URL with G2 in your spider file, as shown below:
# import the required libraries
import scrapy
from scrapy_splash import SplashRequest

class Scraper(scrapy.Spider):
    name = "scraper"

    # point to the target URL
    allowed_domains = ["g2.com"]
    start_urls = ["https://www.g2.com/products/asana/reviews"]

    # add a Lua script to access the website and print its HTML
    lua_script = """
    function main(splash, args)
        assert(splash:go(args.url))
        assert(splash:wait(0.5))
        return {
            html = splash:html()
        }
    end
    """

    def start_requests(self):
        # launch a Splash request and specify the proxy address inside the args
        for url in self.start_urls:
            yield SplashRequest(
                url,
                self.parse,
                endpoint="execute",
                args={
                    "wait": 0.5,
                    "lua_source": self.lua_script,
                    "proxy": "http://189.240.60.171:9090"
                },
                cache_args=["lua_source"])

    def parse(self, response):
        # get the HTML result from Lua
        splash_result = response.body
        # log the HTML result to view the current IP address
        self.logger.info("Splash Result: %s", splash_result.decode("utf-8"))
The spider returns a Scrapy 403 error, indicating that an anti-bot has blocked it:
Crawled (403) <GET https://www.g2.com/products/asana/reviews
Try opening the website in a regular browser, and you'll see that it uses Cloudflare Turnstile to prevent bot activity.
You can use premium residential proxies to avoid getting blocked. They can help you avoid basic detection mechanisms such as IP bans, but they're unlikely to bypass advanced anti-bot systems like Akamai, Cloudflare, or DataDome, which use sophisticated detection mechanisms beyond IP checks.
The best solution is to use a web scraping API like ZenRows, which helps you auto-rotate premium proxies, optimize your request headers, and bypass CAPTCHAs and other advanced anti-bot systems. ZenRows also works as a headless browser featuring JavaScript instructions to extract dynamically loaded content.
Let's try to use ZenRows and the Requests library to access the same G2 Reviews page.
Sign up to open the Request Builder. Paste the target URL in the link box, toggle the Boost mode to JS Rendering, and activate Premium proxies. Select Python as your chosen language and set the request type as API. Then, copy and paste the generated code into your scraper file.
The code should look like this in your Python file:
# pip install requests
import requests

params = {
    "url": "https://www.g2.com/products/asana/reviews",
    "apikey": "<YOUR_ZENROWS_API_KEY>",
    "js_render": "true",
    "premium_proxy": "true",
}
response = requests.get("https://api.zenrows.com/v1/", params=params)
print(response.text)
The code above bypasses the anti-bot and extracts the page's HTML content:
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8" />
<link href="https://www.g2.com/images/favicon.ico" rel="shortcut icon" type="image/x-icon" />
<title>Asana Reviews 2024</title>
</head>
<body>
<!-- other content omitted for brevity -->
</body>
</html>
You just scraped a protected website successfully with ZenRows. Congratulations!
Conclusion
In this article, you've learned how to configure your Splash web scraper to use a proxy in three ways:
- Setting up a proxy with Scrapy Splash using the Splash request method.
- Adding a proxy directly to the Lua script in Splash.
- Creating a proxy profile and running the Splash server image with the profile.
Proxies offer a fair level of anti-bot bypass capability. Still, they may prove too weak against advanced detection mechanisms. The only foolproof solution is going for a web scraping API such as ZenRows. This way, you’ll bypass any bot detection system and save yourself the trouble of finding and configuring proxies.