Managing multiple Scrapy projects and spiders from the command line can be challenging, especially once you've deployed them across Scrapyd servers. ScrapydWeb solves this challenge by providing a web interface for managing Scrapy projects through Scrapyd's API endpoints.
In this tutorial, you'll learn how to manage Scrapy projects over Scrapyd clusters using the ScrapydWeb user interface.
- What is ScrapydWeb and why use it?
- Key features of ScrapydWeb.
- Setting up ScrapydWeb.
- Other ScrapydWeb management features.
- Web scraping with ZenRows.
What Is ScrapydWeb and Why Use It?
ScrapydWeb provides a web-based interface for managing Scrapyd clusters: groups of servers used to deploy and run Scrapy projects. Since it reads scraping task information directly from Scrapyd, ScrapydWeb requires a running Scrapyd server to monitor and control jobs.
ScrapydWeb supports real-time monitoring and multi-server management, and lets you view scraping job statistics. Its UI wraps Scrapyd's spider management API endpoints, so you can run, schedule, cancel, and even delete scraping jobs from the ScrapydWeb interface.
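Under the hood, every action in the ScrapydWeb UI maps to one of Scrapyd's HTTP endpoints. As a rough illustration (not part of ScrapydWeb itself), here's a minimal Python sketch that reads the same kind of information ScrapydWeb displays, assuming a Scrapyd server running on localhost:6800 and the third-party requests package installed:
# pip3 install requests -- assumes a Scrapyd server is running on localhost:6800
import requests

SCRAPYD = "http://localhost:6800"

# daemonstatus.json reports pending, running, and finished job counts
status = requests.get(f"{SCRAPYD}/daemonstatus.json").json()
print(status)

# listjobs.json returns per-project job details, the kind of data ScrapydWeb's job pages show
jobs = requests.get(
    f"{SCRAPYD}/listjobs.json",
    params={"project": "scraper"},  # replace with your deployed project name
).json()
print(jobs.get("running", []))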
Key Features of ScrapydWeb
The major features of ScrapydWeb include:
- Scheduled scraping jobs: It lets you schedule your spiders to run at a specific time and frequency.
- Multi-node Scrapyd cluster management: You can manage multiple Scrapyd servers within a single ScrapydWeb user interface.
- Mobile mode: ScrapydWeb lets you manage Scrapyd servers directly from your mobile device.
- Task monitoring: You can monitor the Scrapyd server status and real-time job performance from a single interface.
- Detailed job statistics and logs: ScrapydWeb lets you access detailed statistics, logs, and progress visualizations at the server, project, and task levels.
- Alerts: It sends job status, log, and statistics alerts via email, Slack, or Telegram.
Setting Up ScrapydWeb
Let's now see how to set up ScrapydWeb, including the installation, configuration, and deployment steps.
Step 1: Installing Scrapyd and ScrapydWeb
ScrapydWeb requires Python 3+ to work smoothly. We recommend installing Python's latest version from the official download page before you begin.
You'll also need to install Scrapyd, Scrapyd-Client, and ScrapydWeb. Keep in mind that Scrapyd-Client is a command-line interface (CLI) tool that lets you communicate with the Scrapyd API.
Install these packages using `pip`:
pip3 install scrapyd scrapyd-client scrapydweb
All done? You'll go through the deployment process in the next section.
Step 2: Deploy Scrapy Project to Scrapyd
Next, start the Scrapyd server with the following command:
scrapyd
The above command starts a Scrapyd server on port `6800` by default:
Site starting on 6800
Next, connect your Scrapy project with the Scrapyd server.
Go to the `scrapy.cfg` file in your Scrapy project's root folder and replace its content with the following configuration. It sets the deployment target to localhost and points to the running Scrapyd server's URL so the project deploys to the correct node:
[settings]
default = scraper.settings
[deploy:local]
url = http://localhost:6800/
project = scraper
Next, deploy the project using Scrapyd-Client by running the following command. Here, `<target_name>` is the deployment target defined in `scrapy.cfg` (`local`, in this case), and `<your_project_name>` is your Scrapy project name (e.g., `scraper`):
scrapyd-deploy <target_name> -p <your_project_name>
For instance, the following command deploys the Scrapy project (`scraper`) to the locally running Scrapyd node:
scrapyd-deploy local -p scraper
After deployment, visit the running Scrapyd node at `http://localhost:6800/` in your browser, and you'll see the deployed `scraper` project listed under "Available projects":

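If you'd rather confirm the deployment from a script than the browser, the short sketch below queries Scrapyd's listprojects.json and listspiders.json endpoints. It assumes the third-party requests package (an extra install, not covered above) and the local server started earlier:
# pip3 install requests -- quick programmatic check of the local Scrapyd node
import requests

SCRAPYD = "http://localhost:6800"

projects = requests.get(f"{SCRAPYD}/listprojects.json").json()
print(projects["projects"])  # should include "scraper"

spiders = requests.get(f"{SCRAPYD}/listspiders.json", params={"project": "scraper"}).json()
print(spiders["spiders"])  # the spiders available in the deployed project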
You're now ready to monitor the deployed Scrapy project via ScrapydWeb. You'll do that in the next section.
Step 3: Manage Scrapyd Servers via ScrapydWeb
Start the ScrapydWeb server by running the following command:
scrapydweb
The above command creates a `scrapydweb_settings_v11.py` file in your project root.
Open the generated `scrapydweb_settings_v11.py` file and scroll to the `SCRAPYD_SERVERS` list. It lists the servers ScrapydWeb manages, including a default example entry with authentication credentials. Comment out or remove that authenticated entry, leaving only the required `localhost:6800` server:
# ...
SCRAPYD_SERVERS = [
    "127.0.0.1:6800",
    # ... comment out or remove the default authenticated entry
]
# ...
The advantage of this file is that you can add more Scrapyd servers to the `SCRAPYD_SERVERS` list as you scale to other nodes. More on that later. The authenticated entry we removed is only needed when you're running a Scrapyd server behind authentication, which isn't applicable here.
Rerun the `scrapydweb` command to start the ScrapydWeb server. This command starts a ScrapydWeb daemon that defaults to `http://localhost:5000/`.
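By default, ScrapydWeb serves the UI on port 5000 without authentication. If you need a different bind address or port, or want a login prompt in front of the dashboard, the same generated settings file exposes options along the following lines (option names can vary slightly between ScrapydWeb versions, so verify them against the comments in your own file):
# scrapydweb_settings_v11.py (excerpt; adjust to the options present in your version)
SCRAPYDWEB_BIND = "0.0.0.0"  # listen on all interfaces
SCRAPYDWEB_PORT = 5000       # move the UI off port 5000 if needed

ENABLE_AUTH = True           # require a login for the web UI
USERNAME = "admin"
PASSWORD = "change-me"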
Visit that URL via your browser, and you'll get the following interface, listing your running Scrapyd server by default:

Bravo! You've now connected your Scrapyd server with the ScrapydWeb management interface.
If you start a second Scrapyd server, it will also appear on the Scrapyd server table.
Managing Multiple Scrapyd Servers on ScrapydWeb
Suppose you want to manage another Scrapy project called `product_scraper` on a different Scrapyd server. You'll need to start a separate Scrapyd daemon for it on another port.
To start another Scrapyd server on a different port, open the second Scrapy project you want to manage and create a `scrapyd.conf` file in its root directory. Specify the new port in this file, as shown below. The configuration tells Scrapyd to run on an alternative port (`6802`) instead of the previous `6800`:
[scrapyd]
http_port = 6802
Then, point the new Scrapy project to this new Scrapyd port by modifying its `scrapy.cfg` file:
[settings]
default = product_scraper.settings
[deploy:local]
url = http://localhost:6802/
project = product_scraper
Open a terminal in the new project's root folder and run the `scrapyd` command to start the Scrapyd daemon on the new port (`6802`):
scrapyd
Next, from another terminal in the new project's folder, run the deployment command for the new project:
scrapyd-deploy local -p product_scraper
The above command deploys the new Scrapy project to the new Scrapyd server. Visit `http://localhost:6802/` to see the deployed project.
Stop the running ScrapydWeb server. Then, open the `scrapydweb_settings_v11.py` file in the first Scrapy project (`scraper`) and add the new Scrapyd server to the `SCRAPYD_SERVERS` list. The list becomes:
# ...
SCRAPYD_SERVERS = [
    "127.0.0.1:6800",
    "127.0.0.1:6802",
]
# ...
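For remote or password-protected nodes, the `SCRAPYD_SERVERS` list accepts richer entries than plain host:port strings. Based on the commented examples shipped in the generated settings file, an entry can carry credentials and a group label, roughly like this (treat the exact format as version-dependent and double-check it against your own file):
SCRAPYD_SERVERS = [
    "127.0.0.1:6800",
    "127.0.0.1:6802",
    # "username:password@remote-host:6800",                      # inline credentials
    # ("username", "password", "remote-host", "6800", "group"),  # tuple form with a group label
]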
Restart ScrapydWeb and open its URL again, and you'll see the second Scrapyd server:

From here, you can switch between Scrapyd servers and run and schedule spiders.
You'll learn how to run tasks in the next step.
Step 4: Run Individual Spiders
To run a spider within a specific Scrapyd cluster:
- Open the ScrapydWeb interface (`http://localhost:5000/`) and click "Servers" on the left sidebar.
- Select the Scrapyd server containing the Scrapy project with the desired spider.
- Select the "Run Spider" tab and click "Multinode Run Spider".

You might get an error on the next page. Ignore it and go ahead to select your server.
- Select the desired Scrapyd server, Scrapy project, version number (go with the default option for auto-versioning), and the spider you wish to run.

- To add more functionality, such as setting the User Agent, cookies, robots.txt rule, concurrency, and delay, toggle on the "settings & arguments" switch and set your preferences.

- Scroll down and click "Check CMD." Then, click "Run Spider" to execute the spider immediately.

- You'll now see the executed job with an "ok" status on the next page.

You just executed your first scraping job on ScrapydWeb. That's great!
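For reference, the "Run Spider" action ends up calling Scrapyd's schedule.json endpoint on the selected node. The sketch below fires the same job from Python; it assumes the requests package and uses placeholder project and spider names, so swap in your own:
# pip3 install requests -- rough equivalent of the "Run Spider" action
import requests

response = requests.post(
    "http://localhost:6800/schedule.json",
    data={"project": "scraper", "spider": "scraper"},  # placeholders: use your project and spider names
)
print(response.json())  # e.g. {"status": "ok", "jobid": "..."}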
Let's see other ScrapydWeb features.
Other ScrapydWeb Management Features
As mentioned earlier, ScrapydWeb supports other Scrapyd server management features, including scheduling, stats, logs, and more. Let's see how scheduling and logging work.
Schedule Spiders With ScrapydWeb
Here's how to schedule a scraping job in ScrapydWeb:
- Go to "Servers" and select the Scrapyd server on which you want to schedule a spider.
- Click the "Run Spider" tab ⇒ "Multinode Run Spider".
- Follow the previous steps for selecting the target Scrapyd server, Scrapy project, version and spider.

- Toggle on the "timer task" switch and set your timer preferences.

- For more schedule options, toggle on "show more timer settings". For instance, use the "start_date" and "end_date" options to schedule the selected spider to run at intervals.

- After setting your preferences, click "Check CMD" > "Add Task" to conclude the schedule.

- This takes you to a dashboard showing current and past schedules for the selected Scrapyd server.

Well done! You now know how to schedule Scrapy scraping jobs with ScrapydWeb.
Depending on your project's requirements, you can schedule multiple spiders within the same Scrapyd server or across many Scrapyd servers.
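If you ever want to kick off spiders across nodes from a script instead of the timer UI, a small loop over your Scrapyd servers also works. This is only a sketch against Scrapyd's schedule.json endpoint, with a placeholder spider name:
# pip3 install requests -- schedule a spider on each Scrapyd node
import requests

targets = [
    ("http://127.0.0.1:6800", "scraper"),
    ("http://127.0.0.1:6802", "product_scraper"),
]

for server, project in targets:
    result = requests.post(
        f"{server}/schedule.json",
        data={"project": project, "spider": "your_spider"},  # placeholder spider name
    )
    print(server, result.json())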
View Spider Execution Logs and Stats
Logs and stats give you an overview of failed and successful spider runs, including the point of failure, so you can prioritize reruns and track the progress of scheduled tasks.
You can view spider execution logs and stats across Scrapyd clusters.
Here's how to go about it:
- Select the desired server from the server option dropdown at the top left.
- Click "Logs" on the sidebar.
- Select the project name from the logs table.

- Then, select the spider you want to view.

- On the next page, you'll see the log list with timestamps for each. Click "Log" to view the spider log or "Stats" to see the crawling statistics.

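The "Log" and "Stats" links resolve to files served by the underlying Scrapyd node, so you can also pull a raw log outside the UI. The sketch below grabs the latest finished job's log via listjobs.json and Scrapyd's logs path, using placeholder project and spider names:
# pip3 install requests -- fetch a raw spider log straight from the Scrapyd node
import requests

server = "http://localhost:6800"
project, spider = "scraper", "your_spider"  # placeholders: use your own names

# find the most recent finished job for this spider
jobs = requests.get(f"{server}/listjobs.json", params={"project": project}).json()
finished = [job for job in jobs.get("finished", []) if job["spider"] == spider]

if finished:
    job_id = finished[-1]["id"]
    log_text = requests.get(f"{server}/logs/{project}/{spider}/{job_id}.log").text
    print(log_text[:500])  # first few hundred characters of the log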
Great job! You've advanced your spider monitoring skills on ScrapydWeb.
That said, despite the scaling infrastructure that ScrapydWeb offers, your Scrapy projects can still face the problem of getting blocked by anti-bot measures. There's a solution for that in the next part of this article.
Scale Up With ZenRows
Although ScrapydWeb allows you to schedule multiple batch scraping jobs and scale across several nodes, these jobs often fail due to anti-bot protections. When that happens, you risk losing money and wasting valuable time and human effort.
The best way to prevent getting blocked by anti-bots is to use a scraping solution like the ZenRows Universal Scraper API. ZenRows matches scalability with an impressive scraping success rate of up to 99.93%, ensuring you extract data without limitations. It also has headless browser features to automate human interactions and scrape dynamic content.
ZenRows integrates easily with Scrapy via the `scrapy-zenrows` middleware. This middleware brings all the functionality of the Universal Scraper API to Scrapy.
Let's see how it works by scraping a heavily protected website like the Anti-bot Challenge page.
Sign up and go to the Request Builder. Then, copy your ZenRows API key.

Install the `scrapy-zenrows` middleware with `pip`:
pip3 install scrapy-zenrows
Add the middleware and your ZenRows API key to your Scrapy project's `settings.py` file, and set `ROBOTSTXT_OBEY` to `False` so Scrapy's robots.txt checks don't interfere with requests routed through the ZenRows API:
# ...
ROBOTSTXT_OBEY = False

DOWNLOADER_MIDDLEWARES = {
    # enable the scrapy-zenrows middleware
    "scrapy_zenrows.middleware.ZenRowsMiddleware": 543,
}

# ZenRows API key
ZENROWS_API_KEY = "<YOUR_ZENROWS_API_KEY>"
Import `ZenRowsRequest` into your `scraper` spider and add ZenRows' params to your `start_requests` method, enabling the JS Rendering and Premium Proxy features:
# pip3 install scrapy-zenrows
import scrapy
from scrapy_zenrows import ZenRowsRequest


class Scraper(scrapy.Spider):
    name = "scraper"
    allowed_domains = ["www.scrapingcourse.com"]
    start_urls = ["https://www.scrapingcourse.com/antibot-challenge"]

    def start_requests(self):
        # use ZenRowsRequest for customization
        for url in self.start_urls:
            yield ZenRowsRequest(
                url=url,
                params={
                    "js_render": "true",
                    "premium_proxy": "true",
                },
                callback=self.parse,
            )

    def parse(self, response):
        self.log(response.text)
The above Scrapy spider outputs the protected website's full-page HTML, showing you bypassed the anti-bot challenge:
<html lang="en">
<head>
    <!-- ... -->
    <title>Antibot Challenge - ScrapingCourse.com</title>
    <!-- ... -->
</head>
<body>
    <!-- ... -->
    <h2>
        You bypassed the Antibot challenge! :D
    </h2>
    <!-- other content omitted for brevity -->
</body>
</html>
Congratulations! You just bypassed an anti-bot challenge using the scrapy-zenrows middleware. You can now reliably schedule scraping jobs at scale via ScrapydWeb.
Conclusion
ScrapydWeb is a valuable tool for monitoring your Scrapy projects at scale, providing a friendly user interface for running, scheduling, canceling, and deleting scraping jobs, viewing logs and stats, and more.
Despite the powerful features of ScrapydWeb, anti-bots can disrupt your spider schedules, causing abrupt scraper failures that result in low or zero data yields. We recommend integrating ZenRows with your Scrapy project upfront to mitigate these challenges. ZenRows lets you scrape at any scale without limitations.