MechanicalSoup is a Python library for automating website interactions. It's built on top of BeautifulSoup and Requests, popular tools in the web scraping community.
While MechanicalSoup is effective for web scraping with Python, it doesn't stop your scraper from getting blocked by websites' anti-bot systems.
Fortunately, there are a few ways to boost MechanicalSoup's anti-detection capabilities. In this tutorial, you'll learn a step-by-step process for setting up proxies in MechanicalSoup.
Set Up a Single Proxy With MechanicalSoup
As a prerequisite, install the MechanicalSoup library if you haven't already:
pip install MechanicalSoup
Before setting up the proxy, let's make a simple HTTP GET request to https://httpbin.io/ip. This website returns the IP address of the client making the request.
Import the mechanicalsoup module into your code and create a browser object. Then, send a GET request to the target URL and print the response text.
import mechanicalsoup
# create a browser object
browser = mechanicalsoup.StatefulBrowser()
# send a GET request
response = browser.session.request("get", "https://httpbin.io/ip")
print(response.text)
The code will print your machine's IP address:
{
  "origin": "50.173.55.144:30127"
}
Exposing your IP address is not a good idea, as websites may block it due to scraping activities.
Let's set up a proxy to reduce the chances of being detected and blocked.
Start by grabbing a free proxy from the Free Proxy List website.
The proxies used in this tutorial may not work at the time of reading. Free proxies have a short lifespan and are only suitable for learning purposes. To follow along, grab fresh proxies from the Free Proxy List.
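Because free proxies die so quickly, it's worth verifying that one is alive before wiring it into your scraper. Here's a minimal liveness check built directly on Requests (the proxy address below is only an example and has likely expired):
import requests
# example proxy; replace with a fresh one from the Free Proxy List
proxy = "http://8.219.97.248:80"
proxies = {"http": proxy, "https": proxy}
try:
    # a short timeout prevents hanging on dead proxies
    response = requests.get("https://httpbin.io/ip", proxies=proxies, timeout=5)
    print("proxy works:", response.json())
except requests.exceptions.RequestException as error:
    print("proxy failed:", error)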
Next, define a proxy in your code pointing to its IP address and port number (e.g., http://8.219.97.248:80). This ensures that both HTTP and HTTPS requests are routed through the proxy.
# define proxies using this syntax:
# <PROXY_PROTOCOL>://<PROXY_IP_ADDRESS>:<PROXY_PORT>
proxies = {
    "https": "http://8.219.97.248:80",
    "http": "http://8.219.97.248:80",
}
Finally, pass the proxies dictionary to the request via the proxies parameter. Here's what your code should look like:
import mechanicalsoup
# define proxies using this syntax:
# <PROXY_PROTOCOL>://<PROXY_IP_ADDRESS>:<PROXY_PORT>
proxies = {
    "https": "http://8.219.97.248:80",
    "http": "http://8.219.97.248:80",
}
# create a browser object
browser = mechanicalsoup.StatefulBrowser()
# send a GET request
response = browser.session.request("get", "https://httpbin.io/ip", proxies=proxies)
print(response.text)
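Passing proxies=proxies on every call works, but it gets repetitive. Since browser.session is a regular Requests session, you can also set the proxies on the session once, so every subsequent request, including MechanicalSoup's own navigation methods like browser.open(), routes through them. A minimal sketch:
import mechanicalsoup
# define proxies
proxies = {
    "https": "http://8.219.97.248:80",
    "http": "http://8.219.97.248:80",
}
# create a browser object
browser = mechanicalsoup.StatefulBrowser()
# apply the proxies to the whole session instead of per request
browser.session.proxies = proxies
# browser.open() now routes through the proxy as well
response = browser.open("https://httpbin.io/ip")
print(response.text)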
The code will output the IP address of the used proxy server:
{
  "origin": "8.219.64.236:60924"
}
Congrats! You've just changed the IP address of your MechanicalSoup scraper. Let's move to more advanced concepts.
Proxy Authentication
Some proxy servers require authentication, granting access only to users with valid credentials. This is usually the case with commercial solutions and premium proxies.
Here's the syntax to specify credentials (username and password) for an authenticated proxy:
<PROXY_PROTOCOL>://<USERNAME>:<PASSWORD>@<PROXY_IP_ADDRESS>:<PROXY_PORT>
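One common pitfall: if your username or password contains special characters such as @ or :, URL-encode them before building the proxy URL. A quick sketch using Python's standard library (the credentials and proxy address are placeholders):
from urllib.parse import quote
# percent-encode credentials so characters like @ or : don't break the URL
username = quote("<YOUR_USERNAME>", safe="")
password = quote("<YOUR_PASSWORD>", safe="")
proxy_url = f"http://{username}:{password}@72.10.160.173:3985"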
This is what your updated code with proxy authentication should look like:
import mechanicalsoup
# define proxies
proxies = {
    "https": "http://<YOUR_USERNAME>:<YOUR_PASSWORD>@72.10.160.173:3985",
    "http": "http://<YOUR_USERNAME>:<YOUR_PASSWORD>@72.10.160.173:3985",
}
# create a browser object
browser = mechanicalsoup.StatefulBrowser()
# send a GET request
response = browser.session.request("get", "https://httpbin.io/ip", proxies=proxies)
print(response.text)
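Avoid hardcoding real credentials in source code. One safer pattern is to read them from environment variables, as in this sketch (the variable names PROXY_USERNAME and PROXY_PASSWORD are just examples):
import os
import mechanicalsoup
# read credentials from the environment instead of hardcoding them
username = os.environ["PROXY_USERNAME"]
password = os.environ["PROXY_PASSWORD"]
proxies = {
    "https": f"http://{username}:{password}@72.10.160.173:3985",
    "http": f"http://{username}:{password}@72.10.160.173:3985",
}
# create a browser object
browser = mechanicalsoup.StatefulBrowser()
# send a GET request
response = browser.session.request("get", "https://httpbin.io/ip", proxies=proxies)
print(response.text)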
Add Rotating and Premium Proxies to MechanicalSoup
If you make multiple requests in a short period using a single proxy, the websites you're trying to access can detect this behavior and block you.
To avoid getting blocked, you can use a rotating proxy. This means changing proxies after a certain amount of time or number of requests, making you appear as a different user each time.
Create a list of proxies using the same Free Proxy List website:
# create a list of proxies
PROXIES = [
    "http://8.219.97.248:80",
    "http://148.72.140.24:30127",
    # ...
    "http://77.238.235.219:8080",
]
Next, create a function that randomly selects proxies from the list and returns them as a dictionary. You can use the random.choice() function for this.
# ...
import random
# ...
# function to randomly select and return proxies
def rotate_proxy():
    https_proxy = random.choice(PROXIES)
    http_proxy = random.choice(PROXIES)
    return {
        "https": https_proxy,
        "http": http_proxy,
    }
# ...
# rotate proxies
proxies = rotate_proxy()
# ...
Here's your final rotating proxy code:
import mechanicalsoup
import random
# create a list of proxies
PROXIES = [
    "http://8.219.97.248:80",
    "http://148.72.140.24:30127",
    # ...
    "http://77.238.235.219:8080",
]
# function to randomly select and return proxies
def rotate_proxy():
    https_proxy = random.choice(PROXIES)
    http_proxy = random.choice(PROXIES)
    return {
        "https": https_proxy,
        "http": http_proxy,
    }
# create a browser object
browser = mechanicalsoup.StatefulBrowser()
# rotate proxies
proxies = rotate_proxy()
# send a GET request
response = browser.session.request("get", "https://httpbin.io/ip", proxies=proxies)
print(response.text)
Each time you run this code, the script randomly picks a proxy from the list.
# request 1
{
  "origin": "8.219.64.236:64632"
}
# request 2
{
  "origin": "77.238.235.219:8080"
}
# request 3
{
  "origin": "148.72.140.24:30127"
}
Congratulations! You've successfully implemented the rotating proxies functionality.
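Note that the script above picks a new proxy only once per run. In a long-running scraper, you'd typically rotate on every request and skip proxies that fail. Here's one way to do it, reusing the same proxy list and rotation function (the retry handling is a simple sketch, not the only approach):
import random
import mechanicalsoup
import requests
# same proxy list and rotation function as above
PROXIES = [
    "http://8.219.97.248:80",
    "http://148.72.140.24:30127",
    "http://77.238.235.219:8080",
]
def rotate_proxy():
    return {
        "https": random.choice(PROXIES),
        "http": random.choice(PROXIES),
    }
# create a browser object
browser = mechanicalsoup.StatefulBrowser()
for i in range(3):
    # pick a fresh proxy for every request
    proxies = rotate_proxy()
    try:
        response = browser.session.request(
            "get", "https://httpbin.io/ip", proxies=proxies, timeout=5
        )
        print(f"request {i + 1}:", response.text)
    except requests.exceptions.RequestException as error:
        # dead free proxies are common; log the error and move on
        print(f"request {i + 1} failed:", error)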
Premium Proxy to Avoid Getting Blocked
Free proxies create significant challenges for web scraping. Their poor performance, security concerns, and frequent blocking patterns make them unreliable for professional scraping tasks. Most websites detect and block these free proxies, disrupting your data collection efforts.
Premium proxies provide a more reliable solution for avoiding detection. With high-quality IPs and advanced rotation capabilities, premium proxies can effectively handle scraping at any scale. Features like smart routing and geo-location targeting substantially improve your scraping success rate.
ZenRows' Residential Proxies is a premium solution offering access to 55M+ residential IPs across 185+ countries. With features like dynamic IP rotation, intelligent proxy selection, and flexible geo-targeting, all backed by 99.9% uptime, it's a great fit for reliable web scraping with MechanicalSoup.
Let's integrate ZenRows' Residential Proxies with MechanicalSoup.
Sign up and visit the Proxy Generator dashboard. Your proxy credentials will be generated automatically.

Copy your proxy credentials and use them in this Python code:
import mechanicalsoup
# define proxies
proxies = {
    "http": "http://<ZENROWS_PROXY_USERNAME>:<ZENROWS_PROXY_PASSWORD>@superproxy.zenrows.com:1337",
    "https": "https://<ZENROWS_PROXY_USERNAME>:<ZENROWS_PROXY_PASSWORD>@superproxy.zenrows.com:1338",
}
# create a browser object
browser = mechanicalsoup.StatefulBrowser()
# send a GET request
response = browser.session.request("get", "https://httpbin.io/ip", proxies=proxies)
print(response.text)
Here's the output after running this script two times:
# request 1
{
  "origin": "185.123.101.84:51432"
}
# request 2
{
  "origin": "79.110.52.96:36721"
}
Congratulations! The different IP addresses confirm that your MechanicalSoup requests are successfully routed through ZenRows' residential proxy network. Your code is now equipped with premium proxies that significantly reduce the risk of detection during web scraping.
Conclusion
This step-by-step tutorial showed how to set up a proxy in MechanicalSoup.
Now you know:
- The basics of setting up a proxy with MechanicalSoup in Python.
- How to deal with proxy authentication.
- How to use a rotating proxy.
- How to implement a premium proxy and bypass anti-bot systems.
Using ZenRows, you can bypass any anti-bot protection and increase the reliability of your scraper. Try ZenRows for free!