CloudUnflare for Real IP Address Reconnaissance

July 10, 2023 · 3 min read

Many websites use Cloudflare to detect and block bots, including web scrapers. However, you can use CloudUnflare to bypass Cloudflare and get the data you want.

Let's see how!

What Is CloudUnflare

CloudUnflare is an open-source reconnaissance tool for uncovering the IP address of a target domain's origin server that's behind Cloudflare's network.

When Cloudflare protects a website, the actual IP address of the web server is hidden. CloudUnflare attempts to uncover that real IP so that you can access the web server directly for your scraping needs.
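To illustrate what accessing the web server directly could look like, here's a minimal Python sketch. It assumes the origin server accepts plain HTTP connections and picks the site based on the Host header. The address 203.0.113.10 is a placeholder from a documentation range, and build_origin_request is a hypothetical helper name, not part of CloudUnflare.

```python
import urllib.request


def build_origin_request(origin_ip, domain):
    """Build a request aimed at the origin IP while naming the real site.

    The URL points at the IP address, and the Host header tells the web
    server which site we want. Hypothetical helper for illustration only.
    """
    return urllib.request.Request(
        f"http://{origin_ip}/",
        headers={"Host": domain},
    )


# 203.0.113.10 is a placeholder, not a real origin address.
request = build_origin_request("203.0.113.10", "example.com")
print(request.full_url)            # http://203.0.113.10/
print(request.get_header("Host"))  # example.com
```

Passing the request to urllib.request.urlopen would actually send it. Keep in mind that an origin server can be configured to accept traffic only from Cloudflare, in which case a direct connection fails or times out.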

How CloudUnflare Works

While CloudUnflare doesn't document its full inner workings, we can infer them from its reconnaissance report. Generally, it combines several techniques to gather information and then analyzes this data to uncover the real IP address behind Cloudflare.

When you pass a domain name to the tool, it starts by checking the associated subdomains. Next, it proceeds to look for any CNAME records, which are DNS entries that map one domain name to another (like an alias).

Additionally, it uses the CompleteDNS API to get historical information about the domain's name servers. By combining these techniques and more, CloudUnflare can uncover the actual IP address of a Cloudflare-protected website.
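The subdomain check described above can be sketched in Python as follows. This isn't CloudUnflare's actual code, which is a Bash script. It's a simplified illustration that assumes a small wordlist of common prefixes. The names resolve_candidates and COMMON_PREFIXES are hypothetical.

```python
import socket


def resolve_candidates(domain, prefixes, resolver=socket.gethostbyname):
    """Resolve candidate subdomains and return a {hostname: ip} mapping.

    Hostnames that fail to resolve are skipped. The resolver argument
    exists so the lookup can be swapped out, for example in tests.
    """
    found = {}
    for prefix in prefixes:
        hostname = f"{prefix}.{domain}"
        try:
            found[hostname] = resolver(hostname)
        except OSError:
            # socket.gaierror (failed lookup) is a subclass of OSError.
            continue
    return found


# Hypothetical wordlist; real reconnaissance tools use much longer lists.
COMMON_PREFIXES = ["www", "mail", "ftp", "dev", "api"]
```

Calling resolve_candidates("example.com", COMMON_PREFIXES) performs live DNS lookups and returns only the hostnames that resolve. A subdomain that isn't proxied through Cloudflare may point straight at the origin server, which is why this check is useful.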

How to Use CloudUnflare

CloudUnflare is a Bash script designed for Linux systems. So, if you want to run this tool on Windows, you need a Linux environment.

To get started, create and verify an account on CompleteDNS to explore DNS records.

Then, install the required dependencies (cURL, dig, and whois). On Debian-based systems, use the following command (prefix it with sudo if you aren't running as root):

Terminal
apt-get install curl dnsutils whois -y

After that, install CloudUnflare by cloning its GitHub repository.

Terminal
git clone https://github.com/greycatz/CloudUnflare.git

Then navigate to the CloudUnflare directory and list its content using the ls command. You should have something like this:

ls Command

Open cloudunflare.bash in a text editor, for example with nano:

Terminal
nano cloudunflare.bash

Locate the CompleteDNS_Login variable and set its value to your CompleteDNS credentials. Then, save your changes and exit the text editor.

Now, run the script. It prompts you to enter a target domain:

Terminal
bash cloudunflare.bash

Prompt to enter target domain

Enter the domain you want to scrape and wait for the tool to look for the actual IP address you're after. For this example, let's investigate a Cloudflare-protected website: g2.com. Here's what we got:

Result

The result above shows that CloudUnflare encountered an error due to the unavailability of "NS History by CompleteDNS". After running checks on the subdomains, the tool could only provide the IP addresses of the subdomains, some of which belong to Cloudflare.

That probably happens because CloudUnflare is no longer maintained, and neither is its integration with the external services it depends on, such as the viewDNS.info platform responsible for the IP history data.

Note also that this web scraping method is unreliable even when the tool works, because it only uncovers the origin IP in a small number of cases.
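Since some of the IP addresses in the result belong to Cloudflare, it helps to filter those out before treating any address as a candidate origin. Here's a minimal Python sketch of that check. It assumes a hard-coded, partial snapshot of Cloudflare's published IPv4 ranges, which may be outdated, so compare it against Cloudflare's current list before relying on it. The function name is_cloudflare_ip is hypothetical.

```python
import ipaddress

# Partial snapshot of Cloudflare's published IPv4 ranges.
# Assumption: this list may be incomplete or outdated.
CLOUDFLARE_IPV4_RANGES = [
    "173.245.48.0/20",
    "103.21.244.0/22",
    "141.101.64.0/18",
    "108.162.192.0/18",
    "188.114.96.0/20",
    "198.41.128.0/17",
    "162.158.0.0/15",
    "104.16.0.0/13",
    "104.24.0.0/14",
    "172.64.0.0/13",
]

_NETWORKS = [ipaddress.ip_network(cidr) for cidr in CLOUDFLARE_IPV4_RANGES]


def is_cloudflare_ip(ip):
    """Return True if the address falls inside one of the listed ranges."""
    address = ipaddress.ip_address(ip)
    return any(address in network for network in _NETWORKS)


print(is_cloudflare_ip("104.16.0.1"))  # True
print(is_cloudflare_ip("192.0.2.1"))   # False
```

Any address for which the function returns False is only a candidate. It could still belong to another CDN, a mail provider, or an unrelated host.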

So, let's look at the best CloudUnflare alternative next.

Best CloudUnflare Alternative

ZenRows is a web scraping API that offers a reliable solution to bypass Cloudflare and all other anti-bot systems. You can skip the tedious process of discovering the origin server's IP address and, instead, retrieve the data you want with a single API call.

To use ZenRows, create a free account. You'll land on the Request Builder, where you select your preferred language and enter your target URL (e.g., https://www.g2.com/). We recommend checking the boxes for "Anti-bot", "Premium Proxy", and "JavaScript Rendering". The Request Builder then generates the code to use.

ZenRows Dashboard

Now, install an HTTP library to make a request. We'll use Python Requests.

Terminal
pip install requests

Copy the code from ZenRows and run it in your IDE.

program.py
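import requests

# Target page and ZenRows credentials (replace the placeholder with your API key)
url = 'https://www.g2.com/'
apikey = 'Your API Key'

# Options selected in the Request Builder
params = {
    'url': url,
    'apikey': apikey,
    'js_render': 'true',
    'antibot': 'true',
    'premium_proxy': 'true',
}

# Send the request through the ZenRows API and print the returned HTML
response = requests.get('https://api.zenrows.com/v1/', params=params)
print(response.text)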

This should be your output:

Output
//..
<title>Business Software and Services Reviews | G2</title>
//..

Bingo! You've bypassed your first Cloudflare-protected website.
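If you'd rather check the result programmatically than read the raw HTML, you could extract the page title and compare it with what you expect. The following is a minimal sketch using a regular expression, which is good enough for a quick check but isn't a full HTML parser. The extract_title helper is a hypothetical name, and the sample string stands in for response.text.

```python
import re


def extract_title(html):
    """Return the text inside the first <title> tag, or None if absent."""
    match = re.search(
        r"<title[^>]*>(.*?)</title>",
        html,
        re.IGNORECASE | re.DOTALL,
    )
    return match.group(1).strip() if match else None


# Stand-in for response.text from the request above.
sample = (
    "<html><head>"
    "<title>Business Software and Services Reviews | G2</title>"
    "</head></html>"
)
print(extract_title(sample))  # Business Software and Services Reviews | G2
```

A block or challenge page would typically carry a different title, so a mismatch is a hint that the request didn't reach the real content.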

Conclusion

While reconnaissance tools can provide information about a target domain, using them as a scraping approach is tedious and unreliable. Furthermore, we saw that CloudUnflare is no longer maintained. Fortunately, you can use ZenRows as a more effective and scalable alternative.
