Getting Started

ZenRows makes scraping a breeze and integrates easily in any development flow you might have. We offer both API and Proxy options for connection.

We offer a complete set of features. For more advanced options, such as JavaScript Rendering or Premium Proxies, check out the documentation.

You’ll need an API Key to connect: register here, or copy yours if you already have an account.

For simple requests, you only need this API Key and a target URL. The target URL must be encoded.

cURL
curl "https://api.zenrows.com/v1/?apikey=YOUR_ZENROWS_API_KEY&url=YOUR_URL"

# for example (won't work without a valid API Key)
curl "https://api.zenrows.com/v1/?apikey=abcdefghij0123456789&url=https%3A%2F%2Fhttpbin.org%2Fanything"

Encoding URLs

Encode target URLs so that their search parameters do not clash with the ZenRows API’s own parameters. We decode the URL on our side before making the request.

Let’s take a Google search example: https://www.google.com/search?q=scraping&channel=fs. If you were to send that string without encoding but with Premium Proxy param active, the call would be like this.

cURL
curl "https://api.zenrows.com/v1/?apikey=YOUR_ZENROWS_API_KEY&url=https://www.google.com/search?q=scraping&channel=fs&premium_proxy=true"

As you can see, the call is ambiguous: channel and premium_proxy both look like API parameters, although channel belongs to the target URL. You can use an online tool to encode your URL, but that does not scale. Some clients (like axios or requests) do that for you, but in case they don’t, most programming languages have functions to do it.

import urllib.parse
# safe="" also encodes "/" (quote leaves it unescaped by default)
encoded_url = urllib.parse.quote("YOUR_URL", safe="")

Once encoded, this is what the call looks like.

cURL
curl "https://api.zenrows.com/v1/?apikey=YOUR_ZENROWS_API_KEY&url=https%3A%2F%2Fwww.google.com%2Fsearch%3Fq%3Dscraping%26channel%3Dfs&premium_proxy=true"
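The same encoding can be done in code instead of by hand. Here is a minimal Python sketch using only the standard library. The helper name build_api_url is hypothetical and not part of any ZenRows client.

```python
from urllib.parse import urlencode

API_ENDPOINT = "https://api.zenrows.com/v1/"


def build_api_url(api_key, target_url, **options):
    """Return the full API URL with every parameter value percent-encoded."""
    params = {"apikey": api_key, "url": target_url, **options}
    # urlencode escapes ':', '/', '?', '&' and '=' inside each value, so the
    # target URL's own search parameters cannot be mistaken for API parameters.
    return f"{API_ENDPOINT}?{urlencode(params)}"


api_url = build_api_url(
    "YOUR_ZENROWS_API_KEY",
    "https://www.google.com/search?q=scraping&channel=fs",
    premium_proxy="true",
)
print(api_url)
```

The printed URL matches the encoded cURL call: channel stays inside the url value and premium_proxy remains a separate API parameter.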

What is JavaScript Rendering

Some websites rely heavily on JavaScript to load content. That makes the initial HTML unusable from a scraping point of view since it contains only the scaffolding. All the actual content loads dynamically in subsequent calls.

Long story short, we need to load the initial HTML (as always) but then follow all the instructions in JavaScript to call backend endpoints to populate the data. And that’s the reason we need JavaScript Rendering active for those sites.

For performance reasons, scrapers tend to avoid JS Rendering and try to load the minimum possible. But when needed, a headless browser will do the work, which is precisely what the JS Rendering parameter activates. When it is set to true, ZenRows will call the target URL with a real browser and perform all subsequent calls, thus having access to the data that was not present initially.

Here is a sample call in cURL with JS Rendering active. Remember to encode the URL.

cURL
curl "https://api.zenrows.com/v1/?apikey=YOUR_ZENROWS_API_KEY&url=YOUR_URL&js_render=true"

The good thing about loading pages in a headless browser is that it enables some excellent features, such as waiting for an element to be present.

Wait for Element or Certain Time

When loading a page with JavaScript Rendering, we get the feature to wait for an element to be present.

By default, ZenRows will wait until there are no network requests. But sometimes that is not enough (maybe the page loads certain content a couple of seconds after the initial load). In those cases, the usual thing to do is wait for an element to be present. To do that, you’ll need to inspect the target page by hand (e.g., with DevTools) and locate said element. Let’s say it is an element with a .my-test class. You’ll need to send that in the wait_for parameter.

cURL
curl "https://api.zenrows.com/v1/?apikey=YOUR_ZENROWS_API_KEY&url=YOUR_URL&js_render=true&wait_for=.my-test"

Another option is to wait for a fixed amount of time. When there isn’t a suitable selector, or you’re not sure which one to use, you can wait for, say, 3 seconds. Do that with the wait param and send the number in milliseconds. Careful with this: wait=3 will not wait 3 seconds.

cURL
curl "https://api.zenrows.com/v1/?apikey=YOUR_ZENROWS_API_KEY&url=YOUR_URL&js_render=true&wait=3000"

When called with either of these parameters, the API changes its default behavior and waits until the selector matches or the specified time has passed.
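Because the wait param takes milliseconds, it can help to do the conversion in one place. The sketch below builds the query parameters for both waiting options. The helper name wait_params and its seconds argument are hypothetical conveniences, not API features.

```python
def wait_params(selector=None, seconds=None):
    """Build JS Rendering parameters that wait for a selector or a fixed time."""
    params = {"js_render": "true"}
    if selector is not None:
        # CSS selector that must be present before the page is returned
        params["wait_for"] = selector
    if seconds is not None:
        # The wait parameter is expressed in milliseconds, not seconds
        params["wait"] = str(int(round(seconds * 1000)))
    return params


print(wait_params(selector=".my-test"))
print(wait_params(seconds=3))
```

The resulting dictionary can be merged with apikey and url and encoded into the request like any other parameter.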

If it still does not work after trying this, contact us and we’ll help you set up your calls.

What is Geotargeting or Regional IP

There are two primary cases when you want regional IPs: localized content and geoblocking. Amazon, for example, will show different products on its .uk and .fr sites. And others like CNN will block visits from outside the US.

To avoid problems with retailers showing different products every visit, you can send a country code in the request. That will effectively localize the request to the country you want, achieving results you can reliably replicate.

You will also need to set the Premium Proxy parameter to true to geotarget the request.

cURL
curl "https://api.zenrows.com/v1/?apikey=YOUR_ZENROWS_API_KEY&url=YOUR_URL&premium_proxy=true&proxy_country=us"

Here are some of the available countries and their codes. You can check out the whole list on the Builder.

  • United States: us
  • Canada: ca
  • United Kingdom: gb
  • Germany: de
  • France: fr
  • Spain: es
  • Brazil: br
  • Mexico: mx
  • India: in
  • Japan: jp
  • China: cn
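A small guard in code can catch a wrong country code before a request is spent. This is a sketch under assumptions: geotarget_params is a hypothetical helper, and the set below holds only the codes listed above, not the whole list from the Builder.

```python
# Codes copied from the list above; the Builder has the complete list.
LISTED_COUNTRIES = {"us", "ca", "gb", "de", "fr", "es", "br", "mx", "in", "jp", "cn"}


def geotarget_params(country):
    """Return the parameters needed to geotarget a request to `country`."""
    code = country.strip().lower()
    if code not in LISTED_COUNTRIES:
        raise ValueError(f"Country code not in the local list: {country!r}")
    # Geotargeting only works together with Premium Proxy
    return {"premium_proxy": "true", "proxy_country": code}


print(geotarget_params("US"))
```

A common slip this catches is sending uk for the United Kingdom, whose code is gb.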

What are Residential IPs

In scraping proxies, there are two main IP types: data center and residential. To summarize, data center IPs work reliably but might appear on public lists and are thus easier to block. Residential IPs, however, are harder to block since they belong to a provider (ISP) that might assign them to actual users.

If not stated in the request, ZenRows will use data center connections. But for a better success rate, or for accessing sites that ban heavily, like Google, you can set premium_proxy to true. Note that residential IPs come at an extra cost.

We also offer localization or geotargeting when using Residential.

cURL
curl "https://api.zenrows.com/v1/?apikey=YOUR_ZENROWS_API_KEY&url=YOUR_URL&premium_proxy=true"
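Since residential IPs cost extra, one possible approach is to try a data center connection first and retry with premium_proxy only when that fails. The sketch below illustrates that idea and is not a documented ZenRows feature. It assumes any non-200 status means the request was blocked, and fetch is a hypothetical callable that takes a dict of API parameters and returns a (status, body) pair.

```python
def fetch_with_fallback(fetch, base_params):
    """Try a data center IP first; retry once with premium_proxy on failure.

    Returns (status, body, used_premium).
    """
    status, body = fetch(base_params)
    if status == 200:
        return status, body, False
    # Residential IPs are only used for the retry, to limit the extra cost
    status, body = fetch({**base_params, "premium_proxy": "true"})
    return status, body, True


def fake_fetch(params):
    """Stand-in for a real HTTP call: succeeds only with premium_proxy."""
    if params.get("premium_proxy") == "true":
        return 200, "<html>ok</html>"
    return 403, ""


result = fetch_with_fallback(fake_fetch, {"url": "YOUR_URL"})
print(result)
```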

If the blocks continue, contact us and we’ll help you.

CSS Selectors do not work or “parser is not valid”

The most common error with CSS Selectors is not encoding the content properly. You can use our Builder or an online tool to encode it, then send it as css_extractor. In the example, we’ll get the contents of the .my-class selector in a test property: {"test": ".my-class"}.

cURL
curl "https://api.zenrows.com/v1/?apikey=YOUR_ZENROWS_API_KEY&url=YOUR_URL&css_extractor=%7B%22test%22%3A%20%22.my-class%22%7D"
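Producing the encoded value in code avoids hand-encoding mistakes. Here is a minimal Python sketch using the standard library. The helper name encode_css_extractor is hypothetical.

```python
import json
from urllib.parse import quote


def encode_css_extractor(selectors):
    """Serialize a {property: selector} mapping and percent-encode it once."""
    # safe="" makes sure every reserved character is escaped
    return quote(json.dumps(selectors), safe="")


encoded = encode_css_extractor({"test": ".my-class"})
print(encoded)
```

If your HTTP client already encodes query parameters for you, pass it the raw JSON string instead. Encoding an already-encoded value turns %7B into %257B, and the parser will no longer receive valid JSON.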

If you are still getting empty responses, try to get Plain HTML. The content might be different from what you see in your browser. Maybe the site is blocking the requests or serving localized content for another country.

To check that the selector is correct, please review the docs and test each selector in a browser (e.g., using Chrome DevTools’ Console: document.querySelectorAll(".my-class")).

If the HTML looks good, the selector works on the browser, but the parser does not work, contact us and we’ll help you.

What is Autoparse

ZenRows has custom parsers written for some domains that will work out-of-the-box. When calling the API, the default return type is Plain HTML, so you need to activate these parsers with the autoparse parameter. The output in those cases will be JSON instead of HTML.

cURL
curl "https://api.zenrows.com/v1/?apikey=YOUR_ZENROWS_API_KEY&url=YOUR_URL&autoparse=true"

The results will be missing or empty for sites not on the list. Try removing the autoparse parameter and try again. If you want custom scrapers built, contact us.
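In code, that check can be automated by treating an unusable autoparse response as a signal to retry without the parameter. The sketch below assumes that "missing or empty" means an empty body, an empty JSON object or array, or a body that is not JSON at all. The exact shape of such responses is not specified here, and parse_autoparse is a hypothetical helper.

```python
import json


def parse_autoparse(body):
    """Return the parsed JSON result, or None when there is nothing usable."""
    try:
        data = json.loads(body)
    except (TypeError, ValueError):
        # Empty string, None, or plain HTML instead of JSON
        return None
    # Empty objects and arrays are treated as "no parser for this site"
    return data or None


print(parse_autoparse('{"price": "10"}'))
print(parse_autoparse("{}"))
```

When the helper returns None, repeat the call without autoparse and work with the Plain HTML.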

Builder

The Builder will automatically activate the autoparse option for the sites we offer. If you paste a URL, say https://www.amazon.com/dp/B01LD5GO7I/, you will see that Data Extraction changes from Plain HTML to Autoparse. You can disable it by clicking on Plain HTML.

Autoparse in Builder

Click on “Perform request in the browser” to see the result. ZenRows will convert the HTML into a structured result. The example case will output price, rating, category, description, and several other fields.

Output in Builder for Autoparse

How to Send Custom Headers

You can add Custom Headers to requests, but they go in the request as HTTP headers, not as URL parameters like the other options. There are several standard fields such as Accept, Cookie, Referer, or User-Agent. The format is Header: Value without quotes; see examples below.

ZenRows sends domain-tailored headers to achieve the highest possible success rate. Custom Headers will overwrite the default ones, which might cause a drop in the success rate. That might happen if you send, for example, a User-Agent that usually goes with other headers that are missing, like Sec-Ch-Ua for Chrome.

It is essential to set custom_headers to true.

cURL
curl \
-H "Accept: application/json" \
-H "Referer: https://www.google.com" \
-H "User-Agent: Mozilla/5.0 (iPhone; CPU iPhone OS 13_2_3 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/13.0.3 Mobile/15E148 Safari/604.1" \
"https://api.zenrows.com/v1/?apikey=YOUR_ZENROWS_API_KEY&url=YOUR_URL&custom_headers=true"
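The same request can be prepared in Python. This sketch uses only the standard library and builds the request without sending it. build_request is a hypothetical helper, and the target URL is just a placeholder.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_request(api_key, target_url, headers):
    """Prepare an API request that forwards `headers` to the target site."""
    query = urlencode(
        {"apikey": api_key, "url": target_url, "custom_headers": "true"}
    )
    # Headers travel as real HTTP headers, not as query parameters
    return Request(f"https://api.zenrows.com/v1/?{query}", headers=headers)


request = build_request(
    "YOUR_ZENROWS_API_KEY",
    "https://httpbin.org/anything",
    {"Accept": "application/json", "Referer": "https://www.google.com"},
)
print(request.full_url)
# urllib.request.urlopen(request) would send it; that needs a valid API Key.
```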

As stated above, the success rate might decrease due to header changes. If you find yourself there and need to send a custom header, contact us and we’ll help you.

SSL Certificate Error

If you are getting an SSL Certificate Error (e.g., SSLCertVerificationError in Python), try sending the request with verification disabled.

import requests
response = requests.get('YOUR_URL', verify=False)

# If you still get warnings (not errors) but want to get rid of them:
import urllib3
urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)

If the problem continues, contact us and we’ll help you.

Using Concurrency

Concurrency refers to the number of requests in flight at any given time. Most languages can call the API in parallel, starting new requests while earlier ones are still waiting for results. You can use concurrency with any ZenRows plan; check out pricing for more details.

For implementation details in Python and JavaScript, check our how-to guide on concurrency.
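As a rough illustration, a thread pool caps how many calls run at the same time. This is a generic Python sketch, not the implementation from our guide. fetch_all and stub_fetch are hypothetical names, and the stub stands in for a real API call.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_all(fetch, urls, max_concurrency):
    """Run `fetch` over `urls` with at most `max_concurrency` calls in flight."""
    with ThreadPoolExecutor(max_workers=max_concurrency) as pool:
        # map returns results in the same order as the input URLs
        return list(pool.map(fetch, urls))


def stub_fetch(url):
    """Stand-in for a real request to the API."""
    return f"fetched {url}"


results = fetch_all(
    stub_fetch,
    ["https://example.com/1", "https://example.com/2", "https://example.com/3"],
    max_concurrency=2,
)
print(results)
```

Set max_concurrency no higher than the concurrency your plan allows.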