API Documentation
Overview
Here is a complete list of parameters you can use to customize your requests.
parameter | type | default | description |
---|---|---|---|
apikey required | string | Get Your Free API Key | |
url required | string | http://example.com/ | The URL you want to scrape |
js_render | boolean | false | Render the JavaScript on the page with a headless browser (5 credits/request) |
custom_headers | boolean | false | Enable custom headers to be passed to the request. |
premium_proxy | boolean | false | Use premium proxies to make the request harder to detect (10-25 credits/request) |
proxy_country | string | "" | Geolocation of the IP used to make the request. Only for Premium Proxies. |
session_id | integer | | Send a Session ID number to use the same IP for each API Request for up to 10 minutes. |
device | string | "" | Use either desktop or mobile user agents in the headers. |
original_status | boolean | false | Returns the status code provided by the website. |
wait_for | string | "" | Wait for a given CSS Selector to load in the DOM before returning the content. |
wait | integer | 0 | Wait a fixed amount of time before returning the content. |
block_resources | string | "" | Block specific resources from loading using this parameter. |
json_response | boolean | false | Get content in JSON including XHR or Fetch requests. |
window_width | integer | 1920 | Set browser's window width. |
window_height | integer | 1080 | Set browser's window height. |
css_extractor | string (JSON) | "" | Define CSS Selectors to extract data from the HTML. |
autoparse | boolean | false | Use our auto parser algorithm to automatically extract data. |
Getting started
To make a request, you need:
- An API key
- The encoded URL you want to scrape
ZenRows offers two connection modes, API and Proxy, plus SDKs for Python and Node.js that make things easier for newcomers. You will find examples for all of them below.
Headers from the original response are returned with the prefix Zr-. ZenRows will also add a header, Zr-Final-Url, showing the final visited URL, which can change from the original in case of redirects.
Zr-Content-Encoding: gzip
Zr-Content-Type: text/html
Zr-Cookies: _pxhd=Bq7P4CRaW1B...
Zr-Final-Url: https://www.example.com/
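As a rough illustration, the Zr- headers shown above can be separated from the rest of the response headers on the client side. This is only a sketch: split_zr_headers is a hypothetical helper, not part of any ZenRows SDK, and the sample dictionary simply mirrors the header names listed above.

```python
def split_zr_headers(headers):
    """Split a header mapping into Zr- headers (prefix removed) and all others."""
    prefix = "zr-"
    forwarded, other = {}, {}
    for name, value in headers.items():
        # Header names are case-insensitive, so compare in lowercase.
        if name.lower().startswith(prefix):
            forwarded[name[len(prefix):]] = value
        else:
            other[name] = value
    return forwarded, other


# Sample data modelled on the headers listed above (not a real response).
sample = {
    "Content-Type": "text/html",
    "Zr-Content-Encoding": "gzip",
    "Zr-Content-Type": "text/html",
    "Zr-Final-Url": "https://www.example.com/",
}

forwarded, other = split_zr_headers(sample)
print(forwarded["Final-Url"])
```

With a real response object, the same helper could be applied to its header mapping, for example response.headers when using requests.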
API Key required
To access API functionality, you need to have a valid API Key. This unique key will keep all your requests authorized.
Start using the API by creating your API Key now.
curl "https://api.zenrows.com/v1/?apikey=YOUR_KEY&url=https%3A%2F%2Fhttpbin.org%2Fanything"
curl -L -x "http://YOUR_KEY:@proxy.zenrows.com:8001" -k "https://httpbin.org/anything"
URL required
The URL is the page you want to scrape. It must be URL-encoded when calling the API; Proxy mode does not need encoding.
curl "https://api.zenrows.com/v1/?apikey=YOUR_KEY&url=https%3A%2F%2Fhttpbin.org%2Fanything"
curl -L -x "http://YOUR_KEY:@proxy.zenrows.com:8001" -k "https://httpbin.org/anything"
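If you build the API request URL by hand, the target URL has to be percent-encoded first. The sketch below uses only the Python standard library; build_api_url is a hypothetical helper name, and the endpoint is the one used in the curl examples above.

```python
from urllib.parse import urlencode

API_ENDPOINT = "https://api.zenrows.com/v1/"


def build_api_url(apikey, target_url, **params):
    """Return the API request URL with every query value percent-encoded."""
    query = {"apikey": apikey, "url": target_url, **params}
    # urlencode escapes ':' and '/' in the target URL, as the API expects.
    return API_ENDPOINT + "?" + urlencode(query)


request_url = build_api_url("YOUR_KEY", "https://httpbin.org/anything")
print(request_url)
```

HTTP clients that accept a params mapping, such as requests, perform this encoding for you.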
JavaScript Rendering
Some websites rely heavily on JavaScript to load content. Enable this feature if you need to extract data that are loaded dynamically.
You can enable JavaScript by adding &js_render=true to the request. This request costs 5 credits.
curl "https://api.zenrows.com/v1/?apikey=YOUR_KEY&url=https%3A%2F%2Fhttpbin.org%2Fanything&js_render=true"
curl -L -x "http://YOUR_KEY:js_render=true@proxy.zenrows.com:8001" -k "https://httpbin.org/anything"
Anti-bot
Some websites protect their content with anti-bot solutions such as Cloudflare, Akamai, or Datadome. Enable Anti-bot to bypass them without any hassle. Bear in mind that adding custom headers might overwrite our configuration. To wait for the expected content to load, combine Anti-bot with the Wait For Selector feature (see below).
Add &antibot=true to the request for this feature. This request costs 5 credits.
curl "https://api.zenrows.com/v1/?apikey=YOUR_KEY&url=https%3A%2F%2Fhttpbin.org%2Fanything&antibot=true"
curl -L -x "http://YOUR_KEY:antibot=true@proxy.zenrows.com:8001" -k "https://httpbin.org/anything"
JavaScript Instructions
Interact with the page once the content is loaded. You can perform actions as a user would (e.g., click on an element), and ZenRows will execute them. Once the instructions finish, it will return the current HTML.
Following the click example, below are the instructions to click on a ".button-selector" element.
curl "https://api.zenrows.com/v1/?apikey=YOUR_KEY&url=https%3A%2F%2Fhttpbin.org%2Fanything&js_render=true&js_instructions=%5B%7B%22click%22%3A%22.button-selector%22%7D%5D"
curl -L -x "http://YOUR_KEY:js_render=true&js_instructions=%5B%7B%22click%22%3A%22.button-selector%22%7D%5D@proxy.zenrows.com:8001" -k "https://httpbin.org/anything"
The original instructions are a JSON array containing the commands to run.
[{"click": ".button-selector"}]
They then need to be stringified and encoded. You can use our Builder or an online tool to encode it.
`[{"click":".button-selector"}]` // stringified
`%5B%7B%22click%22%3A%22.button-selector%22%7D%5D` // encoded
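The stringify and encode steps can also be done in code instead of with an online tool. A minimal sketch using the Python standard library, with variable names chosen for illustration:

```python
import json
from urllib.parse import quote

# The original instructions: a list with one command.
instructions = [{"click": ".button-selector"}]

# Stringify without extra whitespace.
stringified = json.dumps(instructions, separators=(",", ":"))

# Percent-encode every reserved character so the value is safe in a query string.
encoded = quote(stringified, safe="")

print(stringified)
print(encoded)
```

The encoded value is what goes after &js_instructions= in the request URL.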
&js_instructions=[{...}] accepts an array of commands, and you can add as many as needed. ZenRows will execute them in order. Here is a summary of the actions you can run.
{"click": ".button-selector"} // Click on the first element that matches the CSS Selector
{"wait_for": ".late-selector"} // Wait for a given CSS Selector to load in the DOM
{"wait": 2000} // Wait an exact amount of time in ms
{"fill": [".input-selector", "value"]} // Fill in an input
{"check": ".checkbox-selector"} // Check a checkbox input
{"uncheck": ".checkbox-selector"} // Uncheck a checkbox input
{"select_option": [".select-selector", "option_value"]} // Select an option by its value
{"scroll_y": 1500} // Vertical scroll in pixels
{"scroll_x": 1500} // Horizontal scroll in pixels
{"evaluate": "document.body.style.backgroundColor = '#c4b5fd';"} // Execute JavaScript code
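Since the commands run in order, a sequence can be written as a plain list and serialized once. The sketch below combines a few of the actions listed above; the selectors are the placeholder ones from this page, and the params dictionary is only an illustration of how the value might be passed to an HTTP client.

```python
import json

# Commands are executed in the order they appear in the list.
instructions = [
    {"fill": [".input-selector", "value"]},   # type into an input
    {"click": ".button-selector"},            # click the first matching element
    {"wait_for": ".late-selector"},           # wait for content to appear
    {"scroll_y": 1500},                       # scroll down 1500 pixels
]

params = {
    "js_render": "true",  # JavaScript Instructions require JavaScript rendering
    "js_instructions": json.dumps(instructions, separators=(",", ":")),
}

print(params["js_instructions"])
```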
These instructions won't work inside iframes; those need a separate set. The syntax is similar but takes an extra parameter to choose the iframe.
For security, an iframe's content isn't returned in the response. To get that content, use frame_reveal. It will append a node with the content encoded in base64 to avoid problems with JS or HTML injection.
{"frame_click": ["#iframe", ".button-selector"]}
{"frame_wait_for": ["#iframe", ".late-selector"]}
{"frame_fill": ["#iframe", ".input-selector", "value"]}
{"frame_check": ["#iframe", ".checkbox-selector"]}
{"frame_uncheck": ["#iframe", ".checkbox-selector"]}
{"frame_select_option": ["#iframe", ".select-selector", "option_value"]}
{"frame_evaluate": ["iframe-name", "document.body.style.backgroundColor = '#c4b5fd';"]} // won't work with selectors, will match iframe's name or URL
{"frame_reveal": "#iframe"} // will create a node with the class "iframe-content-element"
Requires javascript rendering (&js_render=true).
Visit our JavaScript Instructions guide for a detailed explanation for each action and usage examples.
Wait For Selector
Sometimes you may want to wait for a given CSS Selector to load in the DOM before ZenRows returns the content. You can get this behaviour by adding the &wait_for=.background-load parameter to the request.
Requires javascript rendering (&js_render=true).
curl "https://api.zenrows.com/v1/?apikey=YOUR_KEY&url=https%3A%2F%2Fhttpbin.org%2Fanything&js_render=true&wait_for=.content"
curl -L -x "http://YOUR_KEY:js_render=true&wait_for=.content@proxy.zenrows.com:8001" -k "https://httpbin.org/anything"
Wait Milliseconds
Some websites take a long time to load. If you need to wait a fixed amount of time until everything is loaded, you can define the time in milliseconds with the &wait=10000 parameter, which will wait 10000 milliseconds (10 seconds) before returning the HTML. The maximum wait time is 30 seconds.
Requires javascript rendering (&js_render=true).
curl "https://api.zenrows.com/v1/?apikey=YOUR_KEY&url=https%3A%2F%2Fhttpbin.org%2Fanything&js_render=true&wait=10000"
curl -L -x "http://YOUR_KEY:js_render=true&wait=10000@proxy.zenrows.com:8001" -k "https://httpbin.org/anything"
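Because the wait is expressed in milliseconds and capped at 30 seconds, it can help to validate the value before sending it. The helper below is a hypothetical sketch, not part of the API; how the service itself reacts to an out-of-range value is not covered here.

```python
MAX_WAIT_MS = 30_000  # 30 seconds, the documented maximum


def wait_params(wait_ms):
    """Build the query parameters for a fixed wait, rejecting out-of-range values."""
    if not 0 <= wait_ms <= MAX_WAIT_MS:
        raise ValueError(f"wait must be between 0 and {MAX_WAIT_MS} ms, got {wait_ms}")
    # The wait parameter requires JavaScript rendering.
    return {"js_render": "true", "wait": str(wait_ms)}


print(wait_params(10000))
```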
Block Resources
Many websites load dozens of resources delaying the HTML response. You can block specific resources from loading using the &block_resources=image parameter.
The ZenRows API lets you block the following resources: stylesheet, image, media, font, script, texttrack, xhr, fetch, eventsource, websocket, manifest, other. Separate them with commas to block multiple resources.
ZenRows will block certain resources by default, such as stylesheets or images, to speed up your scraping. You can disable blocking by setting it to "none": block_resources=none.
Requires javascript rendering (&js_render=true).
curl "https://api.zenrows.com/v1/?apikey=YOUR_KEY&url=https%3A%2F%2Fhttpbin.org%2Fanything&js_render=true&block_resources=image%2Cmedia%2Cfont"
curl -L -x "http://YOUR_KEY:js_render=true&block_resources=image%2Cmedia%2Cfont@proxy.zenrows.com:8001" -k "https://httpbin.org/anything"
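In the Proxy examples, the request parameters sit between the API key and the @ sign of the proxy URL, encoded like a query string. Assuming that layout holds for every parameter, a proxy URL can be assembled as in this sketch (build_proxy_url is a hypothetical helper name):

```python
from urllib.parse import urlencode

PROXY_HOST = "proxy.zenrows.com:8001"


def build_proxy_url(apikey, params=None):
    """Return a proxy URL with the API key as user and the parameters as password."""
    options = urlencode(params or {})  # empty string when there are no parameters
    return f"http://{apikey}:{options}@{PROXY_HOST}"


proxy_url = build_proxy_url(
    "YOUR_KEY",
    {"js_render": "true", "block_resources": "image,media,font"},
)
print(proxy_url)
```

The resulting string can be used as the value for both the http and https proxy entries of your HTTP client.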
JSON Response
Add &json_response=true to get the content in JSON. The response is an object with two fields, html and xhr.
- HTML will contain the content of the page. You'll have to decode it since it will be encoded in JSON.
- XHR will be an array with one object per performed request. Each contains the URL, body, status code, and more. See the example below.
Requires javascript rendering (&js_render=true).
curl "https://api.zenrows.com/v1/?apikey=YOUR_KEY&url=https%3A%2F%2Fhttpbin.org%2Fanything&js_render=true&json_response=true"
curl -L -x "http://YOUR_KEY:js_render=true&json_response=true@proxy.zenrows.com:8001" -k "https://httpbin.org/anything"
And the response will look like this:
{
"html": "<!DOCTYPE html><html>...</html>",
"xhr": [{
"url": "https://www.example.com/fetch",
"body": "{\"success\": true}\n",
"status_code": 200,
"method": "GET",
"headers": {
"content-encoding": "gzip",
// ...
},
"request_headers": {
"accept": "*/*",
// ...
}
}]
}
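Reading such a response comes down to parsing the JSON and walking the xhr array. The sketch below works on a trimmed copy of the sample above (the elided header entries are left out so the text is valid JSON); with a live request you would parse the response body instead.

```python
import json

# Trimmed copy of the sample response; a raw string keeps the JSON escapes intact.
sample = r'''
{
  "html": "<!DOCTYPE html><html>...</html>",
  "xhr": [{
    "url": "https://www.example.com/fetch",
    "body": "{\"success\": true}\n",
    "status_code": 200,
    "method": "GET",
    "headers": {"content-encoding": "gzip"},
    "request_headers": {"accept": "*/*"}
  }]
}
'''

data = json.loads(sample)
html = data["html"]  # already decoded from its JSON string form

for call in data["xhr"]:
    print(call["method"], call["url"], call["status_code"])

# The body is a string; here it happens to hold JSON, which is not guaranteed.
first_body = json.loads(data["xhr"][0]["body"])
print(first_body)
```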
Window Width/Height
If you need to change the browser's window width and height, you can use the &window_width=1920 and &window_height=1080 parameters.
Requires javascript rendering (&js_render=true).
curl "https://api.zenrows.com/v1/?apikey=YOUR_KEY&url=https%3A%2F%2Fhttpbin.org%2Fanything&js_render=true&window_width=1920&window_height=1080"
curl -L -x "http://YOUR_KEY:js_render=true&window_width=1920&window_height=1080@proxy.zenrows.com:8001" -k "https://httpbin.org/anything"
Premium Proxies
Some websites are harder to scrape and block datacenter IPs. Premium Proxies come in handy to solve this problem. As the name suggests, these proxies come straight from ISP providers.
You can use Premium Proxies by adding &premium_proxy=true to the request. This request costs 10 credits.
curl "https://api.zenrows.com/v1/?apikey=YOUR_KEY&url=https%3A%2F%2Fhttpbin.org%2Fanything&premium_proxy=true"
curl -L -x "http://YOUR_KEY:premium_proxy=true@proxy.zenrows.com:8001" -k "https://httpbin.org/anything"
Geolocation
Some content is specific to a region. In these cases, you may want to make your request from a given country.
You only need to add &premium_proxy=true&proxy_country=us to the request. Geolocation requires Premium Proxies enabled (it costs 10-25 credits).
curl "https://api.zenrows.com/v1/?apikey=YOUR_KEY&url=https%3A%2F%2Fhttpbin.org%2Fanything&premium_proxy=true&proxy_country=us"
curl -L -x "http://YOUR_KEY:premium_proxy=true&proxy_country=us@proxy.zenrows.com:8001" -k "https://httpbin.org/anything"
Custom Headers
Custom Headers come in handy when you need to add your own headers (user agents, cookies, referrer, etc.) to the request.
You can enable Custom Headers by adding &custom_headers=true to the request.
curl -H "Referer: https://www.google.com" "https://api.zenrows.com/v1/?apikey=YOUR_KEY&url=https%3A%2F%2Fhttpbin.org%2Fanything&custom_headers=true"
curl -H "Referer: https://www.google.com" -L -x "http://YOUR_KEY:custom_headers=true@proxy.zenrows.com:8001" -k "https://httpbin.org/anything"
Session ID
Use the same IP for each API Request by using &session_id=12345. ZenRows will maintain a session for each ID for 10 minutes.
You will need to keep track of them on your side by storing each Session ID so you can reuse them.
curl "https://api.zenrows.com/v1/?apikey=YOUR_KEY&url=https%3A%2F%2Fhttpbin.org%2Fanything&session_id=12345"
curl -L -x "http://YOUR_KEY:session_id=12345@proxy.zenrows.com:8001" -k "https://httpbin.org/anything"
Device
If you need to use either desktop or mobile user agents in the headers, add the &device=desktop or &device=mobile parameter to the request.
curl "https://api.zenrows.com/v1/?apikey=YOUR_KEY&url=https%3A%2F%2Fhttpbin.org%2Fanything&device=desktop"
curl -L -x "http://YOUR_KEY:device=desktop@proxy.zenrows.com:8001" -k "https://httpbin.org/anything"
Original HTTP Code
The ZenRows API returns its own HTTP status codes depending on the result of the request. If you want the status code provided by the website instead, enable &original_status=true.
curl "https://api.zenrows.com/v1/?apikey=YOUR_KEY&url=https%3A%2F%2Fhttpbin.org%2Fanything&original_status=true"
curl -L -x "http://YOUR_KEY:original_status=true@proxy.zenrows.com:8001" -k "https://httpbin.org/anything"
Data Extraction: CSS Selectors
You can use CSS Selectors for data extraction. You only need to add &css_extractor={"links":"a @href"} to the request to use this feature.
The table below shows some examples of extraction rules and their output.
extraction rules | sample html | value | json output |
---|---|---|---|
{"divs":"div"} | <div>text0</div> | text | {"divs": "text0"} |
{"divs":"div"} | <div>text1</div><div>text2</div> | text | {"divs": ["text1", "text2"]} |
{"links":"a @href"} | <a href="#register">Register</a> | href attribute | {"links": "#register"} |
{"hidden":"input[type=hidden] @value"} | <input type="hidden" name="_token" value="f23g23g.b9u1bg91g.zv97" /> | value attribute | {"hidden": "f23g23g.b9u1bg91g.zv97"} |
{"class":"button.submit @data-v"} | <button class="submit" data-v="register-user">click</button> | data-v attribute with submit class | {"class": "register-user"} |
{"emails":"a[href^='mailto:'] @href"} | <a href="mailto:[email protected]">email 1</a><a href="mailto:[email protected]">email 2</a> | href attribute for links starting with mailto: | {"emails": ["[email protected]", "[email protected]"]} |
If you are interested in learning more, you can find a complete reference of CSS Selectors here.
curl "https://api.zenrows.com/v1/?apikey=YOUR_KEY&url=https%3A%2F%2Fhttpbin.org%2Fanything&css_extractor=%7B%22links%22%3A%22a%20%40href%22%2C%20%22images%22%3A%22img%20%40src%22%7D"
curl -L -x "http://YOUR_KEY:css_extractor=%7B%22links%22%3A%22a%20%40href%22%2C%20%22images%22%3A%22img%20%40src%22%7D@proxy.zenrows.com:8001" -k "https://httpbin.org/anything"
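The extraction rules are a JSON object, so they can be written as a dictionary and serialized when building the request. The sketch below also includes a small helper for reading results: as the table shows, a rule yields a single string for one match and a list for several. Both build_query and as_list are hypothetical names used for illustration.

```python
import json
from urllib.parse import quote, urlencode

rules = {"links": "a @href", "images": "img @src"}


def build_query(apikey, target_url, rules):
    """Return the encoded query string for a css_extractor request."""
    params = {
        "apikey": apikey,
        "url": target_url,
        "css_extractor": json.dumps(rules, separators=(",", ":")),
    }
    # quote_via=quote encodes spaces as %20 rather than '+'.
    return urlencode(params, quote_via=quote)


def as_list(value):
    """Normalize an extracted value: one match is a string, several are a list."""
    return value if isinstance(value, list) else [value]


query = build_query("YOUR_KEY", "https://httpbin.org/anything", rules)
print(query)
print(as_list("text0"), as_list(["text1", "text2"]))
```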
Data Extraction: Auto Parsing
ZenRows API will return the HTML of the URL by default. Enabling Autoparse uses our extraction algorithms to parse data in JSON format automatically.
Add &autoparse=true to the request for this feature.
curl "https://api.zenrows.com/v1/?apikey=YOUR_KEY&url=https%3A%2F%2Fwww.amazon.com%2Fdp%2FB01LD5GO7I%2F&autoparse=true"
curl -L -x "http://YOUR_KEY:autoparse=true@proxy.zenrows.com:8001" -k "https://www.amazon.com/dp/B01LD5GO7I/"
POST / PUT Requests
Send POST / PUT requests as usual with your chosen language. ZenRows will transparently forward the data to the target site.
The return value will be the original response's content. Headers and cookies will also be part of the response. The way to access them will depend on the manner of calling.
curl "https://api.zenrows.com/v1/?apikey=YOUR_KEY&url=https%3A%2F%2Fhttpbin.org%2Fanything" -X "POST" --data "key1=value1&key2=value2"
curl -L -x "http://YOUR_KEY:@proxy.zenrows.com:8001" -k "https://httpbin.org/anything" -X "POST" --data "key1=value1&key2=value2"
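The same POST can be prepared with the Python standard library. This sketch only constructs the request object so that it can be inspected; passing it to urllib.request.urlopen would actually send it. The form fields are the ones from the curl example above.

```python
from urllib.parse import urlencode
from urllib.request import Request

api_url = "https://api.zenrows.com/v1/?" + urlencode(
    {"apikey": "YOUR_KEY", "url": "https://httpbin.org/anything"}
)

# Form data is sent in the body, exactly as it would be to the target site.
form_data = urlencode({"key1": "value1", "key2": "value2"}).encode("ascii")

request = Request(api_url, data=form_data, method="POST")

print(request.get_method(), request.full_url)
print(request.data)
```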
Download Files and Pictures
ZenRows will download images, PDFs or any type of file. Instead of reading the response's content as text, you can store it directly in a file.
There is a size limit and we don't recommend using ZenRows to download big files.
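One way to store a download without holding it all in memory is to copy the response to disk in chunks. The sketch below uses only the standard library; save_stream and download are hypothetical helper names, the chunk size and timeout are arbitrary choices, and download is defined but not called here.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

API_ENDPOINT = "https://api.zenrows.com/v1/"


def save_stream(stream, destination, chunk_size=64 * 1024):
    """Copy a binary stream to a file in chunks and return the bytes written."""
    written = 0
    with open(destination, "wb") as out:  # binary mode: no text decoding
        while True:
            chunk = stream.read(chunk_size)
            if not chunk:
                break
            out.write(chunk)
            written += len(chunk)
    return written


def download(apikey, file_url, destination, timeout=60):
    """Fetch file_url through the API and store the raw bytes at destination."""
    query = urlencode({"apikey": apikey, "url": file_url})
    with urlopen(f"{API_ENDPOINT}?{query}", timeout=timeout) as response:
        return save_stream(response, destination)
```

A call such as download("YOUR_KEY", "https://example.com/file.pdf", "file.pdf") would then write the file next to your script; the target URL here is only a placeholder.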
Credits Usage
Check credits consumption programmatically by calling the endpoint /usage. Usage calls will not count for concurrency, and results are available in real-time.
curl "https://api.zenrows.com/v1/usage?apikey=YOUR_KEY"
API Key required
To access API functionality, you need to have a valid API Key. This unique key will keep all your requests authorized.
Start using the API by creating your API Key now.
# pip install requests
import requests
url = 'https://httpbin.org/anything'
apikey = 'YOUR_KEY'
params = {
'url': url,
'apikey': apikey,
}
response = requests.get('https://api.zenrows.com/v1/', params=params)
print(response.text)
# pip install requests
import requests
url = "https://httpbin.org/anything"
proxy = "http://YOUR_KEY:@proxy.zenrows.com:8001"
proxies = {"http": proxy, "https": proxy}
response = requests.get(url, proxies=proxies, verify=False)
print(response.text)
# pip install zenrows
from zenrows import ZenRowsClient
client = ZenRowsClient("YOUR_KEY")
url = "https://httpbin.org/anything"
response = client.get(url)
print(response.text)
URL required
The URL is the page you want to scrape. It needs to be encoded.
# pip install requests
import requests
url = 'https://httpbin.org/anything'
apikey = 'YOUR_KEY'
params = {
'url': url,
'apikey': apikey,
}
response = requests.get('https://api.zenrows.com/v1/', params=params)
print(response.text)
# pip install requests
import requests
url = "https://httpbin.org/anything"
proxy = "http://YOUR_KEY:@proxy.zenrows.com:8001"
proxies = {"http": proxy, "https": proxy}
response = requests.get(url, proxies=proxies, verify=False)
print(response.text)
# pip install zenrows
from zenrows import ZenRowsClient
client = ZenRowsClient("YOUR_KEY")
url = "https://httpbin.org/anything"
response = client.get(url)
print(response.text)
JavaScript Rendering
Some websites rely heavily on JavaScript to load content. Enable this feature if you need to extract data that are loaded dynamically.
You can enable JavaScript by adding &js_render=true to the request. This request costs 5 credits.
# pip install requests
import requests
url = 'https://httpbin.org/anything'
apikey = 'YOUR_KEY'
params = {
'url': url,
'apikey': apikey,
'js_render': 'true',
}
response = requests.get('https://api.zenrows.com/v1/', params=params)
print(response.text)
# pip install requests
import requests
url = "https://httpbin.org/anything"
proxy = "http://YOUR_KEY:js_render=true@proxy.zenrows.com:8001"
proxies = {"http": proxy, "https": proxy}
response = requests.get(url, proxies=proxies, verify=False)
print(response.text)
# pip install zenrows
from zenrows import ZenRowsClient
client = ZenRowsClient("YOUR_KEY")
url = "https://httpbin.org/anything"
params = {
"js_render": "true"
}
response = client.get(url, params=params)
print(response.text)
Anti-bot
Some websites protect their content with anti-bot solutions such as Cloudflare, Akamai, or Datadome. Enable Anti-bot to bypass them without any hassle. Bear in mind that adding custom headers might overwrite our configuration. To wait for the expected content to load, combine Anti-bot with the Wait For Selector feature (see below).
Add &antibot=true to the request for this feature. This request costs 5 credits.
# pip install requests
import requests
url = 'https://httpbin.org/anything'
apikey = 'YOUR_KEY'
params = {
'url': url,
'apikey': apikey,
'antibot': 'true',
}
response = requests.get('https://api.zenrows.com/v1/', params=params)
print(response.text)
# pip install requests
import requests
url = "https://httpbin.org/anything"
proxy = "http://YOUR_KEY:antibot=true@proxy.zenrows.com:8001"
proxies = {"http": proxy, "https": proxy}
response = requests.get(url, proxies=proxies, verify=False)
print(response.text)
# pip install zenrows
from zenrows import ZenRowsClient
client = ZenRowsClient("YOUR_KEY")
url = "https://httpbin.org/anything"
params = {
"antibot": "true"
}
response = client.get(url, params=params)
print(response.text)
JavaScript Instructions
Interact with the page once the content is loaded. You can perform actions as a user would (e.g., click on an element), and ZenRows will execute them. Once the instructions finish, it will return the current HTML.
Following the click example, below are the instructions to click on a ".button-selector" element.
# pip install requests
import requests
url = 'https://httpbin.org/anything'
apikey = 'YOUR_KEY'
params = {
'url': url,
'apikey': apikey,
'js_render': 'true',
'js_instructions': '[{"click":".button-selector"}]',
}
response = requests.get('https://api.zenrows.com/v1/', params=params)
print(response.text)
# pip install requests
import requests
url = "https://httpbin.org/anything"
proxy = "http://YOUR_KEY:js_render=true&js_instructions=%5B%7B%22click%22%3A%22.button-selector%22%7D%5D@proxy.zenrows.com:8001"
proxies = {"http": proxy, "https": proxy}
response = requests.get(url, proxies=proxies, verify=False)
print(response.text)
# pip install zenrows
from zenrows import ZenRowsClient
client = ZenRowsClient("YOUR_KEY")
url = "https://httpbin.org/anything"
params = {
"js_render": "true",
"js_instructions": "[{\"click\":\".button-selector\"}]"
}
response = client.get(url, params=params)
print(response.text)
The original instructions are a JSON array containing the commands to run.
[{"click": ".button-selector"}]
They then need to be stringified and encoded. You can use our Builder or an online tool to encode it. Tools like requests will do it for you if passed as parameters.
`[{"click":".button-selector"}]` // stringified
`%5B%7B%22click%22%3A%22.button-selector%22%7D%5D` // encoded
&js_instructions=[{...}] accepts an array of commands, and you can add as many as needed. ZenRows will execute them in order. Here is a summary of the actions you can run.
{"click": ".button-selector"} // Click on the first element that matches the CSS Selector
{"wait_for": ".late-selector"} // Wait for a given CSS Selector to load in the DOM
{"wait": 2000} // Wait an exact amount of time in ms
{"fill": [".input-selector", "value"]} // Fill in an input
{"check": ".checkbox-selector"} // Check a checkbox input
{"uncheck": ".checkbox-selector"} // Uncheck a checkbox input
{"select_option": [".select-selector", "option_value"]} // Select an option by its value
{"scroll_y": 1500} // Vertical scroll in pixels
{"scroll_x": 1500} // Horizontal scroll in pixels
{"evaluate": "document.body.style.backgroundColor = '#c4b5fd';"} // Execute JavaScript code
These instructions won't work inside iframes; those need a separate set. The syntax is similar but takes an extra parameter to choose the iframe.
For security, an iframe's content isn't returned in the response. To get that content, use frame_reveal. It will append a node with the content encoded in base64 to avoid problems with JS or HTML injection.
{"frame_click": ["#iframe", ".button-selector"]}
{"frame_wait_for": ["#iframe", ".late-selector"]}
{"frame_fill": ["#iframe", ".input-selector", "value"]}
{"frame_check": ["#iframe", ".checkbox-selector"]}
{"frame_uncheck": ["#iframe", ".checkbox-selector"]}
{"frame_select_option": ["#iframe", ".select-selector", "option_value"]}
{"frame_evaluate": ["iframe-name", "document.body.style.backgroundColor = '#c4b5fd';"]} // won't work with selectors, will match iframe's name or URL
{"frame_reveal": "#iframe"} // will create a node with the class "iframe-content-element"
Requires javascript rendering (&js_render=true).
Visit our JavaScript Instructions guide for a detailed explanation for each action and usage examples.
Wait For Selector
Sometimes you may want to wait for a given CSS Selector to load in the DOM before ZenRows returns the content. You can get this behaviour by adding the &wait_for=.background-load parameter to the request.
Requires javascript rendering (&js_render=true).
# pip install requests
import requests
url = 'https://httpbin.org/anything'
apikey = 'YOUR_KEY'
params = {
'url': url,
'apikey': apikey,
'js_render': 'true',
'wait_for': '.content',
}
response = requests.get('https://api.zenrows.com/v1/', params=params)
print(response.text)
# pip install requests
import requests
url = "https://httpbin.org/anything"
proxy = "http://YOUR_KEY:js_render=true&wait_for=.content@proxy.zenrows.com:8001"
proxies = {"http": proxy, "https": proxy}
response = requests.get(url, proxies=proxies, verify=False)
print(response.text)
# pip install zenrows
from zenrows import ZenRowsClient
client = ZenRowsClient("YOUR_KEY")
url = "https://httpbin.org/anything"
params = {
"js_render": "true",
"wait_for": ".content"
}
response = client.get(url, params=params)
print(response.text)
Wait Milliseconds
Some websites take a long time to load. If you need to wait a fixed amount of time until everything is loaded, you can define the time in milliseconds with the &wait=10000 parameter, which will wait 10000 milliseconds (10 seconds) before returning the HTML. The maximum wait time is 30 seconds.
Requires javascript rendering (&js_render=true).
# pip install requests
import requests
url = 'https://httpbin.org/anything'
apikey = 'YOUR_KEY'
params = {
'url': url,
'apikey': apikey,
'js_render': 'true',
'wait': '10000',
}
response = requests.get('https://api.zenrows.com/v1/', params=params)
print(response.text)
# pip install requests
import requests
url = "https://httpbin.org/anything"
proxy = "http://YOUR_KEY:js_render=true&wait=10000@proxy.zenrows.com:8001"
proxies = {"http": proxy, "https": proxy}
response = requests.get(url, proxies=proxies, verify=False)
print(response.text)
# pip install zenrows
from zenrows import ZenRowsClient
client = ZenRowsClient("YOUR_KEY")
url = "https://httpbin.org/anything"
params = {
"js_render": "true",
"wait": 10000
}
response = client.get(url, params=params)
print(response.text)
Block Resources
Many websites load dozens of resources delaying the HTML response. You can block specific resources from loading using the &block_resources=image parameter.
The ZenRows API lets you block the following resources: stylesheet, image, media, font, script, texttrack, xhr, fetch, eventsource, websocket, manifest, other. Separate them with commas to block multiple resources.
ZenRows will block certain resources by default, such as stylesheets or images, to speed up your scraping. You can disable blocking by setting it to "none": block_resources=none.
Requires javascript rendering (&js_render=true).
# pip install requests
import requests
url = 'https://httpbin.org/anything'
apikey = 'YOUR_KEY'
params = {
'url': url,
'apikey': apikey,
'js_render': 'true',
'block_resources': 'image,media,font',
}
response = requests.get('https://api.zenrows.com/v1/', params=params)
print(response.text)
# pip install requests
import requests
url = "https://httpbin.org/anything"
proxy = "http://YOUR_KEY:js_render=true&block_resources=image%2Cmedia%2Cfont@proxy.zenrows.com:8001"
proxies = {"http": proxy, "https": proxy}
response = requests.get(url, proxies=proxies, verify=False)
print(response.text)
# pip install zenrows
from zenrows import ZenRowsClient
client = ZenRowsClient("YOUR_KEY")
url = "https://httpbin.org/anything"
params = {
"js_render": "true",
"block_resources": "image,media,font"
}
response = client.get(url, params=params)
print(response.text)
JSON Response
Add &json_response=true to get the content in JSON. The response is an object with two fields, html and xhr.
- HTML will contain the content of the page. You'll have to decode it since it will be encoded in JSON.
- XHR will be an array with one object per performed request. Each contains the URL, body, status code, and more. See the example below.
Requires javascript rendering (&js_render=true).
# pip install requests
import requests
url = 'https://httpbin.org/anything'
apikey = 'YOUR_KEY'
params = {
'url': url,
'apikey': apikey,
'js_render': 'true',
'json_response': 'true',
}
response = requests.get('https://api.zenrows.com/v1/', params=params)
print(response.text)
# pip install requests
import requests
url = "https://httpbin.org/anything"
proxy = "http://YOUR_KEY:js_render=true&json_response=true@proxy.zenrows.com:8001"
proxies = {"http": proxy, "https": proxy}
response = requests.get(url, proxies=proxies, verify=False)
print(response.text)
# pip install zenrows
from zenrows import ZenRowsClient
client = ZenRowsClient("YOUR_KEY")
url = "https://httpbin.org/anything"
params = {
"js_render": "true",
"json_response": "true"
}
response = client.get(url, params=params)
print(response.text)
And the response will look like this:
{
"html": "<!DOCTYPE html><html>...</html>",
"xhr": [{
"url": "https://www.example.com/fetch",
"body": "{\"success\": true}\n",
"status_code": 200,
"method": "GET",
"headers": {
"content-encoding": "gzip",
// ...
},
"request_headers": {
"accept": "*/*",
// ...
}
}]
}
Window Width/Height
If you need to change the browser's window width and height, you can use the &window_width=1920 and &window_height=1080 parameters.
Requires javascript rendering (&js_render=true).
# pip install requests
import requests
url = 'https://httpbin.org/anything'
apikey = 'YOUR_KEY'
params = {
'url': url,
'apikey': apikey,
'js_render': 'true',
'window_width': '1920',
'window_height': '1080',
}
response = requests.get('https://api.zenrows.com/v1/', params=params)
print(response.text)
# pip install requests
import requests
url = "https://httpbin.org/anything"
proxy = "http://YOUR_KEY:js_render=true&window_width=1920&window_height=1080@proxy.zenrows.com:8001"
proxies = {"http": proxy, "https": proxy}
response = requests.get(url, proxies=proxies, verify=False)
print(response.text)
# pip install zenrows
from zenrows import ZenRowsClient
client = ZenRowsClient("YOUR_KEY")
url = "https://httpbin.org/anything"
params = {
"js_render": "true",
"window_width": 1920,
"window_height": 1080
}
response = client.get(url, params=params)
print(response.text)
Premium Proxies
Some websites are harder to scrape and block datacenter IPs. Premium Proxies come in handy to solve this problem. As the name suggests, these proxies come straight from ISP providers.
You can use Premium Proxies by adding &premium_proxy=true to the request. This request costs 10 credits.
# pip install requests
import requests
url = 'https://httpbin.org/anything'
apikey = 'YOUR_KEY'
params = {
'url': url,
'apikey': apikey,
'premium_proxy': 'true',
}
response = requests.get('https://api.zenrows.com/v1/', params=params)
print(response.text)
# pip install requests
import requests
url = "https://httpbin.org/anything"
proxy = "http://YOUR_KEY:premium_proxy=true@proxy.zenrows.com:8001"
proxies = {"http": proxy, "https": proxy}
response = requests.get(url, proxies=proxies, verify=False)
print(response.text)
# pip install zenrows
from zenrows import ZenRowsClient
client = ZenRowsClient("YOUR_KEY")
url = "https://httpbin.org/anything"
params = {
"premium_proxy": "true"
}
response = client.get(url, params=params)
print(response.text)
Geolocation
Some content is specific to a region. In these cases, you may want to make your request from a given country.
You only need to add &premium_proxy=true&proxy_country=us to the request. Geolocation requires Premium Proxies enabled (it costs 10-25 credits).
# pip install requests
import requests
url = 'https://httpbin.org/anything'
apikey = 'YOUR_KEY'
params = {
'url': url,
'apikey': apikey,
'premium_proxy': 'true',
'proxy_country': 'us',
}
response = requests.get('https://api.zenrows.com/v1/', params=params)
print(response.text)
# pip install requests
import requests
url = "https://httpbin.org/anything"
proxy = "http://YOUR_KEY:premium_proxy=true&proxy_country=us@proxy.zenrows.com:8001"
proxies = {"http": proxy, "https": proxy}
response = requests.get(url, proxies=proxies, verify=False)
print(response.text)
# pip install zenrows
from zenrows import ZenRowsClient
client = ZenRowsClient("YOUR_KEY")
url = "https://httpbin.org/anything"
params = {
"premium_proxy": "true",
"proxy_country": "us"
}
response = client.get(url, params=params)
print(response.text)
Custom Headers
Custom Headers come in handy when you need to add your own headers (user agents, cookies, referrer, etc.) to the request.
You can enable Custom Headers by adding &custom_headers=true to the request.
# pip install requests
import requests
url = 'https://httpbin.org/anything'
apikey = 'YOUR_KEY'
params = {
'url': url,
'apikey': apikey,
'custom_headers': 'true',
}
headers = {
# The standard HTTP header name is spelled "Referer".
'Referer': 'https://www.google.com',
}
response = requests.get('https://api.zenrows.com/v1/', params=params, headers=headers)
print(response.text)
# pip install requests
import requests
url = "https://httpbin.org/anything"
proxy = "http://YOUR_KEY:custom_headers=true@proxy.zenrows.com:8001"
proxies = {"http": proxy, "https": proxy}
headers = {
# The standard HTTP header name is spelled "Referer".
"Referer": "https://www.google.com",
}
response = requests.get(url, proxies=proxies, verify=False, headers=headers)
print(response.text)
# pip install zenrows
from zenrows import ZenRowsClient
client = ZenRowsClient("YOUR_KEY")
url = "https://httpbin.org/anything"
headers = {
# The standard HTTP header name is spelled "Referer".
"Referer": "https://www.google.com",
}
response = client.get(url, headers=headers)
print(response.text)
Session ID
Use the same IP for each API Request by using &session_id=12345. ZenRows will maintain a session for each ID for 10 minutes.
You will need to keep track of them on your side by storing each Session ID so you can reuse them.
# pip install requests
import requests
url = 'https://httpbin.org/anything'
apikey = 'YOUR_KEY'
params = {
'url': url,
'apikey': apikey,
'session_id': '12345',
}
response = requests.get('https://api.zenrows.com/v1/', params=params)
print(response.text)
# pip install requests
import requests
url = "https://httpbin.org/anything"
proxy = "http://YOUR_KEY:session_id=12345@proxy.zenrows.com:8001"
proxies = {"http": proxy, "https": proxy}
response = requests.get(url, proxies=proxies, verify=False)
print(response.text)
# pip install zenrows
from zenrows import ZenRowsClient
client = ZenRowsClient("YOUR_KEY")
url = "https://httpbin.org/anything"
params = {
"session_id": 12345
}
response = client.get(url, params=params)
print(response.text)
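Keeping track of Session IDs on your side could look roughly like the sketch below. `SessionTracker`, its labels, and the ID range are hypothetical choices for illustration. The sketch also assumes the 10-minute window starts when the ID is first used, which the text above does not specify.

```python
import random
import time

SESSION_TTL_SECONDS = 10 * 60  # a session is kept for 10 minutes


class SessionTracker:
    """Remember one Session ID per label and replace it once it may have expired."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock      # injectable so the expiry logic can be tested
        self._sessions = {}      # label -> (session_id, created_at)

    def get(self, label: str) -> int:
        now = self._clock()
        entry = self._sessions.get(label)
        if entry is not None and now - entry[1] < SESSION_TTL_SECONDS:
            return entry[0]      # still within the window: reuse the same ID
        session_id = random.randint(1, 99999)  # assumed range, not from the docs above
        self._sessions[label] = (session_id, now)
        return session_id


tracker = SessionTracker()
params = {"session_id": tracker.get("checkout-flow")}
print(params)
```

Requests that share a label reuse the same `session_id` until the window elapses, after which a fresh ID is generated.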
Device
If you need to use either desktop or mobile user agents in the headers, you can add the &device=desktop or &device=mobile parameter to the request.
# pip install requests
import requests
url = 'https://httpbin.org/anything'
apikey = 'YOUR_KEY'
params = {
'url': url,
'apikey': apikey,
'device': 'desktop',
}
response = requests.get('https://api.zenrows.com/v1/', params=params)
print(response.text)
# pip install requests
import requests
url = "https://httpbin.org/anything"
proxy = "http://YOUR_KEY:device=desktop@proxy.zenrows.com:8001"
proxies = {"http": proxy, "https": proxy}
response = requests.get(url, proxies=proxies, verify=False)
print(response.text)
# pip install zenrows
from zenrows import ZenRowsClient
client = ZenRowsClient("YOUR_KEY")
url = "https://httpbin.org/anything"
params = {
"device": "desktop"
}
response = client.get(url, params=params)
print(response.text)
Original HTTP Code
ZenRows API returns HTTP Codes depending on the result of the request. If you want the status code provided by the website instead, add &original_status=true to the request.
# pip install requests
import requests
url = 'https://httpbin.org/anything'
apikey = 'YOUR_KEY'
params = {
'url': url,
'apikey': apikey,
'original_status': 'true',
}
response = requests.get('https://api.zenrows.com/v1/', params=params)
print(response.text)
# pip install requests
import requests
url = "https://httpbin.org/anything"
proxy = "http://YOUR_KEY:original_status=true@proxy.zenrows.com:8001"
proxies = {"http": proxy, "https": proxy}
response = requests.get(url, proxies=proxies, verify=False)
print(response.text)
# pip install zenrows
from zenrows import ZenRowsClient
client = ZenRowsClient("YOUR_KEY")
url = "https://httpbin.org/anything"
params = {
"original_status": "true"
}
response = client.get(url, params=params)
print(response.text)
Data Extraction: CSS Selectors
You can use CSS Selectors for data extraction. In the table below, you will find a list of examples of how to use them.
You only need to add &css_extractor={"links":"a @href"} to the request to use this feature.
Here are some examples:
extraction rules | sample html | value | json output |
---|---|---|---|
{"divs":"div"} | <div>text0</div> | text | {"divs": "text0"} |
{"divs":"div"} | <div>text1</div><div>text2</div> | text | {"divs": ["text1", "text2"]} |
{"links":"a @href"} | <a href="#register">Register</a> | href attribute | {"links": "#register"} |
{"hidden":"input[type=hidden] @value"} | <input type="hidden" name="_token" value="f23g23g.b9u1bg91g.zv97" /> | value attribute | {"hidden": "f23g23g.b9u1bg91g.zv97"} |
{"class":"button.submit @data-v"} | <button class="submit" data-v="register-user">click</button> | data-v attribute with submit class | {"class": "register-user"} |
{"emails":"a[href^='mailto:'] @href"} | <a href="mailto:[email protected]">email 1</a><a href="mailto:[email protected]">email 2</a> | href attribute for links starting with mailto: | {"emails": ["[email protected]", "[email protected]"]} |
If you are interested in learning more, you can find a complete reference of CSS Selectors here.
# pip install requests
import requests
url = 'https://httpbin.org/anything'
apikey = 'YOUR_KEY'
params = {
'url': url,
'apikey': apikey,
'css_extractor': '{"links":"a @href", "images":"img @src"}',
}
response = requests.get('https://api.zenrows.com/v1/', params=params)
print(response.text)
# pip install requests
import requests
url = "https://httpbin.org/anything"
proxy = "http://YOUR_KEY:css_extractor=%7B%22links%22%3A%22a%20%40href%22%2C%20%22images%22%3A%22img%20%40src%22%7D@proxy.zenrows.com:8001"
proxies = {"http": proxy, "https": proxy}
response = requests.get(url, proxies=proxies, verify=False)
print(response.text)
# pip install zenrows
from zenrows import ZenRowsClient
client = ZenRowsClient("YOUR_KEY")
url = "https://httpbin.org/anything"
params = {
"css_extractor": "{\"links\":\"a @href\", \"images\":\"img @src\"}"
}
response = client.get(url, params=params)
print(response.text)
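Hand-writing the JSON string is error-prone, so one option is to build it from a dictionary. The sketch below does that and also normalizes results, since the table above shows a single match coming back as a string and several matches as a list. `build_css_extractor` and `as_list` are hypothetical helper names, and the sample outputs are copied from the table rather than from a live request.

```python
import json


def build_css_extractor(rules: dict[str, str]) -> str:
    """Serialize extraction rules into the JSON string css_extractor expects."""
    return json.dumps(rules)


def as_list(value) -> list:
    """Normalize an extracted field: one match is a string, several are a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


params = {
    "css_extractor": build_css_extractor({"links": "a @href", "images": "img @src"}),
}

# Hypothetical outputs shaped like the rows in the table above.
single = json.loads('{"divs": "text0"}')
several = json.loads('{"divs": ["text1", "text2"]}')
print(as_list(single["divs"]), as_list(several["divs"]))
```

Normalizing to a list lets the calling code loop over results without checking how many elements matched.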
Data Extraction: Auto Parsing
ZenRows API will return the HTML of the URL by default. Enabling Autoparse uses our extraction algorithms to parse data in JSON format automatically.
Add &autoparse=true to the request for this feature.
# pip install requests
import requests
url = 'https://www.amazon.com/dp/B01LD5GO7I/'
apikey = 'YOUR_KEY'
params = {
'url': url,
'apikey': apikey,
'autoparse': 'true',
}
response = requests.get('https://api.zenrows.com/v1/', params=params)
print(response.text)
# pip install requests
import requests
url = "https://www.amazon.com/dp/B01LD5GO7I/"
proxy = "http://YOUR_KEY:autoparse=true@proxy.zenrows.com:8001"
proxies = {"http": proxy, "https": proxy}
response = requests.get(url, proxies=proxies, verify=False)
print(response.text)
# pip install zenrows
from zenrows import ZenRowsClient
client = ZenRowsClient("YOUR_KEY")
url = "https://www.amazon.com/dp/B01LD5GO7I/"
params = {
"autoparse": "true"
}
response = client.get(url, params=params)
print(response.text)
POST / PUT Requests
Send POST / PUT requests as usual with your chosen language. ZenRows will transparently forward the data to the target site.
The return value will be the original response's content. Headers and cookies will also be part of the response. The way to access them will depend on the manner of calling.
# pip install requests
import requests
url = 'https://httpbin.org/anything'
apikey = 'YOUR_KEY'
params = {
'url': url,
'apikey': apikey,
}
data = {
'key1': 'value1',
'key2': 'value2',
}
response = requests.post('https://api.zenrows.com/v1/', params=params, data=data)
print(response.text)
# pip install requests
import requests
url = "https://httpbin.org/anything"
proxy = "http://YOUR_KEY:@proxy.zenrows.com:8001"
proxies = {"http": proxy, "https": proxy}
data = {
"key1": "value1",
"key2": "value2",
}
response = requests.post(url, proxies=proxies, verify=False, data=data)
print(response.text)
# pip install zenrows
from zenrows import ZenRowsClient
client = ZenRowsClient("YOUR_KEY")
url = "https://httpbin.org/anything"
data = {
"key1": "value1",
"key2": "value2",
}
response = client.post(url, data=data)
print(response.text)
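To read the headers and cookies mentioned above in API mode, a sketch like the following may help. It assumes ZenRows returns them with a `Zr-` prefix, as the sample headers in Getting started suggest. The helper name `zr_headers` and the sample dictionary are made up for illustration.

```python
ZR_PREFIX = "zr-"


def zr_headers(response_headers: dict[str, str]) -> dict[str, str]:
    """Collect headers carrying the Zr- prefix, with the prefix stripped."""
    return {
        name[len(ZR_PREFIX):]: value
        for name, value in response_headers.items()
        if name.lower().startswith(ZR_PREFIX)
    }


# Hypothetical response headers, modeled on the sample in Getting started.
headers = {
    "Content-Type": "text/html",
    "Zr-Content-Type": "text/html",
    "Zr-Final-Url": "https://www.example.com/",
}
print(zr_headers(headers))
```

With `requests`, you would pass `dict(response.headers)` in place of the sample dictionary.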
Download Files and Pictures
ZenRows will download images, PDFs or any type of file. Instead of reading the response's content as text, you can store it directly in a file.
There is a size limit and we don't recommend using ZenRows to download big files.
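A minimal sketch of storing the response in a file is shown below. `save_bytes` and `download` are hypothetical helper names. `download` assumes the `requests` library and the API endpoint used in the other examples, and it is defined here but not called. The demo at the end uses in-memory bytes in place of a real download.

```python
import tempfile
from pathlib import Path


def save_bytes(content: bytes, path: str) -> int:
    """Write raw response bytes to disk and return the number of bytes written."""
    return Path(path).write_bytes(content)


def download(url: str, apikey: str, path: str) -> int:
    """Fetch url through the API and store the body as-is (not called here)."""
    import requests  # pip install requests

    response = requests.get(
        "https://api.zenrows.com/v1/",
        params={"url": url, "apikey": apikey},
    )
    response.raise_for_status()
    # response.content is bytes; response.text would try to decode them as text.
    return save_bytes(response.content, path)


# Demo with in-memory bytes standing in for a downloaded file.
target = Path(tempfile.mkdtemp()) / "sample.bin"
written = save_bytes(b"\x89PNG\r\n", str(target))
print(written, target.read_bytes() == b"\x89PNG\r\n")
```

Writing `response.content` rather than `response.text` keeps binary files such as images and PDFs intact.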
Credits Usage
Check credits consumption programmatically by calling the endpoint /usage. Usage calls will not count for concurrency, and results are available in real-time.
# pip install requests
import requests
response = requests.get('https://api.zenrows.com/v1/usage?apikey=YOUR_KEY')
print(response.text)
API Key required
To access API functionality, you need to have a valid API Key. This unique key will keep all your requests authorized.
Start using the API by creating your API Key now.
// npm install axios
const axios = require('axios');
const url = 'https://httpbin.org/anything';
const apikey = 'YOUR_KEY';
axios({
url: 'https://api.zenrows.com/v1/',
method: 'GET',
params: {
'url': url,
'apikey': apikey,
},
})
.then(response => console.log(response.data))
.catch(error => console.log(error));
// npm install axios http-proxy-agent https-proxy-agent
const axios = require("axios");
const HttpProxyAgent = require("http-proxy-agent");
const HttpsProxyAgent = require("https-proxy-agent");
const url = "https://httpbin.org/anything";
const proxy = "http://YOUR_KEY:@proxy.zenrows.com:8001";
const httpAgent = new HttpProxyAgent(proxy);
const httpsAgent = new HttpsProxyAgent(proxy);
process.env.NODE_TLS_REJECT_UNAUTHORIZED = "0";
axios({
url,
httpAgent,
httpsAgent,
method: "GET",
})
.then(response => console.log(response.data))
.catch(error => console.log(error));
// npm install zenrows
const { ZenRows } = require("zenrows");
(async () => {
const client = new ZenRows("YOUR_KEY");
const url = "https://httpbin.org/anything";
try {
const { data } = await client.get(url, {});
console.log(data);
} catch (error) {
console.error(error.message);
if (error.response) {
console.error(error.response.data);
}
}
})();
URL required
The URL is the page you want to scrape. It needs to be encoded.
// npm install axios
const axios = require('axios');
const url = 'https://httpbin.org/anything';
const apikey = 'YOUR_KEY';
axios({
url: 'https://api.zenrows.com/v1/',
method: 'GET',
params: {
'url': url,
'apikey': apikey,
},
})
.then(response => console.log(response.data))
.catch(error => console.log(error));
// npm install axios http-proxy-agent https-proxy-agent
const axios = require("axios");
const HttpProxyAgent = require("http-proxy-agent");
const HttpsProxyAgent = require("https-proxy-agent");
const url = "https://httpbin.org/anything";
const proxy = "http://YOUR_KEY:@proxy.zenrows.com:8001";
const httpAgent = new HttpProxyAgent(proxy);
const httpsAgent = new HttpsProxyAgent(proxy);
process.env.NODE_TLS_REJECT_UNAUTHORIZED = "0";
axios({
url,
httpAgent,
httpsAgent,
method: "GET",
})
.then(response => console.log(response.data))
.catch(error => console.log(error));
// npm install zenrows
const { ZenRows } = require("zenrows");
(async () => {
const client = new ZenRows("YOUR_KEY");
const url = "https://httpbin.org/anything";
try {
const { data } = await client.get(url, {});
console.log(data);
} catch (error) {
console.error(error.message);
if (error.response) {
console.error(error.response.data);
}
}
})();
JavaScript Rendering
Some websites rely heavily on JavaScript to load content. Enable this feature if you need to extract data that are loaded dynamically.
You can enable JavaScript rendering by adding &js_render=true to the request. This request costs 5 credits.
// npm install axios
const axios = require('axios');
const url = 'https://httpbin.org/anything';
const apikey = 'YOUR_KEY';
axios({
url: 'https://api.zenrows.com/v1/',
method: 'GET',
params: {
'url': url,
'apikey': apikey,
'js_render': 'true',
},
})
.then(response => console.log(response.data))
.catch(error => console.log(error));
// npm install axios http-proxy-agent https-proxy-agent
const axios = require("axios");
const HttpProxyAgent = require("http-proxy-agent");
const HttpsProxyAgent = require("https-proxy-agent");
const url = "https://httpbin.org/anything";
const proxy = "http://YOUR_KEY:js_render=true@proxy.zenrows.com:8001";
const httpAgent = new HttpProxyAgent(proxy);
const httpsAgent = new HttpsProxyAgent(proxy);
process.env.NODE_TLS_REJECT_UNAUTHORIZED = "0";
axios({
url,
httpAgent,
httpsAgent,
method: "GET",
})
.then(response => console.log(response.data))
.catch(error => console.log(error));
// npm install zenrows
const { ZenRows } = require("zenrows");
(async () => {
const client = new ZenRows("YOUR_KEY");
const url = "https://httpbin.org/anything";
try {
const { data } = await client.get(url, {
"js_render": "true"
});
console.log(data);
} catch (error) {
console.error(error.message);
if (error.response) {
console.error(error.response.data);
}
}
})();
Anti-bot
Some websites protect their content with anti-bot solutions such as Cloudflare, Akamai, or Datadome. Enable Anti-bot to bypass them. Bear in mind that adding custom headers might overwrite our configuration. To wait for the expected content to load, combine Anti-bot with the Wait For Selector feature (described below).
Add &antibot=true to the request for this feature. This request costs 5 credits.
// npm install axios
const axios = require('axios');
const url = 'https://httpbin.org/anything';
const apikey = 'YOUR_KEY';
axios({
url: 'https://api.zenrows.com/v1/',
method: 'GET',
params: {
'url': url,
'apikey': apikey,
'antibot': 'true',
},
})
.then(response => console.log(response.data))
.catch(error => console.log(error));
// npm install axios http-proxy-agent https-proxy-agent
const axios = require("axios");
const HttpProxyAgent = require("http-proxy-agent");
const HttpsProxyAgent = require("https-proxy-agent");
const url = "https://httpbin.org/anything";
const proxy = "http://YOUR_KEY:antibot=true@proxy.zenrows.com:8001";
const httpAgent = new HttpProxyAgent(proxy);
const httpsAgent = new HttpsProxyAgent(proxy);
process.env.NODE_TLS_REJECT_UNAUTHORIZED = "0";
axios({
url,
httpAgent,
httpsAgent,
method: "GET",
})
.then(response => console.log(response.data))
.catch(error => console.log(error));
// npm install zenrows
const { ZenRows } = require("zenrows");
(async () => {
const client = new ZenRows("YOUR_KEY");
const url = "https://httpbin.org/anything";
try {
const { data } = await client.get(url, {
"antibot": "true"
});
console.log(data);
} catch (error) {
console.error(error.message);
if (error.response) {
console.error(error.response.data);
}
}
})();
JavaScript Instructions
Interact with the page once the content is loaded. You can perform actions as a user would (e.g., click on an element), and ZenRows will execute them. Once the instructions finish, ZenRows returns the current HTML.
Following the click example, below are the instructions to click on a ".button-selector" element.
// npm install axios
const axios = require('axios');
const url = 'https://httpbin.org/anything';
const apikey = 'YOUR_KEY';
axios({
url: 'https://api.zenrows.com/v1/',
method: 'GET',
params: {
'url': url,
'apikey': apikey,
'js_render': 'true',
'js_instructions': '[{"click":".button-selector"}]',
},
})
.then(response => console.log(response.data))
.catch(error => console.log(error));
// npm install axios http-proxy-agent https-proxy-agent
const axios = require("axios");
const HttpProxyAgent = require("http-proxy-agent");
const HttpsProxyAgent = require("https-proxy-agent");
const url = "https://httpbin.org/anything";
const proxy = "http://YOUR_KEY:js_render=true&js_instructions=%5B%7B%22click%22%3A%22.button-selector%22%7D%5D@proxy.zenrows.com:8001";
const httpAgent = new HttpProxyAgent(proxy);
const httpsAgent = new HttpsProxyAgent(proxy);
process.env.NODE_TLS_REJECT_UNAUTHORIZED = "0";
axios({
url,
httpAgent,
httpsAgent,
method: "GET",
})
.then(response => console.log(response.data))
.catch(error => console.log(error));
// npm install zenrows
const { ZenRows } = require("zenrows");
(async () => {
const client = new ZenRows("YOUR_KEY");
const url = "https://httpbin.org/anything";
try {
const { data } = await client.get(url, {
"js_render": "true",
"js_instructions": "[{\"click\":\".button-selector\"}]"
});
console.log(data);
} catch (error) {
console.error(error.message);
if (error.response) {
console.error(error.response.data);
}
}
})();
The original instructions are a JSON array containing the commands to run.
[{"click": ".button-selector"}]
They then need to be stringified and encoded. You can use our Builder or an online tool to encode it. Tools like axios will do it for you if passed as parameters.
`[{"click":".button-selector"}]` // stringified
`%5B%7B%22click%22%3A%22.button-selector%22%7D%5D` // encoded
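The two steps above (stringify, then encode) can be sketched in Python with the standard library. `encode_instructions` is a hypothetical helper name. Compact separators are used so that the output matches the stringified form shown above.

```python
import json
from urllib.parse import quote


def encode_instructions(instructions: list[dict]) -> str:
    """Stringify the commands compactly, then percent-encode them for a URL."""
    stringified = json.dumps(instructions, separators=(",", ":"))
    return quote(stringified, safe="")


encoded = encode_instructions([{"click": ".button-selector"}])
print(encoded)
```

When the value goes through a client's `params` argument (as in the API examples), pass only the stringified form, because the client performs the percent-encoding itself and encoding twice would corrupt the value.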
&js_instructions=[{...}] accepts an array of commands, and you can add as many as needed. ZenRows will execute them in order. Here is a summary of the actions you can run.
{"click": ".button-selector"} // Click on the first element that matches the CSS Selector
{"wait_for": ".late-selector"} // Wait for a given CSS Selector to load in the DOM
{"wait": 2000} // Wait an exact amount of time in ms
{"fill": [".input-selector", "value"]} // Fill in an input
{"check": ".checkbox-selector"} // Check a checkbox input
{"uncheck": ".checkbox-selector"} // Uncheck a checkbox input
{"select_option": [".select-selector", "option_value"]} // Select an option by its value
{"scroll_y": 1500} // Vertical scroll in pixels
{"scroll_x": 1500} // Horizontal scroll in pixels
{"evaluate": "document.body.style.backgroundColor = '#c4b5fd';"} // Execute JavaScript code
These instructions won't work inside iframes; a separate set exists for that. The syntax is similar but takes an extra parameter to choose the iframe.
For security, an iframe's content isn't returned in the response. To get that content, use frame_reveal. It will append a node with the content encoded in base64 to avoid problems with JS or HTML injection.
{"frame_click": ["#iframe", ".button-selector"]}
{"frame_wait_for": ["#iframe", ".late-selector"]}
{"frame_fill": ["#iframe", ".input-selector", "value"]}
{"frame_check": ["#iframe", ".checkbox-selector"]}
{"frame_uncheck": ["#iframe", ".checkbox-selector"]}
{"frame_select_option": ["#iframe", ".select-selector", "option_value"]}
{"frame_evaluate": ["iframe-name", "document.body.style.backgroundColor = '#c4b5fd';"]} // won't work with selectors, will match iframe's name or URL
{"frame_reveal": "#iframe"} // will create a node with the class "iframe-content-element"
Requires JavaScript rendering (&js_render=true).
Visit our JavaScript Instructions guide for a detailed explanation of each action and usage examples.
Wait For Selector
Sometimes you may want to wait for a given CSS Selector to load in the DOM before ZenRows returns the content. You can get this behaviour by adding the &wait_for=.background-load parameter to the request.
Requires JavaScript rendering (&js_render=true).
// npm install axios
const axios = require('axios');
const url = 'https://httpbin.org/anything';
const apikey = 'YOUR_KEY';
axios({
url: 'https://api.zenrows.com/v1/',
method: 'GET',
params: {
'url': url,
'apikey': apikey,
'js_render': 'true',
'wait_for': '.content',
},
})
.then(response => console.log(response.data))
.catch(error => console.log(error));
// npm install axios http-proxy-agent https-proxy-agent
const axios = require("axios");
const HttpProxyAgent = require("http-proxy-agent");
const HttpsProxyAgent = require("https-proxy-agent");
const url = "https://httpbin.org/anything";
const proxy = "http://YOUR_KEY:js_render=true&wait_for=.content@proxy.zenrows.com:8001";
const httpAgent = new HttpProxyAgent(proxy);
const httpsAgent = new HttpsProxyAgent(proxy);
process.env.NODE_TLS_REJECT_UNAUTHORIZED = "0";
axios({
url,
httpAgent,
httpsAgent,
method: "GET",
})
.then(response => console.log(response.data))
.catch(error => console.log(error));
// npm install zenrows
const { ZenRows } = require("zenrows");
(async () => {
const client = new ZenRows("YOUR_KEY");
const url = "https://httpbin.org/anything";
try {
const { data } = await client.get(url, {
"js_render": "true",
"wait_for": ".content"
});
console.log(data);
} catch (error) {
console.error(error.message);
if (error.response) {
console.error(error.response.data);
}
}
})();
Wait Milliseconds
Some websites take a lot of time to load. If you need to wait a fixed amount of time until everything is loaded, you can define the time in milliseconds with the &wait=10000 parameter, which will wait 10000 milliseconds (10 seconds) before returning the HTML. The maximum wait time is 30 seconds.
Requires JavaScript rendering (&js_render=true).
// npm install axios
const axios = require('axios');
const url = 'https://httpbin.org/anything';
const apikey = 'YOUR_KEY';
axios({
url: 'https://api.zenrows.com/v1/',
method: 'GET',
params: {
'url': url,
'apikey': apikey,
'js_render': 'true',
'wait': '10000',
},
})
.then(response => console.log(response.data))
.catch(error => console.log(error));
// npm install axios http-proxy-agent https-proxy-agent
const axios = require("axios");
const HttpProxyAgent = require("http-proxy-agent");
const HttpsProxyAgent = require("https-proxy-agent");
const url = "https://httpbin.org/anything";
const proxy = "http://YOUR_KEY:js_render=true&wait=10000@proxy.zenrows.com:8001";
const httpAgent = new HttpProxyAgent(proxy);
const httpsAgent = new HttpsProxyAgent(proxy);
process.env.NODE_TLS_REJECT_UNAUTHORIZED = "0";
axios({
url,
httpAgent,
httpsAgent,
method: "GET",
})
.then(response => console.log(response.data))
.catch(error => console.log(error));
// npm install zenrows
const { ZenRows } = require("zenrows");
(async () => {
const client = new ZenRows("YOUR_KEY");
const url = "https://httpbin.org/anything";
try {
const { data } = await client.get(url, {
"js_render": "true",
"wait": 10000
});
console.log(data);
} catch (error) {
console.error(error.message);
if (error.response) {
console.error(error.response.data);
}
}
})();
Block Resources
Many websites load dozens of resources, delaying the HTML response. You can block specific resources from loading using the &block_resources=image parameter.
ZenRows API allows you to block the following resources: stylesheet, image, media, font, script, texttrack, xhr, fetch, eventsource, websocket, manifest, other. Separate them with commas to block multiple resources.
ZenRows will block certain resources by default, such as stylesheets or images, to speed up your scraping. You can disable blocking by setting the parameter to "none": block_resources=none.
Requires JavaScript rendering (&js_render=true).
// npm install axios
const axios = require('axios');
const url = 'https://httpbin.org/anything';
const apikey = 'YOUR_KEY';
axios({
url: 'https://api.zenrows.com/v1/',
method: 'GET',
params: {
'url': url,
'apikey': apikey,
'js_render': 'true',
'block_resources': 'image,media,font',
},
})
.then(response => console.log(response.data))
.catch(error => console.log(error));
// npm install axios http-proxy-agent https-proxy-agent
const axios = require("axios");
const HttpProxyAgent = require("http-proxy-agent");
const HttpsProxyAgent = require("https-proxy-agent");
const url = "https://httpbin.org/anything";
const proxy = "http://YOUR_KEY:js_render=true&block_resources=image%2Cmedia%2Cfont@proxy.zenrows.com:8001";
const httpAgent = new HttpProxyAgent(proxy);
const httpsAgent = new HttpsProxyAgent(proxy);
process.env.NODE_TLS_REJECT_UNAUTHORIZED = "0";
axios({
url,
httpAgent,
httpsAgent,
method: "GET",
})
.then(response => console.log(response.data))
.catch(error => console.log(error));
// npm install zenrows
const { ZenRows } = require("zenrows");
(async () => {
const client = new ZenRows("YOUR_KEY");
const url = "https://httpbin.org/anything";
try {
const { data } = await client.get(url, {
"js_render": "true",
"block_resources": "image,media,font"
});
console.log(data);
} catch (error) {
console.error(error.message);
if (error.response) {
console.error(error.response.data);
}
}
})();
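A small helper can catch typos in resource names before a request is sent. The sketch below is written in Python and checks names against the list of blockable resources given above. `build_block_resources` is a hypothetical name, and mapping an empty list to "none" is a design choice made for this example rather than documented behavior.

```python
ALLOWED_RESOURCES = {
    "stylesheet", "image", "media", "font", "script", "texttrack",
    "xhr", "fetch", "eventsource", "websocket", "manifest", "other",
}


def build_block_resources(resources: list[str]) -> str:
    """Validate resource types and join them with commas."""
    if not resources:
        return "none"  # explicit value that disables blocking
    unknown = [r for r in resources if r not in ALLOWED_RESOURCES]
    if unknown:
        raise ValueError(f"Unsupported resource types: {', '.join(unknown)}")
    return ",".join(resources)


value = build_block_resources(["image", "media", "font"])
print(value)
```

The returned string can be passed as the `block_resources` parameter alongside `js_render`.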
JSON Response
Add &json_response=true to the request to get the content in JSON, including XHR or Fetch requests. The response is an object with two fields: html and xhr.
- HTML will contain the content of the page. You'll have to decode it since it will be encoded in JSON.
- XHR will be an array with one object per performed request. Those will contain URL, body, status code and many more. See the example below.
Requires JavaScript rendering (&js_render=true).
// npm install axios
const axios = require('axios');
const url = 'https://httpbin.org/anything';
const apikey = 'YOUR_KEY';
axios({
url: 'https://api.zenrows.com/v1/',
method: 'GET',
params: {
'url': url,
'apikey': apikey,
'js_render': 'true',
'json_response': 'true',
},
})
.then(response => console.log(response.data))
.catch(error => console.log(error));
// npm install axios http-proxy-agent https-proxy-agent
const axios = require("axios");
const HttpProxyAgent = require("http-proxy-agent");
const HttpsProxyAgent = require("https-proxy-agent");
const url = "https://httpbin.org/anything";
const proxy = "http://YOUR_KEY:js_render=true&json_response=true@proxy.zenrows.com:8001";
const httpAgent = new HttpProxyAgent(proxy);
const httpsAgent = new HttpsProxyAgent(proxy);
process.env.NODE_TLS_REJECT_UNAUTHORIZED = "0";
axios({
url,
httpAgent,
httpsAgent,
method: "GET",
})
.then(response => console.log(response.data))
.catch(error => console.log(error));
// npm install zenrows
const { ZenRows } = require("zenrows");
(async () => {
const client = new ZenRows("YOUR_KEY");
const url = "https://httpbin.org/anything";
try {
const { data } = await client.get(url, {
"js_render": "true",
"json_response": "true"
});
console.log(data);
} catch (error) {
console.error(error.message);
if (error.response) {
console.error(error.response.data);
}
}
})();
And the response will look like this:
{
"html": "<!DOCTYPE html><html>...</html>",
"xhr": [{
"url": "https://www.example.com/fetch",
"body": "{\"success\": true}\n",
"status_code": 200,
"method": "GET",
"headers": {
"content-encoding": "gzip",
// ...
},
"request_headers": {
"accept": "*/*",
// ...
}
}]
}
Window Width/Height
If you need to change the browser's window width and height, you can use the &window_width=1920 and &window_height=1080 parameters.
Requires JavaScript rendering (&js_render=true).
// npm install axios
const axios = require('axios');
const url = 'https://httpbin.org/anything';
const apikey = 'YOUR_KEY';
axios({
url: 'https://api.zenrows.com/v1/',
method: 'GET',
params: {
'url': url,
'apikey': apikey,
'js_render': 'true',
'window_width': '1920',
'window_height': '1080',
},
})
.then(response => console.log(response.data))
.catch(error => console.log(error));
// npm install axios http-proxy-agent https-proxy-agent
const axios = require("axios");
const HttpProxyAgent = require("http-proxy-agent");
const HttpsProxyAgent = require("https-proxy-agent");
const url = "https://httpbin.org/anything";
const proxy = "http://YOUR_KEY:js_render=true&window_width=1920&window_height=1080@proxy.zenrows.com:8001";
const httpAgent = new HttpProxyAgent(proxy);
const httpsAgent = new HttpsProxyAgent(proxy);
process.env.NODE_TLS_REJECT_UNAUTHORIZED = "0";
axios({
url,
httpAgent,
httpsAgent,
method: "GET",
})
.then(response => console.log(response.data))
.catch(error => console.log(error));
// npm install zenrows
const { ZenRows } = require("zenrows");
(async () => {
const client = new ZenRows("YOUR_KEY");
const url = "https://httpbin.org/anything";
try {
const { data } = await client.get(url, {
"js_render": "true",
"window_width": 1920,
"window_height": 1080
});
console.log(data);
} catch (error) {
console.error(error.message);
if (error.response) {
console.error(error.response.data);
}
}
})();
Premium Proxies
Some websites are harder to scrape and block datacenter IPs. Premium Proxies come in handy to solve this problem. These proxies come straight from Internet Service Providers (ISPs).
You can use Premium Proxies by adding &premium_proxy=true to the request. This request costs 10 credits.
// npm install axios
const axios = require('axios');
const url = 'https://httpbin.org/anything';
const apikey = 'YOUR_KEY';
axios({
url: 'https://api.zenrows.com/v1/',
method: 'GET',
params: {
'url': url,
'apikey': apikey,
'premium_proxy': 'true',
},
})
.then(response => console.log(response.data))
.catch(error => console.log(error));
// npm install axios http-proxy-agent https-proxy-agent
const axios = require("axios");
const HttpProxyAgent = require("http-proxy-agent");
const HttpsProxyAgent = require("https-proxy-agent");
const url = "https://httpbin.org/anything";
const proxy = "http://YOUR_KEY:premium_proxy=true@proxy.zenrows.com:8001";
const httpAgent = new HttpProxyAgent(proxy);
const httpsAgent = new HttpsProxyAgent(proxy);
process.env.NODE_TLS_REJECT_UNAUTHORIZED = "0";
axios({
url,
httpAgent,
httpsAgent,
method: "GET",
})
.then(response => console.log(response.data))
.catch(error => console.log(error));
// npm install zenrows
const { ZenRows } = require("zenrows");
(async () => {
const client = new ZenRows("YOUR_KEY");
const url = "https://httpbin.org/anything";
try {
const { data } = await client.get(url, {
"premium_proxy": "true"
});
console.log(data);
} catch (error) {
console.error(error.message);
if (error.response) {
console.error(error.response.data);
}
}
})();
Geolocation
Some content is specific to a region. In these cases, you may want to make your request from a given country.
You only need to add &premium_proxy=true&proxy_country=us to the request. Geolocation requires Premium Proxies to be enabled, so the request costs 10-25 credits.
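Because proxy_country only takes effect together with Premium Proxies, it can help to build both parameters in one place. The sketch below is one possible way to do that in Node.js. `buildGeoParams` is a hypothetical helper (not part of the ZenRows SDK), and the two-letter check is an assumption based on the `us` example above.

```javascript
// Hypothetical helper (not part of the ZenRows SDK): builds the query parameters
// for a geolocated request and always enables premium_proxy with proxy_country.
function buildGeoParams(apikey, url, country) {
  // Assumption: countries are given as two-letter codes such as "us".
  if (!/^[a-z]{2}$/i.test(country)) {
    throw new Error(`Expected a two-letter country code, got "${country}"`);
  }
  return {
    apikey,
    url,
    premium_proxy: 'true',
    proxy_country: country.toLowerCase(),
  };
}

const params = buildGeoParams('YOUR_KEY', 'https://httpbin.org/anything', 'US');
console.log(new URLSearchParams(params).toString());
```

The returned object can be passed as `params` to an HTTP client such as axios, or serialized into the query string as shown.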
// npm install axios
const axios = require('axios');
const url = 'https://httpbin.org/anything';
const apikey = 'YOUR_KEY';
axios({
url: 'https://api.zenrows.com/v1/',
method: 'GET',
params: {
'url': url,
'apikey': apikey,
'premium_proxy': 'true',
'proxy_country': 'us',
},
})
.then(response => console.log(response.data))
.catch(error => console.log(error));
// npm install axios http-proxy-agent https-proxy-agent
const axios = require("axios");
const HttpProxyAgent = require("http-proxy-agent");
const HttpsProxyAgent = require("https-proxy-agent");
const url = "https://httpbin.org/anything";
const proxy = "http://YOUR_KEY:premium_proxy=true&proxy_country=us@proxy.zenrows.com:8001";
const httpAgent = new HttpProxyAgent(proxy);
const httpsAgent = new HttpsProxyAgent(proxy);
process.env.NODE_TLS_REJECT_UNAUTHORIZED = "0";
axios({
url,
httpAgent,
httpsAgent,
method: "GET",
})
.then(response => console.log(response.data))
.catch(error => console.log(error));
// npm install zenrows
const { ZenRows } = require("zenrows");
(async () => {
const client = new ZenRows("YOUR_KEY");
const url = "https://httpbin.org/anything";
try {
const { data } = await client.get(url, {
"premium_proxy": "true",
"proxy_country": "us"
});
console.log(data);
} catch (error) {
console.error(error.message);
if (error.response) {
console.error(error.response.data);
}
}
})();
Custom Headers
Custom Headers come in handy when you need to add your own headers (user agents, cookies, referrer, etc.) to the request.
You can enable Custom Headers by adding &custom_headers=true to the request.
// npm install axios
const axios = require('axios');
const url = 'https://httpbin.org/anything';
const apikey = 'YOUR_KEY';
axios({
url: 'https://api.zenrows.com/v1/',
method: 'GET',
headers: {
'Referer': 'https://www.google.com',
},
params: {
'url': url,
'apikey': apikey,
'custom_headers': 'true',
},
})
.then(response => console.log(response.data))
.catch(error => console.log(error));
// npm install axios http-proxy-agent https-proxy-agent
const axios = require("axios");
const HttpProxyAgent = require("http-proxy-agent");
const HttpsProxyAgent = require("https-proxy-agent");
const url = "https://httpbin.org/anything";
const proxy = "http://YOUR_KEY:custom_headers=true@proxy.zenrows.com:8001";
const httpAgent = new HttpProxyAgent(proxy);
const httpsAgent = new HttpsProxyAgent(proxy);
process.env.NODE_TLS_REJECT_UNAUTHORIZED = "0";
axios({
url,
httpAgent,
httpsAgent,
method: "GET",
headers: {
"Referer": "https://www.google.com",
},
})
.then(response => console.log(response.data))
.catch(error => console.log(error));
// npm install zenrows
const { ZenRows } = require("zenrows");
(async () => {
const client = new ZenRows("YOUR_KEY");
const url = "https://httpbin.org/anything";
const headers = {
"Referer": "https://www.google.com",
};
try {
const { data } = await client.get(url, {
"custom_headers": "true"
}, { headers });
console.log(data);
} catch (error) {
console.error(error.message);
if (error.response) {
console.error(error.response.data);
}
}
})();
Session ID
Use the same IP for each API Request by using &session_id=12345. ZenRows will maintain a session for each ID for 10 minutes.
Keep track of the Session IDs on your side: store each one so you can reuse it.
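One way to keep track of Session IDs is a small in-memory registry. The following is a minimal sketch, not part of the ZenRows SDK: the `SessionTracker` class, the label names, and the 1-99999 ID range are assumptions made for illustration, and the sketch treats a session as expired 10 minutes after it was created locally.

```javascript
// Minimal in-memory tracker for Session IDs (a sketch, not part of the ZenRows SDK).
// Assumption: a session is treated as expired 10 minutes after it was created here.
const SESSION_TTL_MS = 10 * 60 * 1000;

class SessionTracker {
  constructor(now = () => Date.now()) {
    this.now = now;            // injectable clock, handy for testing
    this.sessions = new Map(); // label -> { id, createdAt }
  }

  // Returns the stored ID for a label, or creates a fresh one
  // if none exists yet or the stored one has expired.
  getSessionId(label) {
    const entry = this.sessions.get(label);
    if (entry && this.now() - entry.createdAt < SESSION_TTL_MS) {
      return entry.id;
    }
    // Assumption: any integer works as an ID; 1-99999 is an arbitrary range.
    const id = Math.floor(Math.random() * 99999) + 1;
    this.sessions.set(label, { id, createdAt: this.now() });
    return id;
  }
}

const tracker = new SessionTracker();
const sessionId = tracker.getSessionId('checkout-flow');
// A second lookup within the TTL returns the same ID.
console.log(sessionId === tracker.getSessionId('checkout-flow')); // true
```

The value returned by `getSessionId` can then be sent as the session_id parameter on every request that should share an IP.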
// npm install axios
const axios = require('axios');
const url = 'https://httpbin.org/anything';
const apikey = 'YOUR_KEY';
axios({
url: 'https://api.zenrows.com/v1/',
method: 'GET',
params: {
'url': url,
'apikey': apikey,
'session_id': '12345',
},
})
.then(response => console.log(response.data))
.catch(error => console.log(error));
// npm install axios http-proxy-agent https-proxy-agent
const axios = require("axios");
const HttpProxyAgent = require("http-proxy-agent");
const HttpsProxyAgent = require("https-proxy-agent");
const url = "https://httpbin.org/anything";
const proxy = "http://YOUR_KEY:session_id=12345@proxy.zenrows.com:8001";
const httpAgent = new HttpProxyAgent(proxy);
const httpsAgent = new HttpsProxyAgent(proxy);
process.env.NODE_TLS_REJECT_UNAUTHORIZED = "0";
axios({
url,
httpAgent,
httpsAgent,
method: "GET",
})
.then(response => console.log(response.data))
.catch(error => console.log(error));
// npm install zenrows
const { ZenRows } = require("zenrows");
(async () => {
const client = new ZenRows("YOUR_KEY");
const url = "https://httpbin.org/anything";
try {
const { data } = await client.get(url, {
"session_id": 12345
});
console.log(data);
} catch (error) {
console.error(error.message);
if (error.response) {
console.error(error.response.data);
}
}
})();
Device
To use either desktop or mobile user agents in the headers, add the &device=desktop or &device=mobile parameter to the request.
// npm install axios
const axios = require('axios');
const url = 'https://httpbin.org/anything';
const apikey = 'YOUR_KEY';
axios({
url: 'https://api.zenrows.com/v1/',
method: 'GET',
params: {
'url': url,
'apikey': apikey,
'device': 'desktop',
},
})
.then(response => console.log(response.data))
.catch(error => console.log(error));
// npm install axios http-proxy-agent https-proxy-agent
const axios = require("axios");
const HttpProxyAgent = require("http-proxy-agent");
const HttpsProxyAgent = require("https-proxy-agent");
const url = "https://httpbin.org/anything";
const proxy = "http://YOUR_KEY:device=desktop@proxy.zenrows.com:8001";
const httpAgent = new HttpProxyAgent(proxy);
const httpsAgent = new HttpsProxyAgent(proxy);
process.env.NODE_TLS_REJECT_UNAUTHORIZED = "0";
axios({
url,
httpAgent,
httpsAgent,
method: "GET",
})
.then(response => console.log(response.data))
.catch(error => console.log(error));
// npm install zenrows
const { ZenRows } = require("zenrows");
(async () => {
const client = new ZenRows("YOUR_KEY");
const url = "https://httpbin.org/anything";
try {
const { data } = await client.get(url, {
"device": "desktop"
});
console.log(data);
} catch (error) {
console.error(error.message);
if (error.response) {
console.error(error.response.data);
}
}
})();
Original HTTP Code
The ZenRows API returns its own HTTP status codes depending on the result of the request. If you want the status code provided by the website instead, enable &original_status=true.
// npm install axios
const axios = require('axios');
const url = 'https://httpbin.org/anything';
const apikey = 'YOUR_KEY';
axios({
url: 'https://api.zenrows.com/v1/',
method: 'GET',
params: {
'url': url,
'apikey': apikey,
'original_status': 'true',
},
})
.then(response => console.log(response.data))
.catch(error => console.log(error));
// npm install axios http-proxy-agent https-proxy-agent
const axios = require("axios");
const HttpProxyAgent = require("http-proxy-agent");
const HttpsProxyAgent = require("https-proxy-agent");
const url = "https://httpbin.org/anything";
const proxy = "http://YOUR_KEY:original_status=true@proxy.zenrows.com:8001";
const httpAgent = new HttpProxyAgent(proxy);
const httpsAgent = new HttpsProxyAgent(proxy);
process.env.NODE_TLS_REJECT_UNAUTHORIZED = "0";
axios({
url,
httpAgent,
httpsAgent,
method: "GET",
})
.then(response => console.log(response.data))
.catch(error => console.log(error));
// npm install zenrows
const { ZenRows } = require("zenrows");
(async () => {
const client = new ZenRows("YOUR_KEY");
const url = "https://httpbin.org/anything";
try {
const { data } = await client.get(url, {
"original_status": "true"
});
console.log(data);
} catch (error) {
console.error(error.message);
if (error.response) {
console.error(error.response.data);
}
}
})();
Data Extraction: CSS Selectors
You can use CSS Selectors for data extraction. The table below lists examples of how to use them.
You only need to add &css_extractor={"links":"a @href"} to the request to use this feature.
Here are some examples:
extraction rules | sample html | value | json output |
---|---|---|---|
{"divs":"div"} | <div>text0</div> | text | {"divs": "text0"} |
{"divs":"div"} | <div>text1</div><div>text2</div> | text | {"divs": ["text1", "text2"]} |
{"links":"a @href"} | <a href="#register">Register</a> | href attribute | {"links": "#register"} |
{"hidden":"input[type=hidden] @value"} | <input type="hidden" name="_token" value="f23g23g.b9u1bg91g.zv97" /> | value attribute | {"hidden": "f23g23g.b9u1bg91g.zv97"} |
{"class":"button.submit @data-v"} | <button class="submit" data-v="register-user">click</button> | data-v attribute with submit class | {"class": "register-user"} |
{"emails":"a[href^='mailto:'] @href"} | <a href="mailto:[email protected]">email 1</a><a href="mailto:[email protected]">email 2</a> | href attribute for links starting with mailto: | {"emails": ["[email protected]", "[email protected]"]} |
If you are interested in learning more, you can find a complete reference of CSS Selectors here.
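Writing the extraction rules as a plain object and serializing them avoids hand-escaping the JSON. The snippet below is a sketch of that approach in Node.js; the rule names `links` and `images` come from the examples in this section, while the use of `URLSearchParams` is a choice made here (it encodes spaces as `+` rather than `%20`, which is equivalent inside a query string).

```javascript
// Extraction rules as a plain object: key = output field, value = CSS selector,
// optionally followed by " @attribute" to read an attribute instead of the text.
const rules = {
  links: 'a @href',
  images: 'img @src',
};

// css_extractor takes the rules as a JSON string; URLSearchParams URL-encodes it.
const query = new URLSearchParams({
  apikey: 'YOUR_KEY',
  url: 'https://httpbin.org/anything',
  css_extractor: JSON.stringify(rules),
});

const requestUrl = `https://api.zenrows.com/v1/?${query}`;
console.log(requestUrl);
```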
// npm install axios
const axios = require('axios');
const url = 'https://httpbin.org/anything';
const apikey = 'YOUR_KEY';
axios({
url: 'https://api.zenrows.com/v1/',
method: 'GET',
params: {
'url': url,
'apikey': apikey,
'css_extractor': '{"links":"a @href", "images":"img @src"}',
},
})
.then(response => console.log(response.data))
.catch(error => console.log(error));
// npm install axios http-proxy-agent https-proxy-agent
const axios = require("axios");
const HttpProxyAgent = require("http-proxy-agent");
const HttpsProxyAgent = require("https-proxy-agent");
const url = "https://httpbin.org/anything";
const proxy = "http://YOUR_KEY:css_extractor=%7B%22links%22%3A%22a%20%40href%22%2C%20%22images%22%3A%22img%20%40src%22%7D@proxy.zenrows.com:8001";
const httpAgent = new HttpProxyAgent(proxy);
const httpsAgent = new HttpsProxyAgent(proxy);
process.env.NODE_TLS_REJECT_UNAUTHORIZED = "0";
axios({
url,
httpAgent,
httpsAgent,
method: "GET",
})
.then(response => console.log(response.data))
.catch(error => console.log(error));
// npm install zenrows
const { ZenRows } = require("zenrows");
(async () => {
const client = new ZenRows("YOUR_KEY");
const url = "https://httpbin.org/anything";
try {
const { data } = await client.get(url, {
"css_extractor": "{\"links\":\"a @href\", \"images\":\"img @src\"}"
});
console.log(data);
} catch (error) {
console.error(error.message);
if (error.response) {
console.error(error.response.data);
}
}
})();
Data Extraction: Auto Parsing
ZenRows API will return the HTML of the URL by default. Enabling Autoparse uses our extraction algorithms to parse data in JSON format automatically.
Add &autoparse=true to the request for this feature.
// npm install axios
const axios = require('axios');
const url = 'https://www.amazon.com/dp/B01LD5GO7I/';
const apikey = 'YOUR_KEY';
axios({
url: 'https://api.zenrows.com/v1/',
method: 'GET',
params: {
'url': url,
'apikey': apikey,
'autoparse': 'true',
},
})
.then(response => console.log(response.data))
.catch(error => console.log(error));
// npm install axios http-proxy-agent https-proxy-agent
const axios = require("axios");
const HttpProxyAgent = require("http-proxy-agent");
const HttpsProxyAgent = require("https-proxy-agent");
const url = "https://www.amazon.com/dp/B01LD5GO7I/";
const proxy = "http://YOUR_KEY:autoparse=true@proxy.zenrows.com:8001";
const httpAgent = new HttpProxyAgent(proxy);
const httpsAgent = new HttpsProxyAgent(proxy);
process.env.NODE_TLS_REJECT_UNAUTHORIZED = "0";
axios({
url,
httpAgent,
httpsAgent,
method: "GET",
})
.then(response => console.log(response.data))
.catch(error => console.log(error));
// npm install zenrows
const { ZenRows } = require("zenrows");
(async () => {
const client = new ZenRows("YOUR_KEY");
const url = "https://www.amazon.com/dp/B01LD5GO7I/";
try {
const { data } = await client.get(url, {
"autoparse": "true"
});
console.log(data);
} catch (error) {
console.error(error.message);
if (error.response) {
console.error(error.response.data);
}
}
})();
POST / PUT Requests
Send POST / PUT requests as usual with your chosen language. ZenRows will transparently forward the data to the target site.
The return value will be the original response's content. Headers and cookies will also be part of the response. How you access them depends on the connection mode and client you use.
// npm install axios
const axios = require('axios');
const url = 'https://httpbin.org/anything';
const apikey = 'YOUR_KEY';
axios({
url: 'https://api.zenrows.com/v1/',
method: 'POST',
data: 'key1=value1&key2=value2',
params: {
'url': url,
'apikey': apikey,
},
})
.then(response => console.log(response.data))
.catch(error => console.log(error));
// npm install axios http-proxy-agent https-proxy-agent
const axios = require("axios");
const HttpProxyAgent = require("http-proxy-agent");
const HttpsProxyAgent = require("https-proxy-agent");
const url = "https://httpbin.org/anything";
const proxy = "http://YOUR_KEY:@proxy.zenrows.com:8001";
const httpAgent = new HttpProxyAgent(proxy);
const httpsAgent = new HttpsProxyAgent(proxy);
process.env.NODE_TLS_REJECT_UNAUTHORIZED = "0";
axios({
url,
httpAgent,
httpsAgent,
method: "POST",
data: "key1=value1&key2=value2",
})
.then(response => console.log(response.data))
.catch(error => console.log(error));
// npm install zenrows
const { ZenRows } = require("zenrows");
(async () => {
const client = new ZenRows("YOUR_KEY");
const url = "https://httpbin.org/anything";
const postData = "key1=value1&key2=value2";
try {
const { data } = await client.post(url, [], { data: postData });
console.log(data);
} catch (error) {
console.error(error.message);
if (error.response) {
console.error(error.response.data);
}
}
})();
Download Files and Pictures
ZenRows will download images, PDFs or any type of file. Instead of reading the response's content as text, you can store it directly in a file.
There is a size limit and we don't recommend using ZenRows to download big files.
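As a sketch of storing the response directly in a file, the Node.js snippet below reads the body as binary data and writes it to disk. It assumes Node.js 18 or later for the built-in `fetch`. `buildDownloadUrl` and `downloadFile` are hypothetical helper names, and the image URL and file name in the usage comment are placeholders.

```javascript
const fs = require('fs/promises');

// Builds the API request URL for the file to download.
function buildDownloadUrl(apikey, fileUrl) {
  const query = new URLSearchParams({ apikey, url: fileUrl });
  return `https://api.zenrows.com/v1/?${query}`;
}

// Fetches the file through the API and writes the raw bytes to disk.
// Assumption: Node.js 18+ for the built-in fetch.
async function downloadFile(apikey, fileUrl, destination) {
  const response = await fetch(buildDownloadUrl(apikey, fileUrl));
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  // Read the body as binary, not text, so images and PDFs are not corrupted.
  const bytes = Buffer.from(await response.arrayBuffer());
  await fs.writeFile(destination, bytes);
  return bytes.length;
}

// Usage (the image URL and file name are placeholders):
// downloadFile('YOUR_KEY', 'https://httpbin.org/image/png', 'image.png')
//   .then(size => console.log(`Saved ${size} bytes`))
//   .catch(error => console.error(error.message));
```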
Credits Usage
Check your credit consumption programmatically by calling the /usage endpoint. Usage calls do not count toward concurrency, and results are available in real time.
// npm install axios
const axios = require('axios');
axios({
url: 'https://api.zenrows.com/v1/usage?apikey=YOUR_KEY',
method: 'GET',
})
.then(response => console.log(response.data))
.catch(error => console.log(error));
API Key required
To access API functionality, you need to have a valid API Key. This unique key will keep all your requests authorized.
Start using the API by creating your API Key now.
import org.apache.hc.client5.http.fluent.Request;
public class APIRequest {
public static void main(final String... args) throws Exception {
String apiUrl = "https://api.zenrows.com/v1/?apikey=YOUR_KEY&url=https%3A%2F%2Fhttpbin.org%2Fanything";
String response = Request.get(apiUrl)
.execute().returnContent().asString();
System.out.println(response);
}
}
import java.net.URI;
import java.security.cert.X509Certificate;
import java.util.Base64;
import javax.net.ssl.*;
import org.apache.hc.client5.http.fluent.Request;
import org.apache.hc.core5.http.HttpHost;
public class ProxyRequest {
public static void main(final String... args) throws Exception {
ignoreCertWarning();
URI uri = new URI("http://YOUR_KEY:@proxy.zenrows.com:8001");
String basicAuth = new String(Base64.getEncoder().encode(uri.getUserInfo().getBytes()));
String response = Request.get("https://httpbin.org/anything")
.addHeader("Proxy-Authorization", "Basic " + basicAuth)
.viaProxy(HttpHost.create(uri))
.execute().returnContent().asString();
System.out.println(response);
}
private static void ignoreCertWarning() {
SSLContext ctx = null;
TrustManager[] trustAllCerts = new X509TrustManager[] { new X509TrustManager() {
public X509Certificate[] getAcceptedIssuers() {return null;}
public void checkClientTrusted(X509Certificate[] certs, String authType) {}
public void checkServerTrusted(X509Certificate[] certs, String authType) {}
} };
try {
ctx = SSLContext.getInstance("SSL");
ctx.init(null, trustAllCerts, null);
SSLContext.setDefault(ctx);
} catch (Exception e) {}
}
}
URL required
The URL is the page you want to scrape. It needs to be encoded.
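To illustrate why the encoding matters, here is a short Node.js sketch. The target URL with its `page` and `sort` query parameters is a made-up example: without encoding, everything after the first `&` would be read as a parameter of the API call rather than as part of the target URL.

```javascript
// Hypothetical target URL with its own query string.
const target = 'https://httpbin.org/anything?page=1&sort=asc';

// encodeURIComponent escapes ":", "/", "?", "=" and "&" so the whole
// target URL travels as a single value of the url parameter.
const encoded = encodeURIComponent(target);
console.log(encoded);
// https%3A%2F%2Fhttpbin.org%2Fanything%3Fpage%3D1%26sort%3Dasc

const apiUrl = `https://api.zenrows.com/v1/?apikey=YOUR_KEY&url=${encoded}`;
console.log(apiUrl);
```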
import org.apache.hc.client5.http.fluent.Request;
public class APIRequest {
public static void main(final String... args) throws Exception {
String apiUrl = "https://api.zenrows.com/v1/?apikey=YOUR_KEY&url=https%3A%2F%2Fhttpbin.org%2Fanything";
String response = Request.get(apiUrl)
.execute().returnContent().asString();
System.out.println(response);
}
}
import java.net.URI;
import java.security.cert.X509Certificate;
import java.util.Base64;
import javax.net.ssl.*;
import org.apache.hc.client5.http.fluent.Request;
import org.apache.hc.core5.http.HttpHost;
public class ProxyRequest {
public static void main(final String... args) throws Exception {
ignoreCertWarning();
URI uri = new URI("http://YOUR_KEY:@proxy.zenrows.com:8001");
String basicAuth = new String(Base64.getEncoder().encode(uri.getUserInfo().getBytes()));
String response = Request.get("https://httpbin.org/anything")
.addHeader("Proxy-Authorization", "Basic " + basicAuth)
.viaProxy(HttpHost.create(uri))
.execute().returnContent().asString();
System.out.println(response);
}
private static void ignoreCertWarning() {
SSLContext ctx = null;
TrustManager[] trustAllCerts = new X509TrustManager[] { new X509TrustManager() {
public X509Certificate[] getAcceptedIssuers() {return null;}
public void checkClientTrusted(X509Certificate[] certs, String authType) {}
public void checkServerTrusted(X509Certificate[] certs, String authType) {}
} };
try {
ctx = SSLContext.getInstance("SSL");
ctx.init(null, trustAllCerts, null);
SSLContext.setDefault(ctx);
} catch (Exception e) {}
}
}
JavaScript Rendering
Some websites rely heavily on JavaScript to load content. Enable this feature if you need to extract data that are loaded dynamically.
You can enable JavaScript by adding &js_render=true to the request. This request costs 5 credits.
import org.apache.hc.client5.http.fluent.Request;
public class APIRequest {
public static void main(final String... args) throws Exception {
String apiUrl = "https://api.zenrows.com/v1/?apikey=YOUR_KEY&url=https%3A%2F%2Fhttpbin.org%2Fanything&js_render=true";
String response = Request.get(apiUrl)
.execute().returnContent().asString();
System.out.println(response);
}
}
import java.net.URI;
import java.security.cert.X509Certificate;
import java.util.Base64;
import javax.net.ssl.*;
import org.apache.hc.client5.http.fluent.Request;
import org.apache.hc.core5.http.HttpHost;
public class ProxyRequest {
public static void main(final String... args) throws Exception {
ignoreCertWarning();
URI uri = new URI("http://YOUR_KEY:js_render=true@proxy.zenrows.com:8001");
String basicAuth = new String(Base64.getEncoder().encode(uri.getUserInfo().getBytes()));
String response = Request.get("https://httpbin.org/anything")
.addHeader("Proxy-Authorization", "Basic " + basicAuth)
.viaProxy(HttpHost.create(uri))
.execute().returnContent().asString();
System.out.println(response);
}
private static void ignoreCertWarning() {
SSLContext ctx = null;
TrustManager[] trustAllCerts = new X509TrustManager[] { new X509TrustManager() {
public X509Certificate[] getAcceptedIssuers() {return null;}
public void checkClientTrusted(X509Certificate[] certs, String authType) {}
public void checkServerTrusted(X509Certificate[] certs, String authType) {}
} };
try {
ctx = SSLContext.getInstance("SSL");
ctx.init(null, trustAllCerts, null);
SSLContext.setDefault(ctx);
} catch (Exception e) {}
}
}
Anti-bot
Some websites protect their content with anti-bot solutions such as Cloudflare, Akamai, or Datadome. Enable Anti-bot to bypass them without hassle. Bear in mind that adding custom headers might overwrite our configuration. To wait for the expected content to load, combine Anti-bot with the Wait For Selector feature (see the Wait For Selector section).
Add &antibot=true to the request for this feature. This request costs 5 credits.
import org.apache.hc.client5.http.fluent.Request;
public class APIRequest {
public static void main(final String... args) throws Exception {
String apiUrl = "https://api.zenrows.com/v1/?apikey=YOUR_KEY&url=https%3A%2F%2Fhttpbin.org%2Fanything&antibot=true";
String response = Request.get(apiUrl)
.execute().returnContent().asString();
System.out.println(response);
}
}
import java.net.URI;
import java.security.cert.X509Certificate;
import java.util.Base64;
import javax.net.ssl.*;
import org.apache.hc.client5.http.fluent.Request;
import org.apache.hc.core5.http.HttpHost;
public class ProxyRequest {
public static void main(final String... args) throws Exception {
ignoreCertWarning();
URI uri = new URI("http://YOUR_KEY:antibot=true@proxy.zenrows.com:8001");
String basicAuth = new String(Base64.getEncoder().encode(uri.getUserInfo().getBytes()));
String response = Request.get("https://httpbin.org/anything")
.addHeader("Proxy-Authorization", "Basic " + basicAuth)
.viaProxy(HttpHost.create(uri))
.execute().returnContent().asString();
System.out.println(response);
}
private static void ignoreCertWarning() {
SSLContext ctx = null;
TrustManager[] trustAllCerts = new X509TrustManager[] { new X509TrustManager() {
public X509Certificate[] getAcceptedIssuers() {return null;}
public void checkClientTrusted(X509Certificate[] certs, String authType) {}
public void checkServerTrusted(X509Certificate[] certs, String authType) {}
} };
try {
ctx = SSLContext.getInstance("SSL");
ctx.init(null, trustAllCerts, null);
SSLContext.setDefault(ctx);
} catch (Exception e) {}
}
}
JavaScript Instructions
Interact with the page once the content is loaded. You can perform actions as a user would (e.g., click on an element), and ZenRows will execute them. Once the instructions finish, it will return the current HTML.
Following the click example, below are the instructions to click on a ".button-selector" element.
import org.apache.hc.client5.http.fluent.Request;
public class APIRequest {
public static void main(final String... args) throws Exception {
String apiUrl = "https://api.zenrows.com/v1/?apikey=YOUR_KEY&url=https%3A%2F%2Fhttpbin.org%2Fanything&js_render=true&js_instructions=%5B%7B%22click%22%3A%22.button-selector%22%7D%5D";
String response = Request.get(apiUrl)
.execute().returnContent().asString();
System.out.println(response);
}
}
import java.net.URI;
import java.security.cert.X509Certificate;
import java.util.Base64;
import javax.net.ssl.*;
import org.apache.hc.client5.http.fluent.Request;
import org.apache.hc.core5.http.HttpHost;
public class ProxyRequest {
public static void main(final String... args) throws Exception {
ignoreCertWarning();
URI uri = new URI("http://YOUR_KEY:js_render=true&js_instructions=%5B%7B%22click%22%3A%22.button-selector%22%7D%5D@proxy.zenrows.com:8001");
String basicAuth = new String(Base64.getEncoder().encode(uri.getUserInfo().getBytes()));
String response = Request.get("https://httpbin.org/anything")
.addHeader("Proxy-Authorization", "Basic " + basicAuth)
.viaProxy(HttpHost.create(uri))
.execute().returnContent().asString();
System.out.println(response);
}
private static void ignoreCertWarning() {
SSLContext ctx = null;
TrustManager[] trustAllCerts = new X509TrustManager[] { new X509TrustManager() {
public X509Certificate[] getAcceptedIssuers() {return null;}
public void checkClientTrusted(X509Certificate[] certs, String authType) {}
public void checkServerTrusted(X509Certificate[] certs, String authType) {}
} };
try {
ctx = SSLContext.getInstance("SSL");
ctx.init(null, trustAllCerts, null);
SSLContext.setDefault(ctx);
} catch (Exception e) {}
}
}
The original instructions are a JSON array containing the commands to run.
[{"click": ".button-selector"}]
They then need to be stringified and encoded. You can use our Builder or an online tool to encode it.
`[{"click":".button-selector"}]` // stringified
`%5B%7B%22click%22%3A%22.button-selector%22%7D%5D` // encoded
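If you prefer to do the two steps in code rather than with a tool, the standard `JSON.stringify` and `encodeURIComponent` functions are enough. A minimal Node.js sketch:

```javascript
// The instructions as a regular JavaScript array.
const instructions = [{ click: '.button-selector' }];

// Step 1: stringify.
const stringified = JSON.stringify(instructions);
console.log(stringified); // [{"click":".button-selector"}]

// Step 2: URL-encode the string so it is safe to place in the query string.
const encoded = encodeURIComponent(stringified);
console.log(encoded); // %5B%7B%22click%22%3A%22.button-selector%22%7D%5D
```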
&js_instructions=[{...}] accepts an array of commands, and you can add as many as needed. ZenRows will execute them in order. Here is a summary of the actions you can run.
{"click": ".button-selector"} // Click on the first element that matches the CSS Selector
{"wait_for": ".late-selector"} // Wait for a given CSS Selector to load in the DOM
{"wait": 2000} // Wait an exact amount of time in ms
{"fill": [".input-selector", "value"]} // Fill in an input
{"check": ".checkbox-selector"} // Check a checkbox input
{"uncheck": ".checkbox-selector"} // Uncheck a checkbox input
{"select_option": [".select-selector", "option_value"]} // Select an option by its value
{"scroll_y": 1500} // Vertical scroll in pixels
{"scroll_x": 1500} // Horizontal scroll in pixels
{"evaluate": "document.body.style.backgroundColor = '#c4b5fd';"} // Execute JavaScript code
These instructions won't work inside iframes; those need a separate set. The syntax is similar but takes an extra parameter to choose the iframe.
For security, an iframe's content isn't returned in the response. To get that content, use frame_reveal. It will append a node with the content encoded in base64 to avoid problems with JS or HTML injection.
{"frame_click": ["#iframe", ".button-selector"]}
{"frame_wait_for": ["#iframe", ".late-selector"]}
{"frame_fill": ["#iframe", ".input-selector", "value"]}
{"frame_check": ["#iframe", ".checkbox-selector"]}
{"frame_uncheck": ["#iframe", ".checkbox-selector"]}
{"frame_select_option": ["#iframe", ".select-selector", "option_value"]}
{"frame_evaluate": ["iframe-name", "document.body.style.backgroundColor = '#c4b5fd';"]} // won't work with selectors, will match iframe's name or URL
{"frame_reveal": "#iframe"} // will create a node with the class "iframe-content-element"
Requires JavaScript rendering (&js_render=true).
Visit our JavaScript Instructions guide for a detailed explanation of each action and usage examples.
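Since the page-level actions run in order, several of them can be chained into one request. The Node.js sketch below shows one possible chain for a login form. All selectors and values (`#username`, `#password`, `#submit`, `.dashboard`) are hypothetical and need to be replaced with the ones on your target page.

```javascript
// Hypothetical selectors for a login form; replace them with your own.
const instructions = [
  { fill: ['#username', 'my-user'] },
  { fill: ['#password', 'my-password'] },
  { click: '#submit' },
  { wait_for: '.dashboard' }, // wait for the page shown after the click
];

const query = new URLSearchParams({
  apikey: 'YOUR_KEY',
  url: 'https://httpbin.org/anything',
  js_render: 'true', // JavaScript Instructions need rendering enabled
  js_instructions: JSON.stringify(instructions),
});

console.log(`https://api.zenrows.com/v1/?${query}`);
```

`URLSearchParams` handles the URL-encoding here, so the array only needs to be stringified.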
Wait For Selector
Sometimes you may want to wait for a given CSS Selector to load in the DOM before ZenRows returns the content. You can get this behaviour by adding the &wait_for=.background-load parameter to the request.
Requires JavaScript rendering (&js_render=true).
import org.apache.hc.client5.http.fluent.Request;
public class APIRequest {
public static void main(final String... args) throws Exception {
String apiUrl = "https://api.zenrows.com/v1/?apikey=YOUR_KEY&url=https%3A%2F%2Fhttpbin.org%2Fanything&js_render=true&wait_for=.content";
String response = Request.get(apiUrl)
.execute().returnContent().asString();
System.out.println(response);
}
}
import java.net.URI;
import java.security.cert.X509Certificate;
import java.util.Base64;
import javax.net.ssl.*;
import org.apache.hc.client5.http.fluent.Request;
import org.apache.hc.core5.http.HttpHost;
public class ProxyRequest {
public static void main(final String... args) throws Exception {
ignoreCertWarning();
URI uri = new URI("http://YOUR_KEY:js_render=true&wait_for=.content@proxy.zenrows.com:8001");
String basicAuth = new String(Base64.getEncoder().encode(uri.getUserInfo().getBytes()));
String response = Request.get("https://httpbin.org/anything")
.addHeader("Proxy-Authorization", "Basic " + basicAuth)
.viaProxy(HttpHost.create(uri))
.execute().returnContent().asString();
System.out.println(response);
}
private static void ignoreCertWarning() {
SSLContext ctx = null;
TrustManager[] trustAllCerts = new X509TrustManager[] { new X509TrustManager() {
public X509Certificate[] getAcceptedIssuers() {return null;}
public void checkClientTrusted(X509Certificate[] certs, String authType) {}
public void checkServerTrusted(X509Certificate[] certs, String authType) {}
} };
try {
ctx = SSLContext.getInstance("SSL");
ctx.init(null, trustAllCerts, null);
SSLContext.setDefault(ctx);
} catch (Exception e) {}
}
}
Wait Milliseconds
Some websites take a long time to load. If you need to wait a fixed amount of time until everything is loaded, you can define the time in milliseconds with the &wait=10000 parameter, which waits 10000 milliseconds (10 seconds) before returning the HTML. The maximum wait time is 30 seconds.
Requires JavaScript rendering (&js_render=true).
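Given the 30-second maximum, it may be worth validating the value on your side before sending the request. The following Node.js sketch shows one way to do so. `toWaitParam` is a hypothetical helper, and rejecting out-of-range values locally (instead of letting the API respond) is a design choice made for this example.

```javascript
const MAX_WAIT_MS = 30000; // the 30-second maximum stated above

// Hypothetical helper: validates a wait time before it is sent as the wait parameter.
function toWaitParam(ms) {
  if (!Number.isInteger(ms) || ms < 0) {
    throw new RangeError('wait must be a non-negative integer (milliseconds)');
  }
  if (ms > MAX_WAIT_MS) {
    throw new RangeError(`wait cannot exceed ${MAX_WAIT_MS} ms`);
  }
  return String(ms);
}

console.log(toWaitParam(10000)); // prints 10000
```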
import org.apache.hc.client5.http.fluent.Request;
public class APIRequest {
public static void main(final String... args) throws Exception {
String apiUrl = "https://api.zenrows.com/v1/?apikey=YOUR_KEY&url=https%3A%2F%2Fhttpbin.org%2Fanything&js_render=true&wait=10000";
String response = Request.get(apiUrl)
.execute().returnContent().asString();
System.out.println(response);
}
}
import java.net.URI;
import java.security.cert.X509Certificate;
import java.util.Base64;
import javax.net.ssl.*;
import org.apache.hc.client5.http.fluent.Request;
import org.apache.hc.core5.http.HttpHost;
public class ProxyRequest {
public static void main(final String... args) throws Exception {
ignoreCertWarning();
URI uri = new URI("http://YOUR_KEY:js_render=true&wait=10000@proxy.zenrows.com:8001");
String basicAuth = new String(Base64.getEncoder().encode(uri.getUserInfo().getBytes()));
String response = Request.get("https://httpbin.org/anything")
.addHeader("Proxy-Authorization", "Basic " + basicAuth)
.viaProxy(HttpHost.create(uri))
.execute().returnContent().asString();
System.out.println(response);
}
private static void ignoreCertWarning() {
SSLContext ctx = null;
TrustManager[] trustAllCerts = new X509TrustManager[] { new X509TrustManager() {
public X509Certificate[] getAcceptedIssuers() {return null;}
public void checkClientTrusted(X509Certificate[] certs, String authType) {}
public void checkServerTrusted(X509Certificate[] certs, String authType) {}
} };
try {
ctx = SSLContext.getInstance("SSL");
ctx.init(null, trustAllCerts, null);
SSLContext.setDefault(ctx);
} catch (Exception e) {}
}
}
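The 30-second limit mentioned above can be checked before the request is sent. The following is a minimal sketch, assuming a hypothetical helper class named WaitUrlBuilder (it is not part of any ZenRows SDK) that builds the same API URL as the example above and rejects values outside the documented range.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of any ZenRows SDK.
final class WaitUrlBuilder {
    // Maximum wait time stated in this section: 30 seconds.
    static final int MAX_WAIT_MS = 30_000;

    static String build(String apiKey, String targetUrl, int waitMs) {
        if (waitMs < 0 || waitMs > MAX_WAIT_MS) {
            throw new IllegalArgumentException(
                    "wait must be between 0 and " + MAX_WAIT_MS + " ms, got " + waitMs);
        }
        // The target URL has to be encoded before it is placed in the query string.
        String encodedTarget = URLEncoder.encode(targetUrl, StandardCharsets.UTF_8);
        // wait requires JavaScript rendering, so js_render=true is always added.
        return "https://api.zenrows.com/v1/?apikey=" + apiKey
                + "&url=" + encodedTarget
                + "&js_render=true&wait=" + waitMs;
    }

    public static void main(String[] args) {
        System.out.println(build("YOUR_KEY", "https://httpbin.org/anything", 10_000));
    }
}
```

The resulting string can be passed to Request.get(...) exactly as in the API example above.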
Block Resources
Many websites load dozens of resources, which delays the HTML response. You can block specific resources from loading with the block_resources parameter, for example &block_resources=image.
The ZenRows API can block the following resources: stylesheet, image, media, font, script, texttrack, xhr, fetch, eventsource, websocket, manifest, other. Separate them with commas to block multiple resources.
ZenRows blocks certain resources by default, such as stylesheets or images, to speed up your scraping. You can disable blocking by setting the parameter to "none": block_resources=none.
Requires JavaScript rendering (&js_render=true).
import org.apache.hc.client5.http.fluent.Request;
public class APIRequest {
public static void main(final String... args) throws Exception {
String apiUrl = "https://api.zenrows.com/v1/?apikey=YOUR_KEY&url=https%3A%2F%2Fhttpbin.org%2Fanything&js_render=true&block_resources=image%2Cmedia%2Cfont";
String response = Request.get(apiUrl)
.execute().returnContent().asString();
System.out.println(response);
}
}
import java.net.URI;
import java.security.cert.X509Certificate;
import java.util.Base64;
import javax.net.ssl.*;
import org.apache.hc.client5.http.fluent.Request;
import org.apache.hc.core5.http.HttpHost;
public class ProxyRequest {
public static void main(final String... args) throws Exception {
ignoreCertWarning();
URI uri = new URI("http://YOUR_KEY:js_render=true&block_resources=image%2Cmedia%2Cfont@proxy.zenrows.com:8001");
String basicAuth = new String(Base64.getEncoder().encode(uri.getUserInfo().getBytes()));
String response = Request.get("https://httpbin.org/anything")
.addHeader("Proxy-Authorization", "Basic " + basicAuth)
.viaProxy(HttpHost.create(uri))
.execute().returnContent().asString();
System.out.println(response);
}
private static void ignoreCertWarning() {
SSLContext ctx = null;
TrustManager[] trustAllCerts = new X509TrustManager[] { new X509TrustManager() {
public X509Certificate[] getAcceptedIssuers() {return null;}
public void checkClientTrusted(X509Certificate[] certs, String authType) {}
public void checkServerTrusted(X509Certificate[] certs, String authType) {}
} };
try {
ctx = SSLContext.getInstance("SSL");
ctx.init(null, trustAllCerts, null);
SSLContext.setDefault(ctx);
} catch (Exception e) {}
}
}
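The comma-separated value used in the examples above can be assembled from a list of resource types. This is a minimal sketch, assuming a hypothetical helper class named BlockResources; the allowed names are the ones listed in this section, and the validation itself is an illustration, not something the API requires of clients.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration; not part of any ZenRows SDK.
final class BlockResources {
    // Resource types listed in this section, plus "none" to disable default blocking.
    static final Set<String> ALLOWED = Set.of(
            "stylesheet", "image", "media", "font", "script", "texttrack",
            "xhr", "fetch", "eventsource", "websocket", "manifest", "other", "none");

    // Joins the resource types with commas and URL-encodes the result.
    static String toQueryValue(List<String> resources) {
        for (String resource : resources) {
            if (!ALLOWED.contains(resource)) {
                throw new IllegalArgumentException("Unknown resource type: " + resource);
            }
        }
        return URLEncoder.encode(String.join(",", resources), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String value = toQueryValue(List.of("image", "media", "font"));
        System.out.println("&js_render=true&block_resources=" + value);
    }
}
```

Catching a misspelled resource type locally avoids spending a request on a parameter value the API may not recognize.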
JSON Response
Add &json_response=true to the request to get the content as a JSON object with two fields, html and xhr:
- html contains the content of the page. You will have to decode it, since it is encoded in JSON.
- xhr is an array with one object per XHR or Fetch request performed by the page. Each object contains the URL, body, status code, and more. See the example response below.
Requires JavaScript rendering (&js_render=true).
import org.apache.hc.client5.http.fluent.Request;
public class APIRequest {
public static void main(final String... args) throws Exception {
String apiUrl = "https://api.zenrows.com/v1/?apikey=YOUR_KEY&url=https%3A%2F%2Fhttpbin.org%2Fanything&js_render=true&json_response=true";
String response = Request.get(apiUrl)
.execute().returnContent().asString();
System.out.println(response);
}
}
import java.net.URI;
import java.security.cert.X509Certificate;
import java.util.Base64;
import javax.net.ssl.*;
import org.apache.hc.client5.http.fluent.Request;
import org.apache.hc.core5.http.HttpHost;
public class ProxyRequest {
public static void main(final String... args) throws Exception {
ignoreCertWarning();
URI uri = new URI("http://YOUR_KEY:js_render=true&json_response=true@proxy.zenrows.com:8001");
String basicAuth = new String(Base64.getEncoder().encode(uri.getUserInfo().getBytes()));
String response = Request.get("https://httpbin.org/anything")
.addHeader("Proxy-Authorization", "Basic " + basicAuth)
.viaProxy(HttpHost.create(uri))
.execute().returnContent().asString();
System.out.println(response);
}
private static void ignoreCertWarning() {
SSLContext ctx = null;
TrustManager[] trustAllCerts = new X509TrustManager[] { new X509TrustManager() {
public X509Certificate[] getAcceptedIssuers() {return null;}
public void checkClientTrusted(X509Certificate[] certs, String authType) {}
public void checkServerTrusted(X509Certificate[] certs, String authType) {}
} };
try {
ctx = SSLContext.getInstance("SSL");
ctx.init(null, trustAllCerts, null);
SSLContext.setDefault(ctx);
} catch (Exception e) {}
}
}
And the response will look like this:
{
"html": "<!DOCTYPE html><html>...</html>",
"xhr": [{
"url": "https://www.example.com/fetch",
"body": "{\"success\": true}\n",
"status_code": 200,
"method": "GET",
"headers": {
"content-encoding": "gzip",
// ...
},
"request_headers": {
"accept": "*/*",
// ...
}
}]
}
Window Width/Height
If you need to change the browser's window width and height, you can use the &window_width=1920 and &window_height=1080 parameters.
Requires JavaScript rendering (&js_render=true).
import org.apache.hc.client5.http.fluent.Request;
public class APIRequest {
public static void main(final String... args) throws Exception {
String apiUrl = "https://api.zenrows.com/v1/?apikey=YOUR_KEY&url=https%3A%2F%2Fhttpbin.org%2Fanything&js_render=true&window_width=1920&window_height=1080";
String response = Request.get(apiUrl)
.execute().returnContent().asString();
System.out.println(response);
}
}
import java.net.URI;
import java.security.cert.X509Certificate;
import java.util.Base64;
import javax.net.ssl.*;
import org.apache.hc.client5.http.fluent.Request;
import org.apache.hc.core5.http.HttpHost;
public class ProxyRequest {
public static void main(final String... args) throws Exception {
ignoreCertWarning();
URI uri = new URI("http://YOUR_KEY:js_render=true&window_width=1920&window_height=1080@proxy.zenrows.com:8001");
String basicAuth = new String(Base64.getEncoder().encode(uri.getUserInfo().getBytes()));
String response = Request.get("https://httpbin.org/anything")
.addHeader("Proxy-Authorization", "Basic " + basicAuth)
.viaProxy(HttpHost.create(uri))
.execute().returnContent().asString();
System.out.println(response);
}
private static void ignoreCertWarning() {
SSLContext ctx = null;
TrustManager[] trustAllCerts = new X509TrustManager[] { new X509TrustManager() {
public X509Certificate[] getAcceptedIssuers() {return null;}
public void checkClientTrusted(X509Certificate[] certs, String authType) {}
public void checkServerTrusted(X509Certificate[] certs, String authType) {}
} };
try {
ctx = SSLContext.getInstance("SSL");
ctx.init(null, trustAllCerts, null);
SSLContext.setDefault(ctx);
} catch (Exception e) {}
}
}
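The Proxy examples in this document carry the request parameters in the password part of the proxy credentials (YOUR_KEY:param1=value1&param2=value2) and send them Base64-encoded in the Proxy-Authorization header. The sketch below isolates that step, assuming a hypothetical helper class named ProxyAuth.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper for illustration; not part of any ZenRows SDK.
final class ProxyAuth {
    // Builds the Proxy-Authorization value from "apiKey:params".
    // params may be empty, which matches the "YOUR_KEY:" form with no parameters.
    static String basicAuthHeader(String apiKey, String params) {
        String userInfo = apiKey + ":" + params;
        String encoded = Base64.getEncoder()
                .encodeToString(userInfo.getBytes(StandardCharsets.UTF_8));
        return "Basic " + encoded;
    }

    public static void main(String[] args) {
        String params = "js_render=true&window_width=1920&window_height=1080";
        System.out.println(basicAuthHeader("YOUR_KEY", params));
    }
}
```

The returned string can be passed to addHeader("Proxy-Authorization", ...) in place of the value derived from uri.getUserInfo().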
Premium Proxies
Some websites are harder to scrape and block datacenter IPs. Premium Proxies come in handy to solve this problem. As the name suggests, these proxies come straight from Internet service providers (ISPs).
You can use Premium Proxies by adding &premium_proxy=true to the request. This request costs 10 credits.
import org.apache.hc.client5.http.fluent.Request;
public class APIRequest {
public static void main(final String... args) throws Exception {
String apiUrl = "https://api.zenrows.com/v1/?apikey=YOUR_KEY&url=https%3A%2F%2Fhttpbin.org%2Fanything&premium_proxy=true";
String response = Request.get(apiUrl)
.execute().returnContent().asString();
System.out.println(response);
}
}
import java.net.URI;
import java.security.cert.X509Certificate;
import java.util.Base64;
import javax.net.ssl.*;
import org.apache.hc.client5.http.fluent.Request;
import org.apache.hc.core5.http.HttpHost;
public class ProxyRequest {
public static void main(final String... args) throws Exception {
ignoreCertWarning();
URI uri = new URI("http://YOUR_KEY:premium_proxy=true@proxy.zenrows.com:8001");
String basicAuth = new String(Base64.getEncoder().encode(uri.getUserInfo().getBytes()));
String response = Request.get("https://httpbin.org/anything")
.addHeader("Proxy-Authorization", "Basic " + basicAuth)
.viaProxy(HttpHost.create(uri))
.execute().returnContent().asString();
System.out.println(response);
}
private static void ignoreCertWarning() {
SSLContext ctx = null;
TrustManager[] trustAllCerts = new X509TrustManager[] { new X509TrustManager() {
public X509Certificate[] getAcceptedIssuers() {return null;}
public void checkClientTrusted(X509Certificate[] certs, String authType) {}
public void checkServerTrusted(X509Certificate[] certs, String authType) {}
} };
try {
ctx = SSLContext.getInstance("SSL");
ctx.init(null, trustAllCerts, null);
SSLContext.setDefault(ctx);
} catch (Exception e) {}
}
}
Geolocation
Some content is specific to a region. In these cases, you may want to make your request from a given country.
You only need to add &premium_proxy=true&proxy_country=us to the request. Geolocation requires Premium Proxies to be enabled and costs 10-25 credits per request.
import org.apache.hc.client5.http.fluent.Request;
public class APIRequest {
public static void main(final String... args) throws Exception {
String apiUrl = "https://api.zenrows.com/v1/?apikey=YOUR_KEY&url=https%3A%2F%2Fhttpbin.org%2Fanything&premium_proxy=true&proxy_country=us";
String response = Request.get(apiUrl)
.execute().returnContent().asString();
System.out.println(response);
}
}
import java.net.URI;
import java.security.cert.X509Certificate;
import java.util.Base64;
import javax.net.ssl.*;
import org.apache.hc.client5.http.fluent.Request;
import org.apache.hc.core5.http.HttpHost;
public class ProxyRequest {
public static void main(final String... args) throws Exception {
ignoreCertWarning();
URI uri = new URI("http://YOUR_KEY:premium_proxy=true&proxy_country=us@proxy.zenrows.com:8001");
String basicAuth = new String(Base64.getEncoder().encode(uri.getUserInfo().getBytes()));
String response = Request.get("https://httpbin.org/anything")
.addHeader("Proxy-Authorization", "Basic " + basicAuth)
.viaProxy(HttpHost.create(uri))
.execute().returnContent().asString();
System.out.println(response);
}
private static void ignoreCertWarning() {
SSLContext ctx = null;
TrustManager[] trustAllCerts = new X509TrustManager[] { new X509TrustManager() {
public X509Certificate[] getAcceptedIssuers() {return null;}
public void checkClientTrusted(X509Certificate[] certs, String authType) {}
public void checkServerTrusted(X509Certificate[] certs, String authType) {}
} };
try {
ctx = SSLContext.getInstance("SSL");
ctx.init(null, trustAllCerts, null);
SSLContext.setDefault(ctx);
} catch (Exception e) {}
}
}
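Because proxy_country only works together with Premium Proxies, the dependency can be enforced when the query string is assembled. A minimal sketch, assuming a hypothetical helper class named GeoParams:

```java
// Hypothetical helper for illustration; not part of any ZenRows SDK.
final class GeoParams {
    // Returns the query fragment for the proxy options, or throws if a country
    // is requested without Premium Proxies.
    static String query(boolean premiumProxy, String proxyCountry) {
        boolean hasCountry = proxyCountry != null && !proxyCountry.isBlank();
        if (hasCountry && !premiumProxy) {
            throw new IllegalArgumentException("proxy_country requires premium_proxy=true");
        }
        StringBuilder query = new StringBuilder();
        if (premiumProxy) {
            query.append("&premium_proxy=true");
        }
        if (hasCountry) {
            query.append("&proxy_country=").append(proxyCountry.trim());
        }
        return query.toString();
    }

    public static void main(String[] args) {
        System.out.println(query(true, "us"));
    }
}
```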
Custom Headers
Custom Headers come in handy when you need to add your own headers (user agents, cookies, referrer, etc.) to the request.
You can enable Custom Headers by adding &custom_headers=true to the request.
import org.apache.hc.client5.http.fluent.Request;
public class APIRequest {
public static void main(final String... args) throws Exception {
String apiUrl = "https://api.zenrows.com/v1/?apikey=YOUR_KEY&url=https%3A%2F%2Fhttpbin.org%2Fanything&custom_headers=true";
String response = Request.get(apiUrl)
.addHeader("Referer", "https://www.google.com")
.execute().returnContent().asString();
System.out.println(response);
}
}
import java.net.URI;
import java.security.cert.X509Certificate;
import java.util.Base64;
import javax.net.ssl.*;
import org.apache.hc.client5.http.fluent.Request;
import org.apache.hc.core5.http.HttpHost;
public class ProxyRequest {
public static void main(final String... args) throws Exception {
ignoreCertWarning();
URI uri = new URI("http://YOUR_KEY:custom_headers=true@proxy.zenrows.com:8001");
String basicAuth = new String(Base64.getEncoder().encode(uri.getUserInfo().getBytes()));
String response = Request.get("https://httpbin.org/anything")
.addHeader("Referer", "https://www.google.com")
.addHeader("Proxy-Authorization", "Basic " + basicAuth)
.viaProxy(HttpHost.create(uri))
.execute().returnContent().asString();
System.out.println(response);
}
private static void ignoreCertWarning() {
SSLContext ctx = null;
TrustManager[] trustAllCerts = new X509TrustManager[] { new X509TrustManager() {
public X509Certificate[] getAcceptedIssuers() {return null;}
public void checkClientTrusted(X509Certificate[] certs, String authType) {}
public void checkServerTrusted(X509Certificate[] certs, String authType) {}
} };
try {
ctx = SSLContext.getInstance("SSL");
ctx.init(null, trustAllCerts, null);
SSLContext.setDefault(ctx);
} catch (Exception e) {}
}
}
Session ID
Use the same IP for each API Request by using &session_id=12345. ZenRows will maintain a session for each ID for 10 minutes.
You will need to keep track of them on your side by storing each Session ID so you can reuse them.
import org.apache.hc.client5.http.fluent.Request;
public class APIRequest {
public static void main(final String... args) throws Exception {
String apiUrl = "https://api.zenrows.com/v1/?apikey=YOUR_KEY&url=https%3A%2F%2Fhttpbin.org%2Fanything&session_id=12345";
String response = Request.get(apiUrl)
.execute().returnContent().asString();
System.out.println(response);
}
}
import java.net.URI;
import java.security.cert.X509Certificate;
import java.util.Base64;
import javax.net.ssl.*;
import org.apache.hc.client5.http.fluent.Request;
import org.apache.hc.core5.http.HttpHost;
public class ProxyRequest {
public static void main(final String... args) throws Exception {
ignoreCertWarning();
URI uri = new URI("http://YOUR_KEY:session_id=12345@proxy.zenrows.com:8001");
String basicAuth = new String(Base64.getEncoder().encode(uri.getUserInfo().getBytes()));
String response = Request.get("https://httpbin.org/anything")
.addHeader("Proxy-Authorization", "Basic " + basicAuth)
.viaProxy(HttpHost.create(uri))
.execute().returnContent().asString();
System.out.println(response);
}
private static void ignoreCertWarning() {
SSLContext ctx = null;
TrustManager[] trustAllCerts = new X509TrustManager[] { new X509TrustManager() {
public X509Certificate[] getAcceptedIssuers() {return null;}
public void checkClientTrusted(X509Certificate[] certs, String authType) {}
public void checkServerTrusted(X509Certificate[] certs, String authType) {}
} };
try {
ctx = SSLContext.getInstance("SSL");
ctx.init(null, trustAllCerts, null);
SSLContext.setDefault(ctx);
} catch (Exception e) {}
}
}
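Keeping track of Session IDs on your side can be as simple as a map from a logical task to the ID assigned to it. The sketch below assumes a hypothetical class named SessionTracker and counts the 10 minutes from the first use of an ID; whether ZenRows counts from the first or the last request is not stated here, so that choice is an assumption on the conservative side.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of any ZenRows SDK.
final class SessionTracker {
    // Session lifetime stated in this section.
    static final Duration TTL = Duration.ofMinutes(10);

    private final Map<String, Integer> ids = new HashMap<>();
    private final Map<String, Instant> firstUsed = new HashMap<>();
    private int nextId = 1;

    // Returns the Session ID for a task, issuing a new one when none exists
    // or when the stored one is at least 10 minutes old (assumed expiry rule).
    int sessionFor(String task, Instant now) {
        Instant started = firstUsed.get(task);
        boolean expired = started != null
                && Duration.between(started, now).compareTo(TTL) >= 0;
        if (started == null || expired) {
            ids.put(task, nextId++);
            firstUsed.put(task, now);
        }
        return ids.get(task);
    }

    public static void main(String[] args) {
        SessionTracker tracker = new SessionTracker();
        int id = tracker.sessionFor("checkout-flow", Instant.now());
        System.out.println("&session_id=" + id);
    }
}
```

Passing the clock value in as a parameter keeps the expiry logic easy to test without waiting for real time to pass.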
Device
If you need to use either desktop or mobile user agents in the headers, add the &device=desktop or &device=mobile parameter to the request.
import org.apache.hc.client5.http.fluent.Request;
public class APIRequest {
public static void main(final String... args) throws Exception {
String apiUrl = "https://api.zenrows.com/v1/?apikey=YOUR_KEY&url=https%3A%2F%2Fhttpbin.org%2Fanything&device=desktop";
String response = Request.get(apiUrl)
.execute().returnContent().asString();
System.out.println(response);
}
}
import java.net.URI;
import java.security.cert.X509Certificate;
import java.util.Base64;
import javax.net.ssl.*;
import org.apache.hc.client5.http.fluent.Request;
import org.apache.hc.core5.http.HttpHost;
public class ProxyRequest {
public static void main(final String... args) throws Exception {
ignoreCertWarning();
URI uri = new URI("http://YOUR_KEY:device=desktop@proxy.zenrows.com:8001");
String basicAuth = new String(Base64.getEncoder().encode(uri.getUserInfo().getBytes()));
String response = Request.get("https://httpbin.org/anything")
.addHeader("Proxy-Authorization", "Basic " + basicAuth)
.viaProxy(HttpHost.create(uri))
.execute().returnContent().asString();
System.out.println(response);
}
private static void ignoreCertWarning() {
SSLContext ctx = null;
TrustManager[] trustAllCerts = new X509TrustManager[] { new X509TrustManager() {
public X509Certificate[] getAcceptedIssuers() {return null;}
public void checkClientTrusted(X509Certificate[] certs, String authType) {}
public void checkServerTrusted(X509Certificate[] certs, String authType) {}
} };
try {
ctx = SSLContext.getInstance("SSL");
ctx.init(null, trustAllCerts, null);
SSLContext.setDefault(ctx);
} catch (Exception e) {}
}
}
Original HTTP Code
The ZenRows API returns HTTP codes depending on the result of the request. If you want to get the status code provided by the website instead, enable &original_status=true.
import org.apache.hc.client5.http.fluent.Request;
public class APIRequest {
public static void main(final String... args) throws Exception {
String apiUrl = "https://api.zenrows.com/v1/?apikey=YOUR_KEY&url=https%3A%2F%2Fhttpbin.org%2Fanything&original_status=true";
String response = Request.get(apiUrl)
.execute().returnContent().asString();
System.out.println(response);
}
}
import java.net.URI;
import java.security.cert.X509Certificate;
import java.util.Base64;
import javax.net.ssl.*;
import org.apache.hc.client5.http.fluent.Request;
import org.apache.hc.core5.http.HttpHost;
public class ProxyRequest {
public static void main(final String... args) throws Exception {
ignoreCertWarning();
URI uri = new URI("http://YOUR_KEY:original_status=true@proxy.zenrows.com:8001");
String basicAuth = new String(Base64.getEncoder().encode(uri.getUserInfo().getBytes()));
String response = Request.get("https://httpbin.org/anything")
.addHeader("Proxy-Authorization", "Basic " + basicAuth)
.viaProxy(HttpHost.create(uri))
.execute().returnContent().asString();
System.out.println(response);
}
private static void ignoreCertWarning() {
SSLContext ctx = null;
TrustManager[] trustAllCerts = new X509TrustManager[] { new X509TrustManager() {
public X509Certificate[] getAcceptedIssuers() {return null;}
public void checkClientTrusted(X509Certificate[] certs, String authType) {}
public void checkServerTrusted(X509Certificate[] certs, String authType) {}
} };
try {
ctx = SSLContext.getInstance("SSL");
ctx.init(null, trustAllCerts, null);
SSLContext.setDefault(ctx);
} catch (Exception e) {}
}
}
Data Extraction: CSS Selectors
You can use CSS Selectors for data extraction. To use this feature, add &css_extractor={"links":"a @href"} to the request, with the JSON value URL-encoded.
The table below lists examples of extraction rules and the output they produce.
extraction rules | sample html | value | json output |
---|---|---|---|
{"divs":"div"} | <div>text0</div> | text | {"divs": "text0"} |
{"divs":"div"} | <div>text1</div><div>text2</div> | text | {"divs": ["text1", "text2"]} |
{"links":"a @href"} | <a href="#register">Register</a> | href attribute | {"links": "#register"} |
{"hidden":"input[type=hidden] @value"} | <input type="hidden" name="_token" value="f23g23g.b9u1bg91g.zv97" /> | value attribute | {"hidden": "f23g23g.b9u1bg91g.zv97"} |
{"class":"button.submit @data-v"} | <button class="submit" data-v="register-user">click</button> | data-v attribute with submit class | {"class": "register-user"} |
{"emails":"a[href^='mailto:'] @href"} | <a href="mailto:[email protected]">email 1</a><a href="mailto:[email protected]">email 2</a> | href attribute for links starting with mailto: | {"emails": ["[email protected]", "[email protected]"]} |
If you are interested in learning more, you can find a complete reference of CSS Selectors here.
import org.apache.hc.client5.http.fluent.Request;
public class APIRequest {
public static void main(final String... args) throws Exception {
String apiUrl = "https://api.zenrows.com/v1/?apikey=YOUR_KEY&url=https%3A%2F%2Fhttpbin.org%2Fanything&css_extractor=%7B%22links%22%3A%22a%20%40href%22%2C%20%22images%22%3A%22img%20%40src%22%7D";
String response = Request.get(apiUrl)
.execute().returnContent().asString();
System.out.println(response);
}
}
import java.net.URI;
import java.security.cert.X509Certificate;
import java.util.Base64;
import javax.net.ssl.*;
import org.apache.hc.client5.http.fluent.Request;
import org.apache.hc.core5.http.HttpHost;
public class ProxyRequest {
public static void main(final String... args) throws Exception {
ignoreCertWarning();
URI uri = new URI("http://YOUR_KEY:css_extractor=%7B%22links%22%3A%22a%20%40href%22%2C%20%22images%22%3A%22img%20%40src%22%7D@proxy.zenrows.com:8001");
String basicAuth = new String(Base64.getEncoder().encode(uri.getUserInfo().getBytes()));
String response = Request.get("https://httpbin.org/anything")
.addHeader("Proxy-Authorization", "Basic " + basicAuth)
.viaProxy(HttpHost.create(uri))
.execute().returnContent().asString();
System.out.println(response);
}
private static void ignoreCertWarning() {
SSLContext ctx = null;
TrustManager[] trustAllCerts = new X509TrustManager[] { new X509TrustManager() {
public X509Certificate[] getAcceptedIssuers() {return null;}
public void checkClientTrusted(X509Certificate[] certs, String authType) {}
public void checkServerTrusted(X509Certificate[] certs, String authType) {}
} };
try {
ctx = SSLContext.getInstance("SSL");
ctx.init(null, trustAllCerts, null);
SSLContext.setDefault(ctx);
} catch (Exception e) {}
}
}
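The encoded css_extractor value in the examples above is the JSON rule set after URL-encoding. The sketch below shows one way to produce it, assuming a hypothetical helper class named CssExtractorParam; the JSON writer is deliberately minimal (it only escapes backslashes and double quotes), so a JSON library would be a better fit for arbitrary input.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper for illustration; not part of any ZenRows SDK.
final class CssExtractorParam {
    // Serializes name -> selector rules as a flat JSON object.
    static String toJson(Map<String, String> rules) {
        StringJoiner json = new StringJoiner(", ", "{", "}");
        for (Map.Entry<String, String> rule : rules.entrySet()) {
            json.add(quote(rule.getKey()) + ":" + quote(rule.getValue()));
        }
        return json.toString();
    }

    // Minimal JSON string quoting: escapes backslashes and double quotes only.
    private static String quote(String text) {
        return "\"" + text.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // URLEncoder writes spaces as '+'; the examples in this document use %20.
    static String encode(String json) {
        return URLEncoder.encode(json, StandardCharsets.UTF_8).replace("+", "%20");
    }

    public static void main(String[] args) {
        Map<String, String> rules = new LinkedHashMap<>();
        rules.put("links", "a @href");
        rules.put("images", "img @src");
        System.out.println("&css_extractor=" + encode(toJson(rules)));
    }
}
```

A LinkedHashMap keeps the rules in insertion order, so the generated parameter is stable between runs.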
Data Extraction: Auto Parsing
ZenRows API will return the HTML of the URL by default. Enabling Autoparse uses our extraction algorithms to parse data in JSON format automatically.
Add &autoparse=true to the request for this feature.
import org.apache.hc.client5.http.fluent.Request;
public class APIRequest {
public static void main(final String... args) throws Exception {
String apiUrl = "https://api.zenrows.com/v1/?apikey=YOUR_KEY&url=https%3A%2F%2Fwww.amazon.com%2Fdp%2FB01LD5GO7I%2F&autoparse=true";
String response = Request.get(apiUrl)
.execute().returnContent().asString();
System.out.println(response);
}
}
import java.net.URI;
import java.security.cert.X509Certificate;
import java.util.Base64;
import javax.net.ssl.*;
import org.apache.hc.client5.http.fluent.Request;
import org.apache.hc.core5.http.HttpHost;
public class ProxyRequest {
public static void main(final String... args) throws Exception {
ignoreCertWarning();
URI uri = new URI("http://YOUR_KEY:autoparse=true@proxy.zenrows.com:8001");
String basicAuth = new String(Base64.getEncoder().encode(uri.getUserInfo().getBytes()));
String response = Request.get("https://www.amazon.com/dp/B01LD5GO7I/")
.addHeader("Proxy-Authorization", "Basic " + basicAuth)
.viaProxy(HttpHost.create(uri))
.execute().returnContent().asString();
System.out.println(response);
}
private static void ignoreCertWarning() {
SSLContext ctx = null;
TrustManager[] trustAllCerts = new X509TrustManager[] { new X509TrustManager() {
public X509Certificate[] getAcceptedIssuers() {return null;}
public void checkClientTrusted(X509Certificate[] certs, String authType) {}
public void checkServerTrusted(X509Certificate[] certs, String authType) {}
} };
try {
ctx = SSLContext.getInstance("SSL");
ctx.init(null, trustAllCerts, null);
SSLContext.setDefault(ctx);
} catch (Exception e) {}
}
}
POST / PUT Requests
Send POST / PUT requests as usual with your chosen language. ZenRows will transparently forward the data to the target site.
The return value will be the original response's content. Headers and cookies will also be part of the response. The way to access them will depend on the manner of calling.
import org.apache.hc.client5.http.fluent.Request;
import org.apache.hc.client5.http.fluent.Form;
public class APIRequest {
public static void main(final String... args) throws Exception {
String apiUrl = "https://api.zenrows.com/v1/?apikey=YOUR_KEY&url=https%3A%2F%2Fhttpbin.org%2Fanything";
String response = Request.post(apiUrl)
.bodyForm(Form.form()
.add("key1", "value1")
.add("key2", "value2")
.build())
.execute().returnContent().asString();
System.out.println(response);
}
}
import java.net.URI;
import org.apache.hc.client5.http.fluent.Form;
import java.security.cert.X509Certificate;
import java.util.Base64;
import javax.net.ssl.*;
import org.apache.hc.client5.http.fluent.Request;
import org.apache.hc.core5.http.HttpHost;
public class ProxyRequest {
public static void main(final String... args) throws Exception {
ignoreCertWarning();
URI uri = new URI("http://YOUR_KEY:@proxy.zenrows.com:8001");
String basicAuth = new String(Base64.getEncoder().encode(uri.getUserInfo().getBytes()));
String response = Request.post("https://httpbin.org/anything")
.bodyForm(Form.form()
.add("key1", "value1")
.add("key2", "value2")
.build())
.addHeader("Proxy-Authorization", "Basic " + basicAuth)
.viaProxy(HttpHost.create(uri))
.execute().returnContent().asString();
System.out.println(response);
}
private static void ignoreCertWarning() {
SSLContext ctx = null;
TrustManager[] trustAllCerts = new X509TrustManager[] { new X509TrustManager() {
public X509Certificate[] getAcceptedIssuers() {return null;}
public void checkClientTrusted(X509Certificate[] certs, String authType) {}
public void checkServerTrusted(X509Certificate[] certs, String authType) {}
} };
try {
ctx = SSLContext.getInstance("SSL");
ctx.init(null, trustAllCerts, null);
SSLContext.setDefault(ctx);
} catch (Exception e) {}
}
}
Download Files and Pictures
ZenRows will download images, PDFs or any type of file. Instead of reading the response's content as text, you can store it directly in a file.
There is a size limit and we don't recommend using ZenRows to download big files.
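Storing the response in a file means copying the body as raw bytes instead of decoding it as text. A minimal sketch, assuming a hypothetical helper class named FileDownload; the main method uses an in-memory stream as a stand-in for a real response body.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper for illustration; not part of any ZenRows SDK.
final class FileDownload {
    // Streams the body to the target file and returns the number of bytes written.
    static long save(InputStream body, Path target) {
        try (InputStream in = body) {
            return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in bytes; in practice this would be the response body stream.
        InputStream body = new ByteArrayInputStream(new byte[] {1, 2, 3, 4});
        Path target = Files.createTempFile("download", ".bin");
        System.out.println(save(body, target) + " bytes written to " + target);
    }
}
```

With the fluent client used in the examples in this document, the stream would presumably come from returnContent().asStream() rather than asString().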
Credits Usage
Check credits consumption programmatically by calling the endpoint /usage. Usage calls will not count for concurrency, and results are available in real-time.
import org.apache.hc.client5.http.fluent.Request;
public class APIRequest {
public static void main(final String... args) throws Exception {
String apiUrl = "https://api.zenrows.com/v1/usage?apikey=YOUR_KEY";
String response = Request.get(apiUrl)
.execute().returnContent().asString();
System.out.println(response);
}
}
Getting started
To make a request, you need two things:
- An API key
- The encoded URL you want to scrape
ZenRows offers both API and Proxy modes as a way of connection, plus SDKs for Python and Node.js, which make things easier for newcomers. You will find examples for all of them below.
Headers returned by the target website are prefixed with Zr-. ZenRows will also add a header Zr-Final-Url showing the final visited URL, which can change from the original in case of redirects.
Zr-Content-Encoding: gzip
Zr-Content-Type: text/html
Zr-Cookies: _pxhd=Bq7P4CRaW1B...
Zr-Final-Url: https://www.example.com/
API Key required
To access API functionality, you need to have a valid API Key. This unique key will keep all your requests authorized.
Start using the API by creating your API Key now.
<?php
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'https://api.zenrows.com/v1/?apikey=YOUR_KEY&url=https%3A%2F%2Fhttpbin.org%2Fanything');
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, 'GET');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$response = curl_exec($ch);
echo $response . PHP_EOL;
curl_close($ch);
?>
<?php
$url = 'https://httpbin.org/anything';
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_PROXY, 'http://YOUR_KEY:@proxy.zenrows.com:8001');
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, 'GET');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
$response = curl_exec($ch);
echo $response . PHP_EOL;
curl_close($ch);
?>
URL required
The URL is the page you want to scrape. It needs to be encoded.
<?php
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'https://api.zenrows.com/v1/?apikey=YOUR_KEY&url=https%3A%2F%2Fhttpbin.org%2Fanything');
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, 'GET');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$response = curl_exec($ch);
echo $response . PHP_EOL;
curl_close($ch);
?>
<?php
$url = 'https://httpbin.org/anything';
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_PROXY, 'http://YOUR_KEY:@proxy.zenrows.com:8001');
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, 'GET');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
$response = curl_exec($ch);
echo $response . PHP_EOL;
curl_close($ch);
?>
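Encoding matters most when the target URL has its own query string, because an unencoded & would be read as the start of a new API parameter. The sketch below is written in Java, like the Java examples elsewhere in this document, and assumes a hypothetical helper class named TargetUrlEncoder.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of any ZenRows SDK.
final class TargetUrlEncoder {
    // Percent-encodes the target URL so it can be used as the value of &url=.
    // URLEncoder writes spaces as '+'; they are rewritten as %20 here.
    static String encode(String targetUrl) {
        return URLEncoder.encode(targetUrl, StandardCharsets.UTF_8).replace("+", "%20");
    }

    public static void main(String[] args) {
        String target = "https://httpbin.org/anything?page=2&sort=asc";
        System.out.println("https://api.zenrows.com/v1/?apikey=YOUR_KEY&url=" + encode(target));
    }
}
```

After encoding, page=2&sort=asc travels inside the url value instead of being interpreted as two additional API parameters.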
JavaScript Rendering
Some websites rely heavily on JavaScript to load content. Enable this feature if you need to extract data that are loaded dynamically.
You can enable JavaScript by adding &js_render=true to the request. This request costs 5 credits.
<?php
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'https://api.zenrows.com/v1/?apikey=YOUR_KEY&url=https%3A%2F%2Fhttpbin.org%2Fanything&js_render=true');
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, 'GET');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$response = curl_exec($ch);
echo $response . PHP_EOL;
curl_close($ch);
?>
<?php
$url = 'https://httpbin.org/anything';
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_PROXY, 'http://YOUR_KEY:js_render=true@proxy.zenrows.com:8001');
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, 'GET');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
$response = curl_exec($ch);
echo $response . PHP_EOL;
curl_close($ch);
?>
Anti-bot
Some websites protect their content with anti-bot solutions such as Cloudflare, Akamai, or Datadome. Enable Anti-bot to bypass them. Bear in mind that adding custom headers might overwrite our configuration. To wait for the expected content to load, combine Anti-bot with the Wait For Selector feature (see next point).
Add &antibot=true to the request for this feature. This request costs 5 credits.
<?php
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'https://api.zenrows.com/v1/?apikey=YOUR_KEY&url=https%3A%2F%2Fhttpbin.org%2Fanything&antibot=true');
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, 'GET');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$response = curl_exec($ch);
echo $response . PHP_EOL;
curl_close($ch);
?>
<?php
$url = 'https://httpbin.org/anything';
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_PROXY, 'http://YOUR_KEY:antibot=true@proxy.zenrows.com:8001');
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, 'GET');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
$response = curl_exec($ch);
echo $response . PHP_EOL;
curl_close($ch);
?>
JavaScript Instructions
Interact with the page once the content is loaded. You can perform actions as a user would (e.g., click on an element), and ZenRows will execute them. Once the instructions finish, it will return the current HTML.
Following the click example, below are the instructions to click on a ".button-selector" element.
<?php
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'https://api.zenrows.com/v1/?apikey=YOUR_KEY&url=https%3A%2F%2Fhttpbin.org%2Fanything&js_render=true&js_instructions=%5B%7B%22click%22%3A%22.button-selector%22%7D%5D');
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, 'GET');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$response = curl_exec($ch);
echo $response . PHP_EOL;
curl_close($ch);
?>
<?php
$url = 'https://httpbin.org/anything';
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_PROXY, 'http://YOUR_KEY:js_render=true&js_instructions=%5B%7B%22click%22%3A%22.button-selector%22%7D%5D@proxy.zenrows.com:8001');
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, 'GET');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
$response = curl_exec($ch);
echo $response . PHP_EOL;
curl_close($ch);
?>
The original instructions are a JSON array containing the commands to run.
[{"click": ".button-selector"}]
They then need to be stringified and encoded. You can use our Builder or an online tool to encode them.
`[{"click":".button-selector"}]` // stringified
`%5B%7B%22click%22%3A%22.button-selector%22%7D%5D` // encoded
&js_instructions=[{...}] accepts an array of commands, and you can add as many as needed. ZenRows will execute them in order. Here is a summary of the actions you can run.
{"click": ".button-selector"} // Click on the first element that matches the CSS Selector
{"wait_for": ".late-selector"} // Wait for a given CSS Selector to load in the DOM
{"wait": 2000} // Wait an exact amount of time in ms
{"fill": [".input-selector", "value"]} // Fill in an input
{"check": ".checkbox-selector"} // Check a checkbox input
{"uncheck": ".checkbox-selector"} // Uncheck a checkbox input
{"select_option": [".select-selector", "option_value"]} // Select an option by its value
{"scroll_y": 1500} // Vertical scroll in pixels
{"scroll_x": 1500} // Horizontal scroll in pixels
{"evaluate": "document.body.style.backgroundColor = '#c4b5fd';"} // Execute JavaScript code
These instructions won't work inside iframes; those need another set. The syntax is similar but takes an extra parameter to choose the iframe.
For security, an iframe's content isn't returned in the response. To get that content, use frame_reveal. It will append a node with the content encoded in base64 to avoid problems with JS or HTML injection.
{"frame_click": ["#iframe", ".button-selector"]}
{"frame_wait_for": ["#iframe", ".late-selector"]}
{"frame_fill": ["#iframe", ".input-selector", "value"]}
{"frame_check": ["#iframe", ".checkbox-selector"]}
{"frame_uncheck": ["#iframe", ".checkbox-selector"]}
{"frame_select_option": ["#iframe", ".select-selector", "option_value"]}
{"frame_evaluate": ["iframe-name", "document.body.style.backgroundColor = '#c4b5fd';"]} // won't work with selectors, will match iframe's name or URL
{"frame_reveal": "#iframe"} // will create a node with the class "iframe-content-element"
Requires JavaScript rendering (&js_render=true).
Visit our JavaScript Instructions guide for a detailed explanation of each action and usage examples.
Wait For Selector
Sometimes you may want to wait for a given CSS Selector to load in the DOM before ZenRows returns the content. You can get this behaviour by adding the &wait_for=.background-load parameter to the request.
Requires JavaScript rendering (&js_render=true).
<?php
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'https://api.zenrows.com/v1/?apikey=YOUR_KEY&url=https%3A%2F%2Fhttpbin.org%2Fanything&js_render=true&wait_for=.content');
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, 'GET');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$response = curl_exec($ch);
echo $response . PHP_EOL;
curl_close($ch);
?>
<?php
$url = 'https://httpbin.org/anything';
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_PROXY, 'http://YOUR_KEY:js_render=true&wait_for=.content@proxy.zenrows.com:8001');
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, 'GET');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
$response = curl_exec($ch);
echo $response . PHP_EOL;
curl_close($ch);
?>
Wait Milliseconds
Some websites take a long time to load. If you need to wait a fixed amount of time until everything is loaded, you can define the time in milliseconds with the &wait=10000 parameter, which will wait 10000 milliseconds (10 seconds) before returning the HTML. The maximum wait time is 30 seconds.
Requires JavaScript rendering (&js_render=true).
<?php
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'https://api.zenrows.com/v1/?apikey=YOUR_KEY&url=https%3A%2F%2Fhttpbin.org%2Fanything&js_render=true&wait=10000');
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, 'GET');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$response = curl_exec($ch);
echo $response . PHP_EOL;
curl_close($ch);
?>
<?php
$url = 'https://httpbin.org/anything';
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_PROXY, 'http://YOUR_KEY:js_render=true&wait=10000@proxy.zenrows.com:8001');
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, 'GET');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
$response = curl_exec($ch);
echo $response . PHP_EOL;
curl_close($ch);
?>
Block Resources
Many websites load dozens of resources, delaying the HTML response. You can block specific resources from loading using the &block_resources=image parameter.
ZenRows API allows you to block the following resources: stylesheet, image, media, font, script, texttrack, xhr, fetch, eventsource, websocket, manifest, other. Separate them with commas to block multiple resources.
ZenRows will block certain resources by default, such as stylesheets or images, to speed up your scraping. You can disable blocking by setting it to "none": block_resources=none.
Requires JavaScript rendering (&js_render=true).
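As a rough sketch in Go, the comma-separated value can be built from a list and checked against the resource types named above before it is sent. The helper name blockResourcesValue and the allowedResources map are illustrative and not part of any ZenRows SDK.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// allowedResources lists the resource types named in the documentation.
// The special value "none" is not included because it is used on its own.
var allowedResources = map[string]bool{
	"stylesheet": true, "image": true, "media": true, "font": true,
	"script": true, "texttrack": true, "xhr": true, "fetch": true,
	"eventsource": true, "websocket": true, "manifest": true, "other": true,
}

// blockResourcesValue validates the given types and joins them with commas.
func blockResourcesValue(types []string) (string, error) {
	for _, t := range types {
		if !allowedResources[t] {
			return "", fmt.Errorf("unknown resource type %q", t)
		}
	}
	return strings.Join(types, ","), nil
}

func main() {
	value, err := blockResourcesValue([]string{"image", "media", "font"})
	if err != nil {
		panic(err)
	}
	q := url.Values{}
	q.Set("js_render", "true")
	q.Set("block_resources", value)
	// Encode percent-encodes the commas and sorts the keys.
	fmt.Println(q.Encode()) // block_resources=image%2Cmedia%2Cfont&js_render=true
}
```

Validating on the client side is optional; it only catches typos before a request is spent.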
<?php
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'https://api.zenrows.com/v1/?apikey=YOUR_KEY&url=https%3A%2F%2Fhttpbin.org%2Fanything&js_render=true&block_resources=image%2Cmedia%2Cfont');
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, 'GET');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$response = curl_exec($ch);
echo $response . PHP_EOL;
curl_close($ch);
?>
<?php
$url = 'https://httpbin.org/anything';
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_PROXY, 'http://YOUR_KEY:js_render=true&block_resources=image%2Cmedia%2Cfont@proxy.zenrows.com:8001');
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, 'GET');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
$response = curl_exec($ch);
echo $response . PHP_EOL;
curl_close($ch);
?>
JSON Response
Add &json_response=true to the request to get the content in JSON, including XHR or Fetch requests. The JSON object has two fields, html and xhr.
- HTML will contain the content of the page. You'll have to decode it since it will be encoded in JSON.
- XHR will be an array with one object per performed request. Each object contains the URL, body, status code, and more. See the example below.
Requires JavaScript rendering (&js_render=true).
<?php
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'https://api.zenrows.com/v1/?apikey=YOUR_KEY&url=https%3A%2F%2Fhttpbin.org%2Fanything&js_render=true&json_response=true');
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, 'GET');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$response = curl_exec($ch);
echo $response . PHP_EOL;
curl_close($ch);
?>
<?php
$url = 'https://httpbin.org/anything';
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_PROXY, 'http://YOUR_KEY:js_render=true&json_response=true@proxy.zenrows.com:8001');
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, 'GET');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
$response = curl_exec($ch);
echo $response . PHP_EOL;
curl_close($ch);
?>
And the response will look like this:
{
"html": "<!DOCTYPE html><html>...</html>",
"xhr": [{
"url": "https://www.example.com/fetch",
"body": "{\"success\": true}\n",
"status_code": 200,
"method": "GET",
"headers": {
"content-encoding": "gzip",
// ...
},
"request_headers": {
"accept": "*/*",
// ...
}
}]
}
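A response of this shape could be decoded in Go roughly as follows. The struct and function names are illustrative, only the fields shown in the sample are mapped, and header values are assumed to be plain strings as in the sample.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// xhrEntry mirrors the fields shown in the sample response.
// Any other fields in the real response are ignored by the decoder.
type xhrEntry struct {
	URL            string            `json:"url"`
	Body           string            `json:"body"`
	StatusCode     int               `json:"status_code"`
	Method         string            `json:"method"`
	Headers        map[string]string `json:"headers"`
	RequestHeaders map[string]string `json:"request_headers"`
}

// jsonResponse holds the two top-level fields, html and xhr.
type jsonResponse struct {
	HTML string     `json:"html"`
	XHR  []xhrEntry `json:"xhr"`
}

// parseJSONResponse decodes the body returned when json_response=true is set.
func parseJSONResponse(data []byte) (jsonResponse, error) {
	var r jsonResponse
	err := json.Unmarshal(data, &r)
	return r, err
}

func main() {
	// Hand-written sample in the documented shape, not real API output.
	sample := []byte(`{"html":"<!DOCTYPE html><html></html>","xhr":[{"url":"https://www.example.com/fetch","body":"{\"success\": true}\n","status_code":200,"method":"GET","headers":{"content-encoding":"gzip"},"request_headers":{"accept":"*/*"}}]}`)
	r, err := parseJSONResponse(sample)
	if err != nil {
		panic(err)
	}
	for _, x := range r.XHR {
		fmt.Println(x.Method, x.URL, x.StatusCode)
	}
}
```

Decoding the JSON also takes care of unescaping the html field, so r.HTML holds the page markup as a normal string.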
Window Width/Height
If you need to change the browser's window width and height, you can use the &window_width=1920 and &window_height=1080 parameters.
Requires JavaScript rendering (&js_render=true).
<?php
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'https://api.zenrows.com/v1/?apikey=YOUR_KEY&url=https%3A%2F%2Fhttpbin.org%2Fanything&js_render=true&window_width=1920&window_height=1080');
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, 'GET');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$response = curl_exec($ch);
echo $response . PHP_EOL;
curl_close($ch);
?>
<?php
$url = 'https://httpbin.org/anything';
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_PROXY, 'http://YOUR_KEY:js_render=true&window_width=1920&window_height=1080@proxy.zenrows.com:8001');
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, 'GET');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
$response = curl_exec($ch);
echo $response . PHP_EOL;
curl_close($ch);
?>
Premium Proxies
Some websites are harder to scrape and block datacenter IPs. Premium Proxies come in handy to solve this problem. As the name suggests, these proxies come straight from ISPs.
You can use Premium Proxies by adding &premium_proxy=true to the request. This request costs 10 credits.
<?php
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'https://api.zenrows.com/v1/?apikey=YOUR_KEY&url=https%3A%2F%2Fhttpbin.org%2Fanything&premium_proxy=true');
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, 'GET');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$response = curl_exec($ch);
echo $response . PHP_EOL;
curl_close($ch);
?>
<?php
$url = 'https://httpbin.org/anything';
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_PROXY, 'http://YOUR_KEY:premium_proxy=true@proxy.zenrows.com:8001');
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, 'GET');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
$response = curl_exec($ch);
echo $response . PHP_EOL;
curl_close($ch);
?>
Geolocation
Some content is specific to a region. In these cases, you may want to make your request from a given country.
You only need to add &premium_proxy=true&proxy_country=us to the request. Geolocation requires Premium Proxies to be enabled (it costs 10-25 credits).
<?php
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'https://api.zenrows.com/v1/?apikey=YOUR_KEY&url=https%3A%2F%2Fhttpbin.org%2Fanything&premium_proxy=true&proxy_country=us');
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, 'GET');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$response = curl_exec($ch);
echo $response . PHP_EOL;
curl_close($ch);
?>
<?php
$url = 'https://httpbin.org/anything';
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_PROXY, 'http://YOUR_KEY:premium_proxy=true&proxy_country=us@proxy.zenrows.com:8001');
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, 'GET');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
$response = curl_exec($ch);
echo $response . PHP_EOL;
curl_close($ch);
?>
Custom Headers
Custom Headers come in handy when you need to add your own headers (user agents, cookies, referrer, etc.) to the request.
You can enable Custom Headers by adding &custom_headers=true to the request.
<?php
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'https://api.zenrows.com/v1/?apikey=YOUR_KEY&url=https%3A%2F%2Fhttpbin.org%2Fanything&custom_headers=true');
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, 'GET');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_HTTPHEADER, [
'Referer: https://www.google.com',
]);
$response = curl_exec($ch);
echo $response . PHP_EOL;
curl_close($ch);
?>
<?php
$url = 'https://httpbin.org/anything';
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_PROXY, 'http://YOUR_KEY:custom_headers=true@proxy.zenrows.com:8001');
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, 'GET');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_HTTPHEADER, [
'Referer: https://www.google.com',
]);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
$response = curl_exec($ch);
echo $response . PHP_EOL;
curl_close($ch);
?>
Session ID
Use the same IP for each API Request by using &session_id=12345. ZenRows will maintain a session for each ID for 10 minutes.
You will need to keep track of them on your side by storing each Session ID so you can reuse them.
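One way to keep track of Session IDs on your side is a small in-memory store, sketched here in Go. The type and method names are illustrative, and the sketch assumes the 10 minutes are counted from the first use of an ID, which the text above does not specify.

```go
package main

import (
	"fmt"
	"time"
)

// sessionTTLSeconds is the 10-minute session lifetime, in seconds.
const sessionTTLSeconds = 10 * 60

type session struct {
	id        int
	startedAt int64 // Unix time of first use
}

// sessionStore hands out one Session ID per key (for example, per target site)
// and replaces it once the lifetime has passed.
type sessionStore struct {
	nextID   int
	sessions map[string]session
}

func newSessionStore() *sessionStore {
	return &sessionStore{nextID: 1, sessions: make(map[string]session)}
}

// idFor returns the stored ID for key while it is still valid,
// otherwise it allocates and stores a new one.
func (s *sessionStore) idFor(key string, nowUnix int64) int {
	if cur, ok := s.sessions[key]; ok && nowUnix-cur.startedAt < sessionTTLSeconds {
		return cur.id
	}
	id := s.nextID
	s.nextID++
	s.sessions[key] = session{id: id, startedAt: nowUnix}
	return id
}

func main() {
	store := newSessionStore()
	id := store.idFor("httpbin.org", time.Now().Unix())
	// The ID would then be sent as &session_id=<id> on each request.
	fmt.Printf("session_id=%d\n", id)
}
```

The store is not safe for concurrent use; a mutex would be needed if several goroutines share it.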
<?php
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'https://api.zenrows.com/v1/?apikey=YOUR_KEY&url=https%3A%2F%2Fhttpbin.org%2Fanything&session_id=12345');
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, 'GET');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$response = curl_exec($ch);
echo $response . PHP_EOL;
curl_close($ch);
?>
<?php
$url = 'https://httpbin.org/anything';
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_PROXY, 'http://YOUR_KEY:session_id=12345@proxy.zenrows.com:8001');
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, 'GET');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
$response = curl_exec($ch);
echo $response . PHP_EOL;
curl_close($ch);
?>
Device
If you need to use either desktop or mobile user agents in the headers, you can use the &device=desktop or &device=mobile parameter in the request.
<?php
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'https://api.zenrows.com/v1/?apikey=YOUR_KEY&url=https%3A%2F%2Fhttpbin.org%2Fanything&device=desktop');
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, 'GET');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$response = curl_exec($ch);
echo $response . PHP_EOL;
curl_close($ch);
?>
<?php
$url = 'https://httpbin.org/anything';
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_PROXY, 'http://YOUR_KEY:device=desktop@proxy.zenrows.com:8001');
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, 'GET');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
$response = curl_exec($ch);
echo $response . PHP_EOL;
curl_close($ch);
?>
Original HTTP Code
ZenRows API returns HTTP Codes depending on the result of the request. If you want to return the status code provided by the website, enable &original_status=true.
<?php
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'https://api.zenrows.com/v1/?apikey=YOUR_KEY&url=https%3A%2F%2Fhttpbin.org%2Fanything&original_status=true');
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, 'GET');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$response = curl_exec($ch);
echo $response . PHP_EOL;
curl_close($ch);
?>
<?php
$url = 'https://httpbin.org/anything';
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_PROXY, 'http://YOUR_KEY:original_status=true@proxy.zenrows.com:8001');
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, 'GET');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
$response = curl_exec($ch);
echo $response . PHP_EOL;
curl_close($ch);
?>
Data Extraction: CSS Selectors
You can use CSS Selectors for data extraction. Add &css_extractor={"links":"a @href"} to the request to use this feature.
The table below shows some examples of extraction rules and their output.
extraction rules | sample html | value | json output |
---|---|---|---|
{"divs":"div"} | <div>text0</div> | text | {"divs": "text0"} |
{"divs":"div"} | <div>text1</div><div>text2</div> | text | {"divs": ["text1", "text2"]} |
{"links":"a @href"} | <a href="#register">Register</a> | href attribute | {"links": "#register"} |
{"hidden":"input[type=hidden] @value"} | <input type="hidden" name="_token" value="f23g23g.b9u1bg91g.zv97" /> | value attribute | {"hidden": "f23g23g.b9u1bg91g.zv97"} |
{"class":"button.submit @data-v"} | <button class="submit" data-v="register-user">click</button> | data-v attribute with submit class | {"class": "register-user"} |
{"emails":"a[href^='mailto:'] @href"} | <a href="mailto:[email protected]">email 1</a><a href="mailto:[email protected]">email 2</a> | href attribute for links starting with mailto: | {"emails": ["[email protected]", "[email protected]"]} |
If you are interested in learning more, you can find a complete reference of CSS Selectors here.
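As the first two table rows show, a field in the JSON output is a single string when one element matches and an array when several do. A minimal Go sketch that normalizes both cases into a slice is shown below; the function name parseExtraction is illustrative.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// parseExtraction decodes css_extractor output and returns every field
// as a slice, whether the API sent a single string or an array of strings.
func parseExtraction(data []byte) (map[string][]string, error) {
	var raw map[string]json.RawMessage
	if err := json.Unmarshal(data, &raw); err != nil {
		return nil, err
	}
	out := make(map[string][]string, len(raw))
	for key, msg := range raw {
		// Try the array form first, then fall back to a single string.
		var many []string
		if err := json.Unmarshal(msg, &many); err == nil {
			out[key] = many
			continue
		}
		var one string
		if err := json.Unmarshal(msg, &one); err != nil {
			return nil, fmt.Errorf("field %q: %w", key, err)
		}
		out[key] = []string{one}
	}
	return out, nil
}

func main() {
	// Both inputs are taken from the "json output" column of the table.
	for _, body := range []string{
		`{"divs": "text0"}`,
		`{"divs": ["text1", "text2"]}`,
	} {
		fields, err := parseExtraction([]byte(body))
		if err != nil {
			panic(err)
		}
		fmt.Println(len(fields["divs"]), fields["divs"])
	}
}
```

Normalizing up front means the calling code can always loop over the values without checking the type first.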
<?php
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'https://api.zenrows.com/v1/?apikey=YOUR_KEY&url=https%3A%2F%2Fhttpbin.org%2Fanything&css_extractor=%7B%22links%22%3A%22a%20%40href%22%2C%20%22images%22%3A%22img%20%40src%22%7D');
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, 'GET');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$response = curl_exec($ch);
echo $response . PHP_EOL;
curl_close($ch);
?>
<?php
$url = 'https://httpbin.org/anything';
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_PROXY, 'http://YOUR_KEY:css_extractor=%7B%22links%22%3A%22a%20%40href%22%2C%20%22images%22%3A%22img%20%40src%22%7D@proxy.zenrows.com:8001');
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, 'GET');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
$response = curl_exec($ch);
echo $response . PHP_EOL;
curl_close($ch);
?>
Data Extraction: Auto Parsing
ZenRows API will return the HTML of the URL by default. Enabling Autoparse uses our extraction algorithms to parse data in JSON format automatically.
Add &autoparse=true to the request for this feature.
<?php
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'https://api.zenrows.com/v1/?apikey=YOUR_KEY&url=https%3A%2F%2Fwww.amazon.com%2Fdp%2FB01LD5GO7I%2F&autoparse=true');
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, 'GET');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$response = curl_exec($ch);
echo $response . PHP_EOL;
curl_close($ch);
?>
<?php
$url = 'https://www.amazon.com/dp/B01LD5GO7I/';
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_PROXY, 'http://YOUR_KEY:autoparse=true@proxy.zenrows.com:8001');
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, 'GET');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
$response = curl_exec($ch);
echo $response . PHP_EOL;
curl_close($ch);
?>
POST / PUT Requests
Send POST / PUT requests as usual with your chosen language. ZenRows will transparently forward the data to the target site.
The return value will be the original response's content. Headers and cookies will also be part of the response. The way to access them will depend on the manner of calling.
<?php
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'https://api.zenrows.com/v1/?apikey=YOUR_KEY&url=https%3A%2F%2Fhttpbin.org%2Fanything');
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, 'POST');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$postBody = [
'key1' => 'value1',
'key2' => 'value2',
];
curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query($postBody));
$response = curl_exec($ch);
echo $response . PHP_EOL;
curl_close($ch);
?>
<?php
$url = 'https://httpbin.org/anything';
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_PROXY, 'http://YOUR_KEY:@proxy.zenrows.com:8001');
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, 'POST');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$postBody = [
'key1' => 'value1',
'key2' => 'value2',
];
curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query($postBody));
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
$response = curl_exec($ch);
echo $response . PHP_EOL;
curl_close($ch);
?>
Download Files and Pictures
ZenRows will download images, PDFs or any type of file. Instead of reading the response's content as text, you can store it directly in a file.
There is a size limit and we don't recommend using ZenRows to download big files.
Credits Usage
Check credits consumption programmatically by calling the endpoint /usage. Usage calls will not count for concurrency, and results are available in real-time.
<?php
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'https://api.zenrows.com/v1/usage?apikey=YOUR_KEY');
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, 'GET');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$response = curl_exec($ch);
echo $response . PHP_EOL;
curl_close($ch);
?>
Getting started
- An API key
- The encoded URL you want to scrape
ZenRows offers both API and Proxy modes as a way of connecting, plus SDKs for Python and Node.js, which make things easier for newcomers. You will find examples for all of them below.
Headers from the target site are returned with the prefix Zr-. ZenRows will also add a header Zr-Final-Url showing the final visited URL, which can change from the original in case of redirects.
Zr-Content-Encoding: gzip
Zr-Content-Type: text/html
Zr-Cookies: _pxhd=Bq7P4CRaW1B...
Zr-Final-Url: https://www.example.com/
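A short Go sketch of how the Zr- headers might be collected from a response is given below. The helper name zrHeaders is illustrative; with a real request you would pass resp.Header instead of the hand-built sample.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// zrHeaders returns the headers whose names start with "Zr-",
// keyed by the name with that prefix removed.
func zrHeaders(h http.Header) map[string]string {
	out := make(map[string]string)
	for name, values := range h {
		if strings.HasPrefix(name, "Zr-") && len(values) > 0 {
			out[strings.TrimPrefix(name, "Zr-")] = values[0]
		}
	}
	return out
}

func main() {
	// Sample values copied from the header listing above.
	h := http.Header{}
	h.Set("Zr-Content-Type", "text/html")
	h.Set("Zr-Final-Url", "https://www.example.com/")
	h.Set("Content-Length", "1234")

	headers := zrHeaders(h)
	fmt.Println("final URL:", headers["Final-Url"])
	fmt.Println("content type:", headers["Content-Type"])
}
```

Go stores header names in canonical form, so a header sent as zr-final-url is still found under Zr-Final-Url.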
API Key required
To access API functionality, you need to have a valid API Key. This unique key will keep all your requests authorized.
Start using the API by creating your API Key now.
package main
import (
"io"
"log"
"net/http"
)
func main() {
client := &http.Client{}
req, err := http.NewRequest("GET", "https://api.zenrows.com/v1/?apikey=YOUR_KEY&url=https%3A%2F%2Fhttpbin.org%2Fanything", nil)
resp, err := client.Do(req)
if err != nil {
log.Fatalln(err)
}
defer resp.Body.Close()
body, err := io.ReadAll(resp.Body)
if err != nil {
log.Fatalln(err)
}
log.Println(string(body))
}
package main
import (
"crypto/tls"
"io"
"log"
"net/http"
"net/url"
"time"
)
func main() {
inputUrl := "https://httpbin.org/anything"
proxy, _ := url.Parse("http://YOUR_KEY:@proxy.zenrows.com:8001")
httpClient := &http.Client{
Timeout: 60 * time.Second,
Transport: &http.Transport{
Proxy: http.ProxyURL(proxy),
TLSClientConfig: &tls.Config{InsecureSkipVerify: true},
},
}
req, err := http.NewRequest("GET", inputUrl, nil)
resp, err := httpClient.Do(req)
if err != nil {
log.Fatalln(err)
}
defer resp.Body.Close()
body, err := io.ReadAll(resp.Body)
if err != nil {
log.Fatalln(err)
}
log.Println(string(body))
}
URL required
The URL is the page you want to scrape. It needs to be encoded.
package main
import (
"io"
"log"
"net/http"
)
func main() {
client := &http.Client{}
req, err := http.NewRequest("GET", "https://api.zenrows.com/v1/?apikey=YOUR_KEY&url=https%3A%2F%2Fhttpbin.org%2Fanything", nil)
resp, err := client.Do(req)
if err != nil {
log.Fatalln(err)
}
defer resp.Body.Close()
body, err := io.ReadAll(resp.Body)
if err != nil {
log.Fatalln(err)
}
log.Println(string(body))
}
package main
import (
"crypto/tls"
"io"
"log"
"net/http"
"net/url"
"time"
)
func main() {
inputUrl := "https://httpbin.org/anything"
proxy, _ := url.Parse("http://YOUR_KEY:@proxy.zenrows.com:8001")
httpClient := &http.Client{
Timeout: 60 * time.Second,
Transport: &http.Transport{
Proxy: http.ProxyURL(proxy),
TLSClientConfig: &tls.Config{InsecureSkipVerify: true},
},
}
req, err := http.NewRequest("GET", inputUrl, nil)
resp, err := httpClient.Do(req)
if err != nil {
log.Fatalln(err)
}
defer resp.Body.Close()
body, err := io.ReadAll(resp.Body)
if err != nil {
log.Fatalln(err)
}
log.Println(string(body))
}
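The request URLs in these examples are encoded by hand. As a sketch, the same URL can be assembled in Go with net/url, which percent-encodes the target URL for you. The helper name buildAPIURL and its extra parameter map are illustrative.

```go
package main

import (
	"fmt"
	"net/url"
)

// buildAPIURL assembles an API request URL. url.Values.Encode percent-encodes
// every value, so the target URL is passed in its plain form.
func buildAPIURL(apiKey, target string, extra map[string]string) string {
	q := url.Values{}
	q.Set("apikey", apiKey)
	q.Set("url", target)
	for k, v := range extra {
		q.Set(k, v)
	}
	return "https://api.zenrows.com/v1/?" + q.Encode()
}

func main() {
	fmt.Println(buildAPIURL("YOUR_KEY", "https://httpbin.org/anything", nil))
	// https://api.zenrows.com/v1/?apikey=YOUR_KEY&url=https%3A%2F%2Fhttpbin.org%2Fanything
}
```

Encode sorts parameters by name, so additional parameters may appear in a different order than in the hand-written examples; the order does not change the meaning of the query string.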
JavaScript Rendering
Some websites rely heavily on JavaScript to load content. Enable this feature if you need to extract data that are loaded dynamically.
You can enable JavaScript by adding &js_render=true to the request. This request costs 5 credits.
package main
import (
"io"
"log"
"net/http"
)
func main() {
client := &http.Client{}
req, err := http.NewRequest("GET", "https://api.zenrows.com/v1/?apikey=YOUR_KEY&url=https%3A%2F%2Fhttpbin.org%2Fanything&js_render=true", nil)
resp, err := client.Do(req)
if err != nil {
log.Fatalln(err)
}
defer resp.Body.Close()
body, err := io.ReadAll(resp.Body)
if err != nil {
log.Fatalln(err)
}
log.Println(string(body))
}
package main
import (
"crypto/tls"
"io"
"log"
"net/http"
"net/url"
"time"
)
func main() {
inputUrl := "https://httpbin.org/anything"
proxy, _ := url.Parse("http://YOUR_KEY:js_render=true@proxy.zenrows.com:8001")
httpClient := &http.Client{
Timeout: 60 * time.Second,
Transport: &http.Transport{
Proxy: http.ProxyURL(proxy),
TLSClientConfig: &tls.Config{InsecureSkipVerify: true},
},
}
req, err := http.NewRequest("GET", inputUrl, nil)
resp, err := httpClient.Do(req)
if err != nil {
log.Fatalln(err)
}
defer resp.Body.Close()
body, err := io.ReadAll(resp.Body)
if err != nil {
log.Fatalln(err)
}
log.Println(string(body))
}
Anti-bot
Some websites protect their content with anti-bot solutions such as Cloudflare, Akamai, or Datadome. Enable Anti-bot to bypass them. Bear in mind that adding custom headers might overwrite our configuration. To wait for the expected content to load, combine Anti-bot with the Wait For Selector feature described below.
Add &antibot=true to the request for this feature. This request costs 5 credits.
package main
import (
"io"
"log"
"net/http"
)
func main() {
client := &http.Client{}
req, err := http.NewRequest("GET", "https://api.zenrows.com/v1/?apikey=YOUR_KEY&url=https%3A%2F%2Fhttpbin.org%2Fanything&antibot=true", nil)
resp, err := client.Do(req)
if err != nil {
log.Fatalln(err)
}
defer resp.Body.Close()
body, err := io.ReadAll(resp.Body)
if err != nil {
log.Fatalln(err)
}
log.Println(string(body))
}
package main
import (
"crypto/tls"
"io"
"log"
"net/http"
"net/url"
"time"
)
func main() {
inputUrl := "https://httpbin.org/anything"
proxy, _ := url.Parse("http://YOUR_KEY:antibot=true@proxy.zenrows.com:8001")
httpClient := &http.Client{
Timeout: 60 * time.Second,
Transport: &http.Transport{
Proxy: http.ProxyURL(proxy),
TLSClientConfig: &tls.Config{InsecureSkipVerify: true},
},
}
req, err := http.NewRequest("GET", inputUrl, nil)
resp, err := httpClient.Do(req)
if err != nil {
log.Fatalln(err)
}
defer resp.Body.Close()
body, err := io.ReadAll(resp.Body)
if err != nil {
log.Fatalln(err)
}
log.Println(string(body))
}
Headless: JavaScript Instructions
Interact with the page once the content is loaded. You can perform actions as a user would (e.g., click on an element), and ZenRows will execute them. Once the instructions finish, it will return the current HTML.
Following the click example, below are the instructions to click on a ".button-selector" element.
package main
import (
"io"
"log"
"net/http"
)
func main() {
client := &http.Client{}
req, err := http.NewRequest("GET", "https://api.zenrows.com/v1/?apikey=YOUR_KEY&url=https%3A%2F%2Fhttpbin.org%2Fanything&js_render=true&js_instructions=%5B%7B%22click%22%3A%22.button-selector%22%7D%5D", nil)
resp, err := client.Do(req)
if err != nil {
log.Fatalln(err)
}
defer resp.Body.Close()
body, err := io.ReadAll(resp.Body)
if err != nil {
log.Fatalln(err)
}
log.Println(string(body))
}
package main
import (
"crypto/tls"
"io"
"log"
"net/http"
"net/url"
"time"
)
func main() {
inputUrl := "https://httpbin.org/anything"
proxy, _ := url.Parse("http://YOUR_KEY:js_render=true&js_instructions=%5B%7B%22click%22%3A%22.button-selector%22%7D%5D@proxy.zenrows.com:8001")
httpClient := &http.Client{
Timeout: 60 * time.Second,
Transport: &http.Transport{
Proxy: http.ProxyURL(proxy),
TLSClientConfig: &tls.Config{InsecureSkipVerify: true},
},
}
req, err := http.NewRequest("GET", inputUrl, nil)
resp, err := httpClient.Do(req)
if err != nil {
log.Fatalln(err)
}
defer resp.Body.Close()
body, err := io.ReadAll(resp.Body)
if err != nil {
log.Fatalln(err)
}
log.Println(string(body))
}
The original instructions are a JSON array containing the commands to run.
[{"click": ".button-selector"}]
They then need to be stringified and encoded. You can use our Builder or an online tool to encode them.
`[{"click":".button-selector"}]` // stringified
`%5B%7B%22click%22%3A%22.button-selector%22%7D%5D` // encoded
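The stringify-and-encode step can also be done in code. The Go sketch below uses encoding/json and net/url from the standard library; the helper name encodeInstructions is illustrative.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/url"
)

// encodeInstructions stringifies the instructions as compact JSON and then
// percent-encodes the result for use as the js_instructions value.
func encodeInstructions(instructions []map[string]any) (string, error) {
	raw, err := json.Marshal(instructions)
	if err != nil {
		return "", err
	}
	return url.QueryEscape(string(raw)), nil
}

func main() {
	encoded, err := encodeInstructions([]map[string]any{
		{"click": ".button-selector"},
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(encoded) // %5B%7B%22click%22%3A%22.button-selector%22%7D%5D
}
```

Keeping each instruction as its own single-key map in a slice preserves the execution order, since the slice is serialized in the order it was written.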
&js_instructions=[{...}] accepts an array of commands, and you can add as many as needed. ZenRows will execute them in order. Here is a summary of the actions you can run.
{"click": ".button-selector"} // Click on the first element that matches the CSS Selector
{"wait_for": ".late-selector"} // Wait for a given CSS Selector to load in the DOM
{"wait": 2000} // Wait an exact amount of time in ms
{"fill": [".input-selector", "value"]} // Fill in an input
{"check": ".checkbox-selector"} // Check a checkbox input
{"uncheck": ".checkbox-selector"} // Uncheck a checkbox input
{"select_option": [".select-selector", "option_value"]} // Select an option by its value
{"scroll_y": 1500} // Vertical scroll in pixels
{"scroll_x": 1500} // Horizontal scroll in pixels
{"evaluate": "document.body.style.backgroundColor = '#c4b5fd';"} // Execute JavaScript code
These instructions won't work inside iframes; those need another set. The syntax is similar but takes an extra parameter to choose the iframe.
For security, an iframe's content isn't returned in the response. To get that content, use frame_reveal. It will append a node with the content encoded in base64 to avoid problems with JS or HTML injection.
{"frame_click": ["#iframe", ".button-selector"]}
{"frame_wait_for": ["#iframe", ".late-selector"]}
{"frame_fill": ["#iframe", ".input-selector", "value"]}
{"frame_check": ["#iframe", ".checkbox-selector"]}
{"frame_uncheck": ["#iframe", ".checkbox-selector"]}
{"frame_select_option": ["#iframe", ".select-selector", "option_value"]}
{"frame_evaluate": ["iframe-name", "document.body.style.backgroundColor = '#c4b5fd';"]} // won't work with selectors, will match iframe's name or URL
{"frame_reveal": "#iframe"} // will create a node with the class "iframe-content-element"
Requires javascript rendering (&js_render=true).
Visit our JavaScript Instructions guide for a detailed explanation for each action and usage examples.
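Because the instructions run in order, a slice is a natural way to hold them. The sketch below assembles a complete API request URL from several steps, including one iframe action. It assumes the endpoint and parameter names used in the examples on this page; `buildInstructionsURL` is a hypothetical helper, and the code only builds and prints the URL without sending a request.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/url"
)

// buildInstructionsURL is a hypothetical helper that returns an API request
// URL carrying js_render=true and the given instructions. The slice keeps
// the steps in the order they should be executed.
func buildInstructionsURL(apiKey, target string, steps []map[string]any) (string, error) {
	raw, err := json.Marshal(steps)
	if err != nil {
		return "", err
	}
	params := url.Values{}
	params.Set("apikey", apiKey)
	params.Set("url", target)
	params.Set("js_render", "true") // instructions require JavaScript rendering
	params.Set("js_instructions", string(raw))
	// Encode percent-encodes every value and sorts the keys alphabetically.
	return "https://api.zenrows.com/v1/?" + params.Encode(), nil
}

func main() {
	steps := []map[string]any{
		{"fill": []string{".input-selector", "value"}},
		{"click": ".button-selector"},
		{"wait_for": ".late-selector"},
		{"frame_reveal": "#iframe"},
	}
	u, err := buildInstructionsURL("YOUR_KEY", "https://httpbin.org/anything", steps)
	if err != nil {
		panic(err)
	}
	fmt.Println(u)
}
```

Query parameter order does not matter to an HTTP server, so the alphabetical ordering produced by `url.Values.Encode` is equivalent to the hand-written URLs shown elsewhere.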
Headless: Wait For Selector
Sometimes you may want to wait for a given CSS Selector to load in the DOM before ZenRows returns the content. You can get this behaviour by adding the &wait_for=.background-load parameter to the request.
Requires javascript rendering (&js_render=true).
package main
import (
"io"
"log"
"net/http"
)
func main() {
client := &http.Client{}
req, err := http.NewRequest("GET", "https://api.zenrows.com/v1/?apikey=YOUR_KEY&url=https%3A%2F%2Fhttpbin.org%2Fanything&js_render=true&wait_for=.content", nil)
resp, err := client.Do(req)
if err != nil {
log.Fatalln(err)
}
defer resp.Body.Close()
body, err := io.ReadAll(resp.Body)
if err != nil {
log.Fatalln(err)
}
log.Println(string(body))
}
package main
import (
"crypto/tls"
"io"
"log"
"net/http"
"net/url"
"time"
)
func main() {
inputUrl := "https://httpbin.org/anything"
proxy, _ := url.Parse("http://YOUR_KEY:js_render=true&wait_for=.content@proxy.zenrows.com:8001")
httpClient := &http.Client{
Timeout: 60 * time.Second,
Transport: &http.Transport{
Proxy: http.ProxyURL(proxy),
TLSClientConfig: &tls.Config{InsecureSkipVerify: true},
},
}
req, err := http.NewRequest("GET", inputUrl, nil)
resp, err := httpClient.Do(req)
if err != nil {
log.Fatalln(err)
}
defer resp.Body.Close()
body, err := io.ReadAll(resp.Body)
if err != nil {
log.Fatalln(err)
}
log.Println(string(body))
}
Headless: Wait Milliseconds
Some websites take a lot of time to load. If you need to wait a fixed amount of time until everything is loaded, you can define the time in milliseconds with the &wait=10000 parameter, which will wait 10000 milliseconds (10 seconds) before returning the HTML. The maximum wait time is 30 seconds.
Requires javascript rendering (&js_render=true).
package main
import (
"io"
"log"
"net/http"
)
func main() {
client := &http.Client{}
req, err := http.NewRequest("GET", "https://api.zenrows.com/v1/?apikey=YOUR_KEY&url=https%3A%2F%2Fhttpbin.org%2Fanything&js_render=true&wait=10000", nil)
resp, err := client.Do(req)
if err != nil {
log.Fatalln(err)
}
defer resp.Body.Close()
body, err := io.ReadAll(resp.Body)
if err != nil {
log.Fatalln(err)
}
log.Println(string(body))
}
package main
import (
"crypto/tls"
"io"
"log"
"net/http"
"net/url"
"time"
)
func main() {
inputUrl := "https://httpbin.org/anything"
proxy, _ := url.Parse("http://YOUR_KEY:js_render=true&wait=10000@proxy.zenrows.com:8001")
httpClient := &http.Client{
Timeout: 60 * time.Second,
Transport: &http.Transport{
Proxy: http.ProxyURL(proxy),
TLSClientConfig: &tls.Config{InsecureSkipVerify: true},
},
}
req, err := http.NewRequest("GET", inputUrl, nil)
resp, err := httpClient.Do(req)
if err != nil {
log.Fatalln(err)
}
defer resp.Body.Close()
body, err := io.ReadAll(resp.Body)
if err != nil {
log.Fatalln(err)
}
log.Println(string(body))
}
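Since the wait value is given in milliseconds and capped at 30 seconds, it can help to derive it from a `time.Duration` and reject out-of-range values before sending the request. A minimal sketch; `waitParam` and `maxWait` are hypothetical names, not part of the API.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
	"time"
)

// maxWait mirrors the 30-second limit stated above.
const maxWait = 30 * time.Second

// waitParam is a hypothetical helper that converts a duration into the
// millisecond string expected by the wait parameter.
func waitParam(d time.Duration) (string, error) {
	if d < 0 || d > maxWait {
		return "", errors.New("wait must be between 0 and 30 seconds")
	}
	return strconv.FormatInt(d.Milliseconds(), 10), nil
}

func main() {
	v, err := waitParam(10 * time.Second)
	if err != nil {
		panic(err)
	}
	fmt.Println("&wait=" + v) // &wait=10000
}
```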
Headless: Block Resources
Many websites load dozens of resources, delaying the HTML response. You can block specific resources from loading using the &block_resources=image parameter.
The ZenRows API lets you block the following resources: stylesheet, image, media, font, script, texttrack, xhr, fetch, eventsource, websocket, manifest, other. Separate them with commas to block multiple resources.
ZenRows will block certain resources by default, such as stylesheets or images, to speed up your scraping. You can disable blocking by setting it to "none": block_resources=none.
Requires javascript rendering (&js_render=true).
package main
import (
"io"
"log"
"net/http"
)
func main() {
client := &http.Client{}
req, err := http.NewRequest("GET", "https://api.zenrows.com/v1/?apikey=YOUR_KEY&url=https%3A%2F%2Fhttpbin.org%2Fanything&js_render=true&block_resources=image%2Cmedia%2Cfont", nil)
resp, err := client.Do(req)
if err != nil {
log.Fatalln(err)
}
defer resp.Body.Close()
body, err := io.ReadAll(resp.Body)
if err != nil {
log.Fatalln(err)
}
log.Println(string(body))
}
package main
import (
"crypto/tls"
"io"
"log"
"net/http"
"net/url"
"time"
)
func main() {
inputUrl := "https://httpbin.org/anything"
proxy, _ := url.Parse("http://YOUR_KEY:js_render=true&block_resources=image%2Cmedia%2Cfont@proxy.zenrows.com:8001")
httpClient := &http.Client{
Timeout: 60 * time.Second,
Transport: &http.Transport{
Proxy: http.ProxyURL(proxy),
TLSClientConfig: &tls.Config{InsecureSkipVerify: true},
},
}
req, err := http.NewRequest("GET", inputUrl, nil)
resp, err := httpClient.Do(req)
if err != nil {
log.Fatalln(err)
}
defer resp.Body.Close()
body, err := io.ReadAll(resp.Body)
if err != nil {
log.Fatalln(err)
}
log.Println(string(body))
}
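The comma-separated value can be built from a list and checked against the resource types named above before the request is sent. This is a sketch under that assumption; `blockResourcesParam` is a hypothetical helper, and it does not handle the special value "none".

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// allowed lists the resource types named in the text above.
var allowed = map[string]bool{
	"stylesheet": true, "image": true, "media": true, "font": true,
	"script": true, "texttrack": true, "xhr": true, "fetch": true,
	"eventsource": true, "websocket": true, "manifest": true, "other": true,
}

// blockResourcesParam is a hypothetical helper that validates the given
// resource types, joins them with commas and percent-encodes the result.
func blockResourcesParam(types ...string) (string, error) {
	for _, t := range types {
		if !allowed[t] {
			return "", fmt.Errorf("unknown resource type %q", t)
		}
	}
	return url.QueryEscape(strings.Join(types, ",")), nil
}

func main() {
	v, err := blockResourcesParam("image", "media", "font")
	if err != nil {
		panic(err)
	}
	fmt.Println("&block_resources=" + v) // &block_resources=image%2Cmedia%2Cfont
}
```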
JSON Response
Adding &json_response=true returns the content in JSON, including XHR or Fetch requests. The JSON object has two fields: html and xhr.
- HTML will contain the content of the page. You'll have to decode it since it will be encoded in JSON.
- XHR will be an array with one object per performed request. Each object contains the URL, body, status code and more. See the example below.
Requires javascript rendering (&js_render=true).
package main
import (
"io"
"log"
"net/http"
)
func main() {
client := &http.Client{}
req, err := http.NewRequest("GET", "https://api.zenrows.com/v1/?apikey=YOUR_KEY&url=https%3A%2F%2Fhttpbin.org%2Fanything&js_render=true&json_response=true", nil)
resp, err := client.Do(req)
if err != nil {
log.Fatalln(err)
}
defer resp.Body.Close()
body, err := io.ReadAll(resp.Body)
if err != nil {
log.Fatalln(err)
}
log.Println(string(body))
}
package main
import (
"crypto/tls"
"io"
"log"
"net/http"
"net/url"
"time"
)
func main() {
inputUrl := "https://httpbin.org/anything"
proxy, _ := url.Parse("http://YOUR_KEY:js_render=true&json_response=true@proxy.zenrows.com:8001")
httpClient := &http.Client{
Timeout: 60 * time.Second,
Transport: &http.Transport{
Proxy: http.ProxyURL(proxy),
TLSClientConfig: &tls.Config{InsecureSkipVerify: true},
},
}
req, err := http.NewRequest("GET", inputUrl, nil)
resp, err := httpClient.Do(req)
if err != nil {
log.Fatalln(err)
}
defer resp.Body.Close()
body, err := io.ReadAll(resp.Body)
if err != nil {
log.Fatalln(err)
}
log.Println(string(body))
}
And the response will look like this:
{
"html": "<!DOCTYPE html><html>...</html>",
"xhr": [{
"url": "https://www.example.com/fetch",
"body": "{\"success\": true}\n",
"status_code": 200,
"method": "GET",
"headers": {
"content-encoding": "gzip",
// ...
},
"request_headers": {
"accept": "*/*",
// ...
}
}]
}
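Decoding this response in Go is a matter of mapping the fields onto a struct. The sketch below covers only the fields visible in the sample above and assumes that header values are plain strings; the type and function names are illustrative. The embedded sample is a trimmed, valid-JSON version of the response, without the `// ...` placeholders.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// xhrEntry holds a subset of the fields shown in the sample response.
type xhrEntry struct {
	URL        string            `json:"url"`
	Body       string            `json:"body"`
	StatusCode int               `json:"status_code"`
	Method     string            `json:"method"`
	Headers    map[string]string `json:"headers"` // assumption: string values
}

// jsonResponse mirrors the two top-level fields, html and xhr.
type jsonResponse struct {
	HTML string     `json:"html"`
	XHR  []xhrEntry `json:"xhr"`
}

// parseJSONResponse decodes a json_response body into jsonResponse.
func parseJSONResponse(data []byte) (jsonResponse, error) {
	var r jsonResponse
	err := json.Unmarshal(data, &r)
	return r, err
}

// sample is a trimmed copy of the response shown above.
const sample = `{"html":"<!DOCTYPE html><html>...</html>","xhr":[{"url":"https://www.example.com/fetch","body":"{\"success\": true}\n","status_code":200,"method":"GET","headers":{"content-encoding":"gzip"}}]}`

func main() {
	r, err := parseJSONResponse([]byte(sample))
	if err != nil {
		panic(err)
	}
	for _, x := range r.XHR {
		fmt.Println(x.Method, x.URL, x.StatusCode)
	}
}
```

In real use, `data` would be the bytes read from `resp.Body` in the examples above.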
Headless: Window Width/Height
If you need to change the browser's window width and height, you can use the &window_width=1920 and &window_height=1080 parameters.
Requires javascript rendering (&js_render=true).
package main
import (
"io"
"log"
"net/http"
)
func main() {
client := &http.Client{}
req, err := http.NewRequest("GET", "https://api.zenrows.com/v1/?apikey=YOUR_KEY&url=https%3A%2F%2Fhttpbin.org%2Fanything&js_render=true&window_width=1920&window_height=1080", nil)
resp, err := client.Do(req)
if err != nil {
log.Fatalln(err)
}
defer resp.Body.Close()
body, err := io.ReadAll(resp.Body)
if err != nil {
log.Fatalln(err)
}
log.Println(string(body))
}
package main
import (
"crypto/tls"
"io"
"log"
"net/http"
"net/url"
"time"
)
func main() {
inputUrl := "https://httpbin.org/anything"
proxy, _ := url.Parse("http://YOUR_KEY:js_render=true&window_width=1920&window_height=1080@proxy.zenrows.com:8001")
httpClient := &http.Client{
Timeout: 60 * time.Second,
Transport: &http.Transport{
Proxy: http.ProxyURL(proxy),
TLSClientConfig: &tls.Config{InsecureSkipVerify: true},
},
}
req, err := http.NewRequest("GET", inputUrl, nil)
resp, err := httpClient.Do(req)
if err != nil {
log.Fatalln(err)
}
defer resp.Body.Close()
body, err := io.ReadAll(resp.Body)
if err != nil {
log.Fatalln(err)
}
log.Println(string(body))
}
Premium Proxies
Some websites are harder to scrape and block datacenter IPs. Premium Proxies come in handy to solve this problem. As the name suggests, these proxies come straight from ISPs.
You can easily use Premium Proxies by adding &premium_proxy=true to the request. This request costs 10 credits.
package main
import (
"io"
"log"
"net/http"
)
func main() {
client := &http.Client{}
req, err := http.NewRequest("GET", "https://api.zenrows.com/v1/?apikey=YOUR_KEY&url=https%3A%2F%2Fhttpbin.org%2Fanything&premium_proxy=true", nil)
resp, err := client.Do(req)
if err != nil {
log.Fatalln(err)
}
defer resp.Body.Close()
body, err := io.ReadAll(resp.Body)
if err != nil {
log.Fatalln(err)
}
log.Println(string(body))
}
package main
import (
"crypto/tls"
"io"
"log"
"net/http"
"net/url"
"time"
)
func main() {
inputUrl := "https://httpbin.org/anything"
proxy, _ := url.Parse("http://YOUR_KEY:premium_proxy=true@proxy.zenrows.com:8001")
httpClient := &http.Client{
Timeout: 60 * time.Second,
Transport: &http.Transport{
Proxy: http.ProxyURL(proxy),
TLSClientConfig: &tls.Config{InsecureSkipVerify: true},
},
}
req, err := http.NewRequest("GET", inputUrl, nil)
resp, err := httpClient.Do(req)
if err != nil {
log.Fatalln(err)
}
defer resp.Body.Close()
body, err := io.ReadAll(resp.Body)
if err != nil {
log.Fatalln(err)
}
log.Println(string(body))
}
Geolocation
Some content is specific to a region. In these cases, you may want to make your request from a given country.
You only need to add &premium_proxy=true&proxy_country=us to the request. Geolocation requires Premium Proxies enabled (it costs 10-25 credits).
package main
import (
"io"
"log"
"net/http"
)
func main() {
client := &http.Client{}
req, err := http.NewRequest("GET", "https://api.zenrows.com/v1/?apikey=YOUR_KEY&url=https%3A%2F%2Fhttpbin.org%2Fanything&premium_proxy=true&proxy_country=us", nil)
resp, err := client.Do(req)
if err != nil {
log.Fatalln(err)
}
defer resp.Body.Close()
body, err := io.ReadAll(resp.Body)
if err != nil {
log.Fatalln(err)
}
log.Println(string(body))
}
package main
import (
"crypto/tls"
"io"
"log"
"net/http"
"net/url"
"time"
)
func main() {
inputUrl := "https://httpbin.org/anything"
proxy, _ := url.Parse("http://YOUR_KEY:premium_proxy=true&proxy_country=us@proxy.zenrows.com:8001")
httpClient := &http.Client{
Timeout: 60 * time.Second,
Transport: &http.Transport{
Proxy: http.ProxyURL(proxy),
TLSClientConfig: &tls.Config{InsecureSkipVerify: true},
},
}
req, err := http.NewRequest("GET", inputUrl, nil)
resp, err := httpClient.Do(req)
if err != nil {
log.Fatalln(err)
}
defer resp.Body.Close()
body, err := io.ReadAll(resp.Body)
if err != nil {
log.Fatalln(err)
}
log.Println(string(body))
}
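Because proxy_country only works together with premium_proxy, a small helper can guarantee that both are always sent as a pair. This is a sketch, assuming the endpoint and parameter names used in the examples above; `buildGeoURL` is a hypothetical name, and the code only builds the URL.

```go
package main

import (
	"fmt"
	"net/url"
)

// buildGeoURL is a hypothetical helper. When a country is given it sets
// premium_proxy=true alongside proxy_country, since geolocation requires
// Premium Proxies. With an empty country neither parameter is added.
func buildGeoURL(apiKey, target, country string) string {
	params := url.Values{}
	params.Set("apikey", apiKey)
	params.Set("url", target)
	if country != "" {
		params.Set("premium_proxy", "true")
		params.Set("proxy_country", country)
	}
	return "https://api.zenrows.com/v1/?" + params.Encode()
}

func main() {
	fmt.Println(buildGeoURL("YOUR_KEY", "https://httpbin.org/anything", "us"))
}
```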
Custom Headers
Custom Headers come in handy when you need to add your own headers (user agents, cookies, referrer, etc.) to the request.
You can enable Custom Headers by adding &custom_headers=true to the request.
package main
import (
"io"
"log"
"net/http"
)
func main() {
client := &http.Client{}
req, err := http.NewRequest("GET", "https://api.zenrows.com/v1/?apikey=YOUR_KEY&url=https%3A%2F%2Fhttpbin.org%2Fanything&custom_headers=true", nil)
req.Header.Add("Referer", "https://www.google.com") // the HTTP header name is spelled "Referer"
resp, err := client.Do(req)
if err != nil {
log.Fatalln(err)
}
defer resp.Body.Close()
body, err := io.ReadAll(resp.Body)
if err != nil {
log.Fatalln(err)
}
log.Println(string(body))
}
package main
import (
"crypto/tls"
"io"
"log"
"net/http"
"net/url"
"time"
)
func main() {
inputUrl := "https://httpbin.org/anything"
proxy, _ := url.Parse("http://YOUR_KEY:custom_headers=true@proxy.zenrows.com:8001")
httpClient := &http.Client{
Timeout: 60 * time.Second,
Transport: &http.Transport{
Proxy: http.ProxyURL(proxy),
TLSClientConfig: &tls.Config{InsecureSkipVerify: true},
},
}
req, err := http.NewRequest("GET", inputUrl, nil)
req.Header.Add("Referer", "https://www.google.com") // the HTTP header name is spelled "Referer"
resp, err := httpClient.Do(req)
if err != nil {
log.Fatalln(err)
}
defer resp.Body.Close()
body, err := io.ReadAll(resp.Body)
if err != nil {
log.Fatalln(err)
}
log.Println(string(body))
}
Session ID
Use the same IP for each API Request by using &session_id=12345. ZenRows will maintain a session for each ID for 10 minutes.
You will need to keep track of them on your side by storing each Session ID so you can reuse them.
package main
import (
"io"
"log"
"net/http"
)
func main() {
client := &http.Client{}
req, err := http.NewRequest("GET", "https://api.zenrows.com/v1/?apikey=YOUR_KEY&url=https%3A%2F%2Fhttpbin.org%2Fanything&session_id=12345", nil)
resp, err := client.Do(req)
if err != nil {
log.Fatalln(err)
}
defer resp.Body.Close()
body, err := io.ReadAll(resp.Body)
if err != nil {
log.Fatalln(err)
}
log.Println(string(body))
}
package main
import (
"crypto/tls"
"io"
"log"
"net/http"
"net/url"
"time"
)
func main() {
inputUrl := "https://httpbin.org/anything"
proxy, _ := url.Parse("http://YOUR_KEY:session_id=12345@proxy.zenrows.com:8001")
httpClient := &http.Client{
Timeout: 60 * time.Second,
Transport: &http.Transport{
Proxy: http.ProxyURL(proxy),
TLSClientConfig: &tls.Config{InsecureSkipVerify: true},
},
}
req, err := http.NewRequest("GET", inputUrl, nil)
resp, err := httpClient.Do(req)
if err != nil {
log.Fatalln(err)
}
defer resp.Body.Close()
body, err := io.ReadAll(resp.Body)
if err != nil {
log.Fatalln(err)
}
log.Println(string(body))
}
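One way to keep track of Session IDs on your side is a small in-memory registry that reuses an ID while it is still fresh and issues a new one afterwards. The sketch below is illustrative only: the type and method names are hypothetical, and it assumes the 10-minute window starts when the ID is first used, which the text above does not specify.

```go
package main

import (
	"fmt"
	"time"
)

// sessionTTL mirrors the 10-minute session lifetime described above.
const sessionTTL = 10 * time.Minute

// epoch is a fixed reference time used by the demo in main.
var epoch = time.Unix(0, 0)

type sessionEntry struct {
	id      int
	created time.Time
}

// sessionStore is a hypothetical client-side registry that maps a label
// (for example a target site) to the Session ID in use for it.
type sessionStore struct {
	nextID  int
	entries map[string]sessionEntry
}

func newSessionStore() *sessionStore {
	return &sessionStore{nextID: 1, entries: map[string]sessionEntry{}}
}

// idFor returns the stored Session ID for label, or issues a new one when
// none exists or the stored one is at least sessionTTL old.
func (s *sessionStore) idFor(label string, now time.Time) int {
	if e, ok := s.entries[label]; ok && now.Sub(e.created) < sessionTTL {
		return e.id
	}
	id := s.nextID
	s.nextID++
	s.entries[label] = sessionEntry{id: id, created: now}
	return id
}

func main() {
	store := newSessionStore()
	first := store.idFor("httpbin", epoch)
	reused := store.idFor("httpbin", epoch.Add(5*time.Minute))
	renewed := store.idFor("httpbin", epoch.Add(11*time.Minute))
	fmt.Printf("&session_id=%d &session_id=%d &session_id=%d\n", first, reused, renewed)
}
```

The store is not safe for concurrent use; wrap it in a mutex if several goroutines share it.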
Device
If you need to use either desktop or mobile user agents in the headers, you can add the &device=desktop or &device=mobile parameter to the request.
package main
import (
"io"
"log"
"net/http"
)
func main() {
client := &http.Client{}
req, err := http.NewRequest("GET", "https://api.zenrows.com/v1/?apikey=YOUR_KEY&url=https%3A%2F%2Fhttpbin.org%2Fanything&device=desktop", nil)
resp, err := client.Do(req)
if err != nil {
log.Fatalln(err)
}
defer resp.Body.Close()
body, err := io.ReadAll(resp.Body)
if err != nil {
log.Fatalln(err)
}
log.Println(string(body))
}
package main
import (
"crypto/tls"
"io"
"log"
"net/http"
"net/url"
"time"
)
func main() {
inputUrl := "https://httpbin.org/anything"
proxy, _ := url.Parse("http://YOUR_KEY:device=desktop@proxy.zenrows.com:8001")
httpClient := &http.Client{
Timeout: 60 * time.Second,
Transport: &http.Transport{
Proxy: http.ProxyURL(proxy),
TLSClientConfig: &tls.Config{InsecureSkipVerify: true},
},
}
req, err := http.NewRequest("GET", inputUrl, nil)
resp, err := httpClient.Do(req)
if err != nil {
log.Fatalln(err)
}
defer resp.Body.Close()
body, err := io.ReadAll(resp.Body)
if err != nil {
log.Fatalln(err)
}
log.Println(string(body))
}
Original HTTP Code
The ZenRows API returns HTTP status codes depending on the result of the request. If you want it to return the status code provided by the website instead, enable &original_status=true.
package main
import (
"io"
"log"
"net/http"
)
func main() {
client := &http.Client{}
req, err := http.NewRequest("GET", "https://api.zenrows.com/v1/?apikey=YOUR_KEY&url=https%3A%2F%2Fhttpbin.org%2Fanything&original_status=true", nil)
resp, err := client.Do(req)
if err != nil {
log.Fatalln(err)
}
defer resp.Body.Close()
body, err := io.ReadAll(resp.Body)
if err != nil {
log.Fatalln(err)
}
log.Println(string(body))
}
package main
import (
"crypto/tls"
"io"
"log"
"net/http"
"net/url"
"time"
)
func main() {
inputUrl := "https://httpbin.org/anything"
proxy, _ := url.Parse("http://YOUR_KEY:original_status=true@proxy.zenrows.com:8001")
httpClient := &http.Client{
Timeout: 60 * time.Second,
Transport: &http.Transport{
Proxy: http.ProxyURL(proxy),
TLSClientConfig: &tls.Config{InsecureSkipVerify: true},
},
}
req, err := http.NewRequest("GET", inputUrl, nil)
resp, err := httpClient.Do(req)
if err != nil {
log.Fatalln(err)
}
defer resp.Body.Close()
body, err := io.ReadAll(resp.Body)
if err != nil {
log.Fatalln(err)
}
log.Println(string(body))
}
Data Extraction: CSS Selectors
You can use CSS Selectors for data extraction. Add &css_extractor={"links":"a @href"} to the request to use this feature. As the table shows, a selector that matches a single element yields a string, and one that matches several yields an array.
Here are some examples:
extraction rules | sample html | value | json output |
---|---|---|---|
{"divs":"div"} | <div>text0</div> | text | {"divs": "text0"} |
{"divs":"div"} | <div>text1</div><div>text2</div> | text | {"divs": ["text1", "text2"]} |
{"links":"a @href"} | <a href="#register">Register</a> | href attribute | {"links": "#register"} |
{"hidden":"input[type=hidden] @value"} | <input type="hidden" name="_token" value="f23g23g.b9u1bg91g.zv97" /> | value attribute | {"hidden": "f23g23g.b9u1bg91g.zv97"} |
{"class":"button.submit @data-v"} | <button class="submit" data-v="register-user">click</button> | data-v attribute with submit class | {"class": "register-user"} |
{"emails":"a[href^='mailto:'] @href"} | <a href="mailto:[email protected]">email 1</a><a href="mailto:[email protected]">email 2</a> | href attribute for links starting with mailto: | {"emails": ["[email protected]", "[email protected]"]} |
If you are interested in learning more, you can find a complete reference of CSS Selectors here.
package main
import (
"io"
"log"
"net/http"
)
func main() {
client := &http.Client{}
req, err := http.NewRequest("GET", "https://api.zenrows.com/v1/?apikey=YOUR_KEY&url=https%3A%2F%2Fhttpbin.org%2Fanything&css_extractor=%7B%22links%22%3A%22a%20%40href%22%2C%20%22images%22%3A%22img%20%40src%22%7D", nil)
resp, err := client.Do(req)
if err != nil {
log.Fatalln(err)
}
defer resp.Body.Close()
body, err := io.ReadAll(resp.Body)
if err != nil {
log.Fatalln(err)
}
log.Println(string(body))
}
package main
import (
"crypto/tls"
"io"
"log"
"net/http"
"net/url"
"time"
)
func main() {
inputUrl := "https://httpbin.org/anything"
proxy, _ := url.Parse("http://YOUR_KEY:css_extractor=%7B%22links%22%3A%22a%20%40href%22%2C%20%22images%22%3A%22img%20%40src%22%7D@proxy.zenrows.com:8001")
httpClient := &http.Client{
Timeout: 60 * time.Second,
Transport: &http.Transport{
Proxy: http.ProxyURL(proxy),
TLSClientConfig: &tls.Config{InsecureSkipVerify: true},
},
}
req, err := http.NewRequest("GET", inputUrl, nil)
resp, err := httpClient.Do(req)
if err != nil {
log.Fatalln(err)
}
defer resp.Body.Close()
body, err := io.ReadAll(resp.Body)
if err != nil {
log.Fatalln(err)
}
log.Println(string(body))
}
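The extraction rules and their output can both be handled with `encoding/json`. The sketch below builds the encoded css_extractor value from a Go map and decodes a result in which a value may be either a single string or an array, as in the table rows above. All names are hypothetical, and the sample output is assembled from those table rows rather than from a real response.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/url"
)

// extracted accepts either a JSON string or a JSON array of strings and
// always exposes the result as a slice.
type extracted []string

func (e *extracted) UnmarshalJSON(data []byte) error {
	var one string
	if err := json.Unmarshal(data, &one); err == nil {
		*e = extracted{one}
		return nil
	}
	var many []string
	if err := json.Unmarshal(data, &many); err != nil {
		return err
	}
	*e = many
	return nil
}

// cssExtractorParam is a hypothetical helper that stringifies the rules
// and percent-encodes them for the css_extractor parameter.
func cssExtractorParam(rules map[string]string) (string, error) {
	raw, err := json.Marshal(rules)
	if err != nil {
		return "", err
	}
	return url.QueryEscape(string(raw)), nil
}

// parseExtraction decodes the JSON output keyed by rule name.
func parseExtraction(data []byte) (map[string]extracted, error) {
	var out map[string]extracted
	err := json.Unmarshal(data, &out)
	return out, err
}

func main() {
	param, err := cssExtractorParam(map[string]string{"links": "a @href"})
	if err != nil {
		panic(err)
	}
	fmt.Println("&css_extractor=" + param)

	out, err := parseExtraction([]byte(`{"divs": ["text1", "text2"], "links": "#register"}`))
	if err != nil {
		panic(err)
	}
	fmt.Println(len(out["divs"]), len(out["links"]))
}
```

`url.QueryEscape` encodes a space as `+` rather than `%20`; both forms decode to a space in a query string.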
Data Extraction: Auto Parsing
ZenRows API will return the HTML of the URL by default. Enabling Autoparse uses our extraction algorithms to parse data in JSON format automatically.
Add &autoparse=true to the request for this feature.
package main
import (
"io"
"log"
"net/http"
)
func main() {
client := &http.Client{}
req, err := http.NewRequest("GET", "https://api.zenrows.com/v1/?apikey=YOUR_KEY&url=https%3A%2F%2Fwww.amazon.com%2Fdp%2FB01LD5GO7I%2F&autoparse=true", nil)
resp, err := client.Do(req)
if err != nil {
log.Fatalln(err)
}
defer resp.Body.Close()
body, err := io.ReadAll(resp.Body)
if err != nil {
log.Fatalln(err)
}
log.Println(string(body))
}
package main
import (
"crypto/tls"
"io"
"log"
"net/http"
"net/url"
"time"
)
func main() {
inputUrl := "https://www.amazon.com/dp/B01LD5GO7I/"
proxy, _ := url.Parse("http://YOUR_KEY:autoparse=true@proxy.zenrows.com:8001")
httpClient := &http.Client{
Timeout: 60 * time.Second,
Transport: &http.Transport{
Proxy: http.ProxyURL(proxy),
TLSClientConfig: &tls.Config{InsecureSkipVerify: true},
},
}
req, err := http.NewRequest("GET", inputUrl, nil)
resp, err := httpClient.Do(req)
if err != nil {
log.Fatalln(err)
}
defer resp.Body.Close()
body, err := io.ReadAll(resp.Body)
if err != nil {
log.Fatalln(err)
}
log.Println(string(body))
}
POST / PUT Requests
Send POST / PUT requests as usual with your chosen language. ZenRows will transparently forward the data to the target site.
The return value will be the original response's content. Headers and cookies will also be part of the response. How you access them depends on how you call ZenRows.
package main
import (
"io"
"log"
"net/http"
"net/url"
"strings"
)
func main() {
form := url.Values{}
form.Set("key1", "value1")
form.Set("key2", "value2")
client := &http.Client{}
req, err := http.NewRequest("POST", "https://api.zenrows.com/v1/?apikey=YOUR_KEY&url=https%3A%2F%2Fhttpbin.org%2Fanything", strings.NewReader(form.Encode()))
req.Header.Add("Content-Type", "application/x-www-form-urlencoded")
resp, err := client.Do(req)
if err != nil {
log.Fatalln(err)
}
defer resp.Body.Close()
body, err := io.ReadAll(resp.Body)
if err != nil {
log.Fatalln(err)
}
log.Println(string(body))
}
package main
import (
"crypto/tls"
"io"
"log"
"net/http"
"net/url"
"time"
"strings"
)
func main() {
form := url.Values{}
form.Set("key1", "value1")
form.Set("key2", "value2")
inputUrl := "https://httpbin.org/anything"
proxy, _ := url.Parse("http://YOUR_KEY:@proxy.zenrows.com:8001")
httpClient := &http.Client{
Timeout: 60 * time.Second,
Transport: &http.Transport{
Proxy: http.ProxyURL(proxy),
TLSClientConfig: &tls.Config{InsecureSkipVerify: true},
},
}
req, err := http.NewRequest("POST", inputUrl, strings.NewReader(form.Encode()))
req.Header.Add("Content-Type", "application/x-www-form-urlencoded")
resp, err := httpClient.Do(req)
if err != nil {
log.Fatalln(err)
}
defer resp.Body.Close()
body, err := io.ReadAll(resp.Body)
if err != nil {
log.Fatalln(err)
}
log.Println(string(body))
}
Download Files and Pictures
ZenRows will download images, PDFs or any type of file. Instead of reading the response's content as text, you can store it directly in a file.
There is a size limit and we don't recommend using ZenRows to download big files.
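Storing the response directly in a file can be sketched in Go by streaming the body with `io.Copy` instead of reading it into a string. In the example below, `saveBody` and `demo` are hypothetical names, and a `strings.Reader` stands in for `resp.Body` so that the code runs without a network call.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"path/filepath"
	"strings"
)

// saveBody streams r into a new file at path and returns the number of
// bytes written. In real use, r would be resp.Body.
func saveBody(r io.Reader, path string) (int64, error) {
	f, err := os.Create(path)
	if err != nil {
		return 0, err
	}
	n, err := io.Copy(f, r)
	if cerr := f.Close(); err == nil {
		err = cerr // report a failed close only if the copy succeeded
	}
	return n, err
}

// demo writes a stand-in body to a temporary file and reads it back.
func demo() (int64, string, error) {
	dir, err := os.MkdirTemp("", "download-demo")
	if err != nil {
		return 0, "", err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "file.bin")
	n, err := saveBody(strings.NewReader("fake file bytes"), path)
	if err != nil {
		return 0, "", err
	}
	data, err := os.ReadFile(path)
	return n, string(data), err
}

func main() {
	n, content, err := demo()
	if err != nil {
		panic(err)
	}
	fmt.Println(n, content)
}
```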
Credits Usage
Check credits consumption programmatically by calling the endpoint /usage. Usage calls will not count for concurrency, and results are available in real-time.
package main
import (
"io"
"log"
"net/http"
)
func main() {
client := &http.Client{}
req, err := http.NewRequest("GET", "https://api.zenrows.com/v1/usage?apikey=YOUR_KEY", nil)
resp, err := client.Do(req)
if err != nil {
log.Fatalln(err)
}
defer resp.Body.Close()
body, err := io.ReadAll(resp.Body)
if err != nil {
log.Fatalln(err)
}
log.Println(string(body))
}
API Key required
To access API functionality, you need to have a valid API Key. This unique key will keep all your requests authorized.
Start using the API by creating your API Key now.
# gem install faraday
require 'faraday'
url = URI.parse('https://api.zenrows.com/v1/?apikey=YOUR_KEY&url=https%3A%2F%2Fhttpbin.org%2Fanything')
conn = Faraday.new()
conn.options.timeout = 180
res = conn.get(url, nil, nil)
print(res.body)
# gem install faraday
require 'faraday'
url = 'https://httpbin.org/anything'
proxy = 'http://YOUR_KEY:@proxy.zenrows.com:8001'
conn = Faraday.new(proxy: proxy, ssl: {verify: false})
conn.options.timeout = 180
res = conn.get(url, nil, nil)
print(res.body)
URL required
The URL is the page you want to scrape. It needs to be encoded.
# gem install faraday
require 'faraday'
url = URI.parse('https://api.zenrows.com/v1/?apikey=YOUR_KEY&url=https%3A%2F%2Fhttpbin.org%2Fanything')
conn = Faraday.new()
conn.options.timeout = 180
res = conn.get(url, nil, nil)
print(res.body)
# gem install faraday
require 'faraday'
url = 'https://httpbin.org/anything'
proxy = 'http://YOUR_KEY:@proxy.zenrows.com:8001'
conn = Faraday.new(proxy: proxy, ssl: {verify: false})
conn.options.timeout = 180
res = conn.get(url, nil, nil)
print(res.body)