Dart Web Scraping: Tutorial 2024

May 21, 2024 · 12 min read

Considering Dart for your next web scraping project? Good idea. The language's CLI scripting capabilities and intuitive syntax make it an excellent tool for the job.

This step-by-step tutorial will guide you through building a complete web scraping script in Dart, using http, html, and puppeteer.

Let's dive in!

Is Dart Good for Web Scraping?

Yes, you can scrape data from web pages with Dart!

When it comes to web scraping, most developers consider Python or JavaScript the obvious choice thanks to their huge communities. While Dart may not be the best language for web scraping, it's still a fantastic choice for at least three reasons:

  1. It's a rising language developed and endorsed by Google.
  2. It has an intuitive, concise, and easy-to-understand syntax, which is excellent for scripting.
  3. It features a complete standard API and several high-quality external libraries for web development.

Thanks to its ease of use and rich ecosystem, Dart is more than just a viable option for web scraping!

Prerequisites

Prepare your Dart environment for web scraping with the http and html packages.

Install Dart

To use Dart locally, you need to install the Dart SDK. The recommended installation procedure on the official site is using a package manager.

On Windows, install Dart through Chocolatey with this command in an elevated terminal:

Terminal
choco install dart-sdk

The procedure is slightly longer on macOS and Linux. For more details, follow the official installation guide.
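
For example, on macOS you can install the SDK through Homebrew (assuming Homebrew is already installed):

Terminal
brew tap dart-lang/dart
brew install dart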

Run the command below to make sure Dart is working:

Terminal
dart --version

This should produce output similar to:

Output
Dart SDK version: 3.3.3 (stable) (Tue Mar 26 14:21:33 2024 +0000) on "windows_x64"

Awesome! Dart is ready to use.

Create Your Dart Project

Launch the dart create command to initialize a Dart CLI project called web_scraper:

Terminal
dart create web_scraper

The web_scraper folder will now contain your Dart web scraping project.

Load your project in a Dart IDE. Visual Studio Code with the Dart extension is a great choice.

Take a look at the web_scraper.dart file in the /bin folder:

web_scraper.dart
import 'package:web_scraper/web_scraper.dart' as web_scraper;

void main(List<String> arguments) {
  print('Hello world: ${web_scraper.calculate()}!');
}

This is the entry point of your Dart project. As you can see, it imports the web_scraper library from the web_scraper.dart file in the /lib folder. Open that file, and you'll see:

web_scraper.dart
int calculate() {
  return 6 * 7;
}

In the Dart project:

  • /bin is the folder for the public entry points that get compiled to executable binaries.
  • /lib is the folder that contains all the rest of the code.

So, the web_scraper.dart file in the /lib folder will contain the scraping logic. Then, the web_scraper.dart file in the /bin folder will import and run it.
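
For reference, here's a simplified view of the layout generated by dart create (other generated files are omitted):

web_scraper/
├── bin/
│   └── web_scraper.dart    # public entry point
├── lib/
│   └── web_scraper.dart    # scraping logic
└── pubspec.yaml            # project metadata and dependencies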

Run the Dart application using this command:

Terminal
dart run

The command will print some build logs followed by the desired output:

Output
Hello world: 42!

Well done! Follow the next section to turn this project into a Dart web scraping application.


How to Do Web Scraping With Dart

In this guided section, you’ll build a Dart scraper to extract all product data from a site. The scraping target will be ScrapeMe, an e-commerce platform with a paginated list of Pokémon products:

[Image: the ScrapeMe demo page]

Get ready to perform web scraping in Dart!

Step 1: Scrape by Requesting Your Target Page

The easiest way to connect to a web page and retrieve its HTML source code is to use an HTTP client. http is Dart's most popular HTTP client library. Add it to your project's dependencies with the following command:

Terminal
dart pub add http

Open the pubspec.yaml file in the root folder of your project. Under the dependencies section, you'll see:

pubspec.yaml
dependencies:
  http: ^1.2.1

Import http in the web_scraper.dart file in /lib and define an async scrape() function in which to use it:

web_scraper.dart
import 'package:http/http.dart' as http;

Future scrape() async {
  // ...
}

Use Uri.parse() to create a Uri object for your target page. Pass it to the http.get() method to make a GET request to the specified page. Then, retrieve the HTML document from the server response and print it:

web_scraper.dart
import 'package:http/http.dart' as http;

Future scrape() async {
  // create a Uri object to the target page
  final pageUri = Uri.parse('https://scrapeme.live/shop/');

  // perform a GET request to the target page
  var response = await http.get(pageUri);

  // retrieve the HTML from the server
  // response and print it
  final html = response.body;
  print(html);
}

Turn main() in the /bin/web_scraper.dart file into an async function and call scrape():

/bin/web_scraper.dart
import 'package:web_scraper/web_scraper.dart' as web_scraper;

void main(List<String> arguments) async {
  await web_scraper.scrape();
}

Run your Dart web scraping script, and it’ll print:

Output
<!doctype html>
<html lang="en-GB">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1, maximum-scale=2.0">
<link rel="profile" href="http://gmpg.org/xfn/11">
<link rel="pingback" href="https://scrapeme.live/xmlrpc.php">
<!-- Omitted for brevity... -->

Fantastic! Your script can retrieve the target page. Now it's time to extract some data.

Step 2: Extract Data From One Element

To scrape data from a webpage, you must parse its HTML content with an HTML parser. html is a powerful Dart HTML parser with a rich API for DOM traversal and manipulation. Install it with this command:

Terminal
dart pub add html

Import it by adding the following line on top of web_scraper.dart in /lib:

web_scraper.dart
import 'package:html/parser.dart' as html_parser;

Next, feed the HTML content of the page to the parse() function to get a Document object. This contains all the methods required to select the nodes on the page and perform web scraping in Dart:

web_scraper.dart
final document = html_parser.parse(html);

You now need to define an effective HTML node selection strategy. The idea is to select the HTML elements of interest from the DOM and retrieve data from them. To tackle this task, you have to inspect the web page's HTML source code.

Visit the target page of your script in the browser and inspect a product HTML node with the DevTools:

[Image: DevTools inspection of a product HTML node]

Expand the HTML code. You can select the product node with the CSS selector below:

Example
li.product

li is the tag of the product HTML element, while product is its class.

Given a product node, you can find the following:

  • The URL in an <a> node.
  • The image URL in an <img> node.
  • The name in a <h2> node.
  • The price in a <span> node.

You have all the information you need to implement the web scraping logic. Use the querySelector() method to apply a CSS selector on the page. Then, extract the data of interest from the selected node:

web_scraper.dart
// select the first product HTML element on the page
final productHTMLElement = document.querySelector('li.product');

// scraping logic
final url = productHTMLElement?.querySelector('a')?.attributes['href'];
final image = productHTMLElement?.querySelector('img')?.attributes['src'];
final name = productHTMLElement?.querySelector('h2')?.text;
final price = productHTMLElement?.querySelector('span')?.text;

The text attribute contains the text nested in the element. attributes returns a map with the name-value pairs for the HTML attributes in the node.
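
For instance, given a hypothetical node such as <a href="https://example.com">Bulbasaur</a>, this is what the two would return:

web_scraper.dart
// hypothetical node: <a href="https://example.com">Bulbasaur</a>
// element.text               -> 'Bulbasaur'
// element.attributes['href'] -> 'https://example.com'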

Log the scraped data with some print() instructions:

web_scraper.dart
print(url);
print(image);
print(name);
print(price);

Your web_scraper.dart file in /lib will now contain:

web_scraper.dart
import 'package:http/http.dart' as http;
import 'package:html/parser.dart' as html_parser;

Future scrape() async {
  // create a Uri object to the target page
  final pageUri = Uri.parse('https://scrapeme.live/shop/');

  // perform a GET request to the target page
  var response = await http.get(pageUri);

  // retrieve the HTML from the server
  final html = response.body;
  // parse the HTML document
  final document = html_parser.parse(html);

  // select the first product HTML element on the page
  final productHTMLElement = document.querySelector('li.product');

  // scraping logic
  final url = productHTMLElement?.querySelector('a')?.attributes['href'];
  final image = productHTMLElement?.querySelector('img')?.attributes['src'];
  final name = productHTMLElement?.querySelector('h2')?.text;
  final price = productHTMLElement?.querySelector('span')?.text;

  // print the scraped data
  print(url);
  print(image);
  print(name);
  print(price);
}

Launch it, and it'll produce this output:

Output
https://scrapeme.live/shop/Bulbasaur/
https://scrapeme.live/wp-content/uploads/2018/08/001-350x350.png
Bulbasaur
£63.00

Wonderful! The scraping logic works like a charm. Now, let's learn how to scrape all the products on the page.

Step 3: Extract Data From All Elements

Before extending the script, you need a data structure representing the scraped data.

Define a new class called Product on top of the web_scraper.dart file in /lib:

web_scraper.dart
class Product {
  String? url;
  String? image;
  String? name;
  String? price;

  Product(this.url, this.image, this.name, this.price);
}
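
Optionally, you could also override toString() to make Product instances easier to inspect with print(). This isn't required by the tutorial; here's a minimal sketch to add inside the class:

web_scraper.dart
// optional: easier debugging via print(product)
@override
String toString() =>
    'Product(url: $url, image: $image, name: $name, price: $price)';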

In the scrape() function, initialize an empty list of Product objects. This is where you'll store the objects populated with the data collected from the page:

web_scraper.dart
final List<Product> products = [];

Now, use querySelectorAll() instead of querySelector() to select all product nodes. Iterate over them, apply the scraping logic, instantiate a Product object, and add it to the list:

web_scraper.dart
final productHTMLElements = document.querySelectorAll('li.product');

// iterate over the product nodes and apply
// the scraping logic
for (final productHTMLElement in productHTMLElements) {
  // scraping logic
  final url = productHTMLElement.querySelector('a')?.attributes['href'];
  final image = productHTMLElement.querySelector('img')?.attributes['src'];
  final name = productHTMLElement.querySelector('h2')?.text;
  final price = productHTMLElement.querySelector('span')?.text;

  // instantiate a Product object
  // and add it to the list
  final product = Product(url, image, name, price);
  products.add(product);
}

Print the scraped data to make sure the web scraping Dart logic works as intended:

web_scraper.dart
for (final product in products) {
  print(product.url);
  print(product.image);
  print(product.name);
  print(product.price);
  print('');
}

Your /lib/web_scraper.dart script will now contain:

web_scraper.dart
import 'package:http/http.dart' as http;
import 'package:html/parser.dart' as html_parser;

// representation of the product object to
// scrape from the page
class Product {
  String? url;
  String? image;
  String? name;
  String? price;

  Product(this.url, this.image, this.name, this.price);
}

Future scrape() async {
  // create a Uri object to the target page
  final pageUri = Uri.parse('https://scrapeme.live/shop/');

  // perform a GET request to the target page
  var response = await http.get(pageUri);

  // retrieve the HTML from the server
  final html = response.body;
  // parse the HTML document
  final document = html_parser.parse(html);

  // where to store the scraped data
  final List<Product> products = [];

  // select the product HTML elements on the page
  final productHTMLElements = document.querySelectorAll('li.product');

  // iterate over the product nodes and apply
  // the scraping logic
  for (final productHTMLElement in productHTMLElements) {
    // scraping logic
    final url = productHTMLElement.querySelector('a')?.attributes['href'];
    final image = productHTMLElement.querySelector('img')?.attributes['src'];
    final name = productHTMLElement.querySelector('h2')?.text;
    final price = productHTMLElement.querySelector('span')?.text;

    // instantiate a Product object
    // and add it to the list
    final product = Product(url, image, name, price);
    products.add(product);
  }

  // print the scraped data
  for (final product in products) {
    print(product.url);
    print(product.image);
    print(product.name);
    print(product.price);
    print('');
  }
} 

Run it, and it'll return:

Output
https://scrapeme.live/shop/Bulbasaur/
https://scrapeme.live/wp-content/uploads/2018/08/001-350x350.png
Bulbasaur
£63.00

// omitted for brevity...

https://scrapeme.live/shop/Pidgey/
https://scrapeme.live/wp-content/uploads/2018/08/016-350x350.png
Pidgey
£159.00

There you go! The scraped objects match the products on the page and contain the desired data.

Step 4: Export Your Data to a CSV File

The most straightforward way to convert the collected data to CSV format is the csv package. It provides a comprehensive API to convert a list of rows to a CSV string and vice versa.

Install csv in your Dart project:

Terminal
dart pub add csv

Then, import it in the /lib/web_scraper.dart file. You'll also need to import the Dart io library:

web_scraper.dart
import 'package:csv/csv.dart' as csv;
import 'dart:io';

Transform each Product object in products into a list of strings. Pass the resulting list to the convert() method of ListToCsvConverter to get a CSV string. Then, create a products.csv file and populate it with that string using writeAsStringSync():

web_scraper.dart
// convert the scraped products to a
// list of lists of strings
final List<List<String?>> productStrings = products
    .map((product) =>
        [product.url, product.image, product.name, product.price])
    .toList();
// prepend the header row
productStrings.insert(0, ['url', 'image', 'name', 'price']);

// convert to CSV format
final csvContent = const csv.ListToCsvConverter().convert(productStrings);

// export the CSV string to a file
final file = File('products.csv');
file.writeAsStringSync(csvContent);
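
As a side note, the csv package can also parse CSV back into rows through CsvToListConverter. Here's a minimal sketch of a hypothetical verifyCsv() helper you could add to the same file (it relies on the imports above) to double-check the exported data:

web_scraper.dart
// hypothetical helper to verify the exported file
void verifyCsv() {
  // read the exported file back in
  final content = File('products.csv').readAsStringSync();

  // parse the CSV string into a list of rows
  final rows = const csv.CsvToListConverter().convert(content);

  // print the header row and the first data row
  print(rows[0]);
  print(rows[1]);
}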

Put it all together, and you'll get:

web_scraper.dart
import 'package:http/http.dart' as http;
import 'package:html/parser.dart' as html_parser;
import 'package:csv/csv.dart' as csv;
import 'dart:io';

// representation of the product object to
// scrape from the page
class Product {
  String? url;
  String? image;
  String? name;
  String? price;

  Product(this.url, this.image, this.name, this.price);
}

Future scrape() async {
  // create a Uri object to the target page
  final pageUri = Uri.parse('https://scrapeme.live/shop/');

  // perform a GET request to the target page
  var response = await http.get(pageUri);

  // retrieve the HTML from the server
  final html = response.body;
  // parse the HTML document
  final document = html_parser.parse(html);

  // where to store the scraped data
  final List<Product> products = [];

  // select the product HTML elements on the page
  final productHTMLElements = document.querySelectorAll('li.product');

  // iterate over the product nodes and apply
  // the scraping logic
  for (final productHTMLElement in productHTMLElements) {
    // scraping logic
    final url = productHTMLElement.querySelector('a')?.attributes['href'];
    final image = productHTMLElement.querySelector('img')?.attributes['src'];
    final name = productHTMLElement.querySelector('h2')?.text;
    final price = productHTMLElement.querySelector('span')?.text;

    // instantiate a Product object
    // and add it to the list
    final product = Product(url, image, name, price);
    products.add(product);
  }

  // convert the scraped products to a
  // list of lists of strings
  final List<List<String?>> productStrings = products
      .map((product) =>
          [product.url, product.image, product.name, product.price])
      .toList();
  // prepend the header row
  productStrings.insert(0, ['url', 'image', 'name', 'price']);

  // convert to CSV format
  final csvContent = const csv.ListToCsvConverter().convert(productStrings);

  // export the CSV string to a file
  final file = File('products.csv');
  file.writeAsStringSync(csvContent);
}

Launch the Dart web scraping script:

Terminal
dart run

Wait for the script to complete, and a products.csv file will appear in the project's folder. Open it, and you'll see:

[Image: the generated products.csv file]

Et voilà! You’ve just performed web scraping in Dart.

Dart for Advanced Web Scraping

Now that you know the basics, you're ready to dive into more advanced web scraping Dart techniques.

How to Scrape Multiple Pages With Dart

The current CSV file contains 16 records, corresponding to the products on the target site's home page.

To scrape all products on the site, you need to do web crawling, which means discovering web pages as you scrape data. Learn more in our guide on web crawling vs. web scraping.

The steps to implement web crawling are as follows:

  1. Visit a webpage.
  2. Discover new URLs from the pagination HTML links and add them to a queue.
  3. Repeat the cycle on a new page picked from the queue.

This loop stops when the Dart scraping script has visited all pagination pages on the site. As this is just a demo script, limit the pages to crawl to 5. This way, you can speed up the process and avoid making too many requests to the destination server.
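
As an optional extra (not part of the original logic), you could also pause at the end of each iteration of the crawling loop you're about to write, to put less pressure on the server. A one-line sketch:

web_scraper.dart
// optional: wait one second between page requests
await Future.delayed(Duration(seconds: 1));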

You already know how to carry out step 1. You need to learn how to extract URLs from the pagination links. First, inspect these HTML elements on the page:

[Image: DevTools inspection of the pagination link elements]

Notice that you can select each pagination link with the following CSS selector:

Example
a.page-numbers

Adding those links to a queue without extra logic isn't a good approach, as you don't want the script to visit the same page twice. To make the crawling logic more efficient, use these two additional data structures:

  • pagesDiscovered: A Set storing the URLs discovered by the crawling logic.
  • pagesToScrape: A Queue containing the URLs of the pages the script will visit next.

Initialize both with the URL of the first product pagination page:

web_scraper.dart
final firstPageToScrape = 'https://scrapeme.live/shop/page/1/';

final pagesDiscovered = {firstPageToScrape};
final pagesToScrape = Queue<String>();
pagesToScrape.add(firstPageToScrape);

Then, use those data structures in a while loop to implement the Dart crawling logic:

/lib/web_scraper.dart
// counter for the current iteration
var visitedPages = 1;
// max number of pages to visit
final limit = 5;

// until there are no pages to scrape
// or the limit is hit
while (pagesToScrape.isNotEmpty && visitedPages <= limit) {
  // get the next page to scrape
  final currentPage = pagesToScrape.removeFirst();

  // transform the page URL string into a Uri
  final pageUri = Uri.parse(currentPage);

  // perform a GET request to the target page
  var response = await http.get(pageUri);

  // retrieve the HTML from the server
  final html = response.body;
  // parse the HTML document
  final document = html_parser.parse(html);

  // select the pagination links
  final paginationHTMLElements = document.querySelectorAll('a.page-numbers');

  // logic to avoid visiting a page twice
  if (paginationHTMLElements.isNotEmpty) {
    for (final paginationHTMLElement in paginationHTMLElements) {
      // get the current pagination URL
      final newPaginationLink = paginationHTMLElement.attributes['href'];

      if (newPaginationLink != null) {
        // if the page discovered is new
        if (!pagesDiscovered.contains(newPaginationLink)) {
          // if the page discovered needs to be scraped
          if (!pagesToScrape.contains(newPaginationLink)) {
            pagesToScrape.add(newPaginationLink);
          }
          pagesDiscovered.add(newPaginationLink);
        }
      }
    }
  }

  // scraping logic...

  // increment the limit counter
  visitedPages++;
}

Integrate the above snippet into /lib/web_scraper.dart and you'll get:

web_scraper.dart
import 'dart:collection';

import 'package:http/http.dart' as http;
import 'package:html/parser.dart' as html_parser;
import 'package:csv/csv.dart' as csv;
import 'dart:io';

// representation of the product object to
// scrape from the page
class Product {
  String? url;
  String? image;
  String? name;
  String? price;

  Product(this.url, this.image, this.name, this.price);
}

Future scrape() async {
  // where to store the scraped data
  final List<Product> products = [];

  // the URL of the first page
  // to scrape data from
  final firstPageToScrape = 'https://scrapeme.live/shop/page/1/';

  // data structures for web scraping
  final pagesDiscovered = {firstPageToScrape};
  final pagesToScrape = Queue<String>();
  pagesToScrape.add(firstPageToScrape);

  // counter for the current iteration
  var visitedPages = 1;
  // max number of pages to visit
  final limit = 5;

  // until there are no pages to scrape
  // or the limit is hit
  while (pagesToScrape.isNotEmpty && visitedPages <= limit) {
    // get the next page to scrape
    final currentPage = pagesToScrape.removeFirst();

    // transform the page URL string into a Uri
    final pageUri = Uri.parse(currentPage);

    // perform a GET request to the target page
    var response = await http.get(pageUri);

    // retrieve the HTML from the server
    final html = response.body;
    // parse the HTML document
    final document = html_parser.parse(html);

    // select the pagination links
    final paginationHTMLElements = document.querySelectorAll('a.page-numbers');

    // logic to avoid visiting a page twice
    if (paginationHTMLElements.isNotEmpty) {
      for (final paginationHTMLElement in paginationHTMLElements) {
        // get the current pagination URL
        final newPaginationLink = paginationHTMLElement.attributes['href'];

        if (newPaginationLink != null) {
          // if the page discovered is new
          if (!pagesDiscovered.contains(newPaginationLink)) {
            // if the page discovered needs to be scraped
            if (!pagesToScrape.contains(newPaginationLink)) {
              pagesToScrape.add(newPaginationLink);
            }
            pagesDiscovered.add(newPaginationLink);
          }
        }
      }
    }
    
    // select the product HTML elements on the page
    final productHTMLElements = document.querySelectorAll('li.product');

    // iterate over the product nodes and apply
    // the scraping logic
    for (final productHTMLElement in productHTMLElements) {
      // scraping logic
      final url = productHTMLElement.querySelector('a')?.attributes['href'];
      final image = productHTMLElement.querySelector('img')?.attributes['src'];
      final name = productHTMLElement.querySelector('h2')?.text;
      final price = productHTMLElement.querySelector('span')?.text;

      // instantiate a Product object
      // and add it to the list
      final product = Product(url, image, name, price);
      products.add(product);
    }

    // increment the limit counter
    visitedPages++;
  }

  // convert the scraped products to a
  // list of lists of strings
  final List<List<String?>> productStrings = products
      .map((product) =>
          [product.url, product.image, product.name, product.price])
      .toList();
  // prepend the header row
  productStrings.insert(0, ['url', 'image', 'name', 'price']);

  // convert to CSV format
  final csvContent = const csv.ListToCsvConverter().convert(productStrings);

  // export the CSV string to a file
  final file = File('products.csv');
  file.writeAsStringSync(csvContent);
}

Now, run the Dart web scraping script again:

Terminal
dart run

This time, the scraper will scrape data from 5 different product pagination pages. The new CSV file will contain more than 16 records:

[Image: the updated products.csv file]

Congrats! You’ve just learned how to perform web crawling and web scraping in Dart!

Avoid Getting Blocked When Scraping With Dart

Everyone knows how valuable data is, even if it's publicly available on a website. No one wants to give it away for free, and that's why anti-bot technologies have become so popular. The goal of these systems is to detect and block automated scripts, such as yours.

These systems pose the biggest challenge to web scraping with Dart. However, the right techniques will let you scrape without getting blocked.

Two ways of eluding less sophisticated anti-bots are:

  1. Setting a real User Agent header.
  2. Using a proxy to hide your IP.

Follow the instructions below to integrate them into your web-scraping Dart script.

Proxy integration isn't possible with http alone. You must configure the HttpClient class from dart:io and wrap it in http's IOClient.

Get the User Agent string of a real browser and the URL of a proxy server from a site such as Free Proxy List. Then, use them when targeting sites protected by anti-bot solutions:

web_scraper.dart
import 'package:http/io_client.dart' as http;
import 'dart:io';

Future scrape() async {
  // initialize a Dart IO HTTP client
  HttpClient httpClient = HttpClient();

  // configure the proxy server
  String proxy = '204.12.6.21:4311';
  httpClient.findProxy = (uri) {
    return 'PROXY $proxy;';
  };

  // avoid SSL certificate errors
  httpClient.badCertificateCallback =
      (X509Certificate cert, String host, int port) => true;

  // use the Dart IO HTTP client to
  // initialize a package:http client
  final proxyHttpClient = http.IOClient(httpClient);

  // custom User-Agent value
  final userAgentString = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/123.0.0.0 Safari/537.36';

  // perform the GET request through the proxy
  var response = await proxyHttpClient.get(
      Uri.parse('https://your-target-site.com'),
      headers: {'User-Agent': userAgentString}
  );

  // retrieve the HTML from the server
  // response and print it
  final html = response.body;
  print(html);
}

Those two tips are great for bypassing simple anti-bot measures. But what about advanced solutions such as Cloudflare?

Unfortunately, a complete WAF like that can still easily detect your Dart web scraping script as a bot. Verify that by running the above script against this Cloudflare-protected page:

Example
https://www.g2.com/products/notion/reviews

This time, the result will be the following 403 Forbidden page:

Output
<!DOCTYPE html>
<html class="no-js" lang="en-US">
<head>
<title>Attention Required! | Cloudflare</title>
<meta charset="UTF-8" />
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<meta http-equiv="X-UA-Compatible" content="IE=Edge" />
<meta name="robots" content="noindex, nofollow" />
<!-- omitted for brevity -->

Should you give up? Not at all! What you need is a web scraping API, such as ZenRows, which supports User Agent and IP rotation and comes with the best anti-bot toolkit.

Let’s see how you can boost your Dart scraping script with ZenRows. Sign up for free to redeem your first 1,000 credits and reach the Request Builder page:

[Image: the ZenRows Request Builder]

Use the G2.com page mentioned earlier as the target site:

  1. Paste the target URL (https://www.g2.com/products/notion/reviews) into the "URL to Scrape" input.
  2. Enable the "JS Rendering" mode (User Agent rotation and the AI-powered anti-bot toolkit are always included by default).
  3. Toggle the "Premium Proxy" check to get rotating IPs.
  4. Select “cURL” and then the “API” mode to get the ZenRows API URL to call in your script.

Pass the generated URL to the get() method:

web_scraper.dart
import 'package:http/http.dart' as http;

Future scrape() async {
  // create a Uri object to the target page
  final pageUri = Uri.parse('https://api.zenrows.com/v1/?apikey=<YOUR_ZENROWS_API_KEY>&url=https%3A%2F%2Fwww.g2.com%2Fproducts%2Fnotion%2Freviews&js_render=true&premium_proxy=true');

  // perform a GET request to the target page
  var response = await http.get(pageUri);

  // retrieve the HTML from the server
  // response and print it
  final html = response.body;
  print(html);
}

Run the script. This time, it'll return the HTML associated with the target G2 page as desired:

Output
<!DOCTYPE html>
<head>
  <meta charset="utf-8" />
  <link href="https://www.g2.com/assets/favicon-fdacc4208a68e8ae57a80bf869d155829f2400fa7dd128b9c9e60f07795c4915.ico" rel="shortcut icon" type="image/x-icon" />
  <title>Notion Reviews 2024: Details, Pricing, &amp; Features | G2</title>
  <!-- omitted for brevity ... -->

Bye-bye, 403 errors. That’s how easy it is to use ZenRows for web scraping in Dart.

How to Use a Headless Browser With Dart

The Dart html package is an HTML parser, so it can only deal with static web pages. If your target site uses JavaScript to load or render data dynamically, you need another solution.

In particular, you must use a tool that can render pages in a controllable browser. The most popular headless browser library in Dart is puppeteer, a port of the Puppeteer Node.js library. For a complete tutorial, read the guide to Puppeteer web scraping.

To showcase Puppeteer's capabilities in Dart better, we need a new target page. Let’s use a page that requires JavaScript execution, such as the Infinite Scrolling demo. It dynamically loads new products as the user scrolls down:

[Image: the infinite scrolling demo page]

Add puppeteer to your project's dependencies with this command:

Terminal
dart pub add puppeteer 

Next, use it to scrape data from a dynamic content page:

web_scraper.dart
import 'package:puppeteer/puppeteer.dart';

Future<void> scrape() async {
  // open a new page in the controlled browser
  var browser = await puppeteer.launch();
  var page = await browser.newPage();

  // visit the target page
  await page.goto('https://scrapingclub.com/exercise/list_infinite_scroll/');

  // select all product HTML elements
  var productElements = await page.$$('.post');

  // iterate over them and extract the desired data
  for (var productElement in productElements) {
    // select the name and price elements
    var nameElement = await productElement.$('h4');
    var priceElement = await productElement.$('h5');

    // extract their data
    var name = (await nameElement.evaluate('e => e.textContent')).toString().trim();
    var price = (await priceElement.evaluate('e => e.textContent')).toString().trim();

    // print it
    print(name);
    print(price);
    print('');
  }

  // release the browser resources
  await browser.close();
}

Run this script:

Terminal
dart run

It'll produce:

Output
Short Dress
$24.99

// omitted for brevity...

Fitted Dress
$34.99

Congratulations! You're now a Dart web scraping champion.

Conclusion

This step-by-step guide walked you through the process of web scraping in Dart. You’ve learned both the fundamentals and more complex aspects and techniques.

Dart is a rising language with a rich ecosystem of libraries for extracting data from the Web. The duo http and html enables you to do web scraping and crawling in Dart on static pages. Plus, you have access to puppeteer for dealing with sites using JavaScript.

The main challenge? No matter how good your Dart scraper is, anti-scraping systems can still stop it. Elude them all with ZenRows, a scraping API with the most effective built-in anti-bot bypass capabilities. Try ZenRows with a free trial today!
