Getting started with Puppeteer

Puppeteer is probably the best-known headless browser automation library out there. It provides a high-level Node.js API that lets you spin up a Chromium or Chrome browser instance and send commands to it. It has proven itself to be easy to install, simple to use and performant.

Some Backstory 📖

Puppeteer works by providing a thin layer on top of the DevTools Protocol.

The DevTools Protocol is what gives you the power to do all the cool stuff in the actual “Inspect Element” toolbar in your browser. The same protocol powers the developer tools of most Blink-based browsers (Chrome, Chromium etc.), providing DOM inspection, network profiling, debugging and all the other capabilities we have access to. With Puppeteer you can do almost anything you can do in the actual browser, no hacks required.
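To make the "thin layer" idea a bit more concrete, here is a rough sketch of what a DevTools Protocol command looks like on the wire: a JSON message with an id, a method name and parameters, answered later by a message carrying the same id. The helpers createCommand and matchResponse are made up for this illustration; Puppeteer does this bookkeeping for you.

```javascript
// Sketch only: createCommand and matchResponse are hypothetical helpers,
// not part of Puppeteer. They mimic the shape of DevTools Protocol messages.
let nextId = 0;

// Build a protocol command such as Page.navigate
function createCommand(method, params = {}) {
  nextId += 1;
  return { id: nextId, method, params };
}

// A response belongs to a command when it carries the same id
function matchResponse(command, message) {
  return message.id === command.id;
}

const navigate = createCommand("Page.navigate", { url: "https://pptr.dev" });
console.log(JSON.stringify(navigate));
// {"id":1,"method":"Page.navigate","params":{"url":"https://pptr.dev"}}
```

A call like page.goto() ends up sending commands of this kind, so you rarely need to write them by hand.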

Puppeteer lives under the Google Chrome umbrella and is maintained by the Chrome DevTools team. That fact alone should give you some confidence about the long-term sustainability of the project. It also keeps up with the latest features shipped in the Chromium/Chrome browsers, so you will not usually have to wait for a feature to be ported to the library.

So let’s get to it!👷

Get The Library

First, make sure you are on a machine with Node.js v10.18.1 or later installed, so we can use the Puppeteer version installed below.

Make a new project folder called puppeteer-example so we can start going through the process.

mkdir puppeteer-example
cd puppeteer-example

Now we can go ahead and bootstrap the required Node.js setup.

npm init -y

With this you are ready to install your favorite libraries like left-pad or browser-redirect but you can skip it for now 😂. Back to our target:

npm install puppeteer@3

While installing the library, you probably came across a message on your console stating Downloading Chromium xxx. That message is there to let you know that, along with the Puppeteer library, a specific version of Chromium for your operating system is also downloaded (inside node_modules) to be used by your installation of Puppeteer. The reason is that every Puppeteer version is only guaranteed to work with the specific Chromium version it comes bundled with. Special hint: if you are a bit disk-space constrained, delete the node_modules directory from your test or unused Puppeteer projects after you are done.

First Encounter🤞

We got through the installation and now we can start writing some code. You will probably be surprised by how much you can do with a few lines of code.

For our first task, we will try to explore the official Puppeteer website https://pptr.dev/. Create a test file index.js with the following contents:

const puppeteer = require("puppeteer"); // Load the library we just installed

(async function () {
  const browser = await puppeteer.launch({ headless: false }); // We use this option to go into non-headless mode
  const page = await browser.newPage(); // Create a new page instance
  await page.goto("https://pptr.dev"); // Navigate to the pptr.dev website
  await page.waitFor(5000); // Wait for 5 seconds to see the beautiful site
  await browser.close(); // Close the browser
})();

Now, by running this code with node index.js, you will witness a Chromium instance launching, navigating to the pptr.dev website and staying there for 5 seconds before closing down.
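One thing a first script like this tends to skip is cleanup: if any step throws, the browser window stays open. A minimal sketch of one way to handle that is a small helper that always closes the browser in a finally block. The name withBrowser is hypothetical and not part of Puppeteer; it only assumes that whatever you pass in has a launch() method, as the puppeteer module does.

```javascript
// Hypothetical helper, not part of Puppeteer: run a task with a browser
// and close the browser afterwards, even when the task throws.
// `launcher` is anything with a launch() method, e.g. the puppeteer module.
async function withBrowser(launcher, task, launchOptions = {}) {
  const browser = await launcher.launch(launchOptions);
  try {
    return await task(browser);
  } finally {
    await browser.close(); // Runs on success and on failure
  }
}

// Usage with the real library (assumes puppeteer is installed):
// const puppeteer = require("puppeteer");
// withBrowser(puppeteer, async (browser) => {
//   const page = await browser.newPage();
//   await page.goto("https://pptr.dev");
//   return page.title();
// }, { headless: false }).then(console.log);
```

Passing the launcher in as an argument is just a convenience for this sketch; it lets you try the helper with a stand-in object before wiring in the real browser.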

I am sure this now feels like familiar territory for web automation enthusiasts. The only things missing are the scenarios you need to run and a feel for the intuitive and simple API that Puppeteer advertises.

Why not take a look?

Exploring a Simple Scenario 🕵

Skipping the pleasantries, our aim will be to explore the autocomplete search functionality that the pptr.dev website offers for our convenience.

Thinking Out Loud

So let us describe what an actual user needs to do for this autocomplete feature to achieve its purpose.

We expect the user to:

1. Open the page
2. Try to find the autocomplete search
3. Type a query for the API method they are looking for
4. Click the most relevant result on the list
5. Expect to see the section with the item they selected

To test out if the Puppeteer API is as intuitive as it claims to be, we can go ahead and translate this thinking to Puppeteer commands.

/* Somewhere else... */
const assert = require("assert"); // Node's built-in assertion module, used at the end
const Homepage = {
  autocompleteSearchInput: "input[type='search']",
};
const apiSearchTerm = "metrics"; // The API method we are looking for
/* ... */
await page.goto("https://pptr.dev");
await page.waitForSelector(Homepage.autocompleteSearchInput);
await page.type(Homepage.autocompleteSearchInput, apiSearchTerm);
await page.click("search-item"); 

// Find the API name using XPath. page.$x() resolves to an array,
// so we await it first and then take the first match
const [$apiMethod] = await page.$x("//api-method-name[text()='" + apiSearchTerm + "']");

// Check if this method name section is actually visible on the viewport
const isApiMethodVisible = await $apiMethod.isIntersectingViewport();
assert.equal(isApiMethodVisible, true);

Well that was it! 🎉 The code above, housekeeping included, seems pretty straightforward to me given the thinking process we laid out. I do not think I even need to explain what most of the commands do. The API translates to clear language without relying on other external abstractions.

A point worth dwelling on is the combination of commands used to detect whether the API method we were looking for is actually inside the browser viewport. People with experience in the field know that to assert this you would either create your own custom command (doing viewport dimension calculations) or rely on a framework command that has already been implemented for you.

The differentiating factor here is that the command we get directly from Puppeteer could be considered the most reliable, simply because it is provided by the platform itself.
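For comparison, a hand-rolled custom command of the kind mentioned above might boil down to something like the sketch below. The function isRectInViewport is hypothetical; it assumes you already have the element's bounding box in viewport coordinates (for example from getBoundingClientRect()) and the viewport size. It ignores things like elements covered by other elements or hidden with CSS, which is part of why a platform-provided check is attractive.

```javascript
// Hand-rolled visibility check, for comparison only (hypothetical helper).
// rect: { x, y, width, height } in viewport coordinates
// viewport: { width, height }
function isRectInViewport(rect, viewport) {
  const overlapsHorizontally = rect.x < viewport.width && rect.x + rect.width > 0;
  const overlapsVertically = rect.y < viewport.height && rect.y + rect.height > 0;
  return rect.width > 0 && rect.height > 0 && overlapsHorizontally && overlapsVertically;
}

const viewport = { width: 800, height: 600 };
console.log(isRectInViewport({ x: 10, y: 20, width: 100, height: 40 }, viewport)); // true
console.log(isRectInViewport({ x: 10, y: 900, width: 100, height: 40 }, viewport)); // false
```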

One or Two Things Missing 🙈

Now that we all agree that the API is rather intuitive and simple to use, we can mention a couple of things that might seem to be “missing” and that would make our development experience a tad better.

1) Filling your code with the async keyword

As you have definitely observed, there are these async and await keywords you have to sprinkle all around your code, and it feels a bit noisy, to me at least. They are needed because of the event-driven nature of the browser APIs. The way to code around asynchronous and event-driven platforms in JavaScript is to model your operations with Promises, and Puppeteer has done just that.

To make handling those asynchronous operations a bit less painful, JavaScript added some keywords to the language syntax: the async and await that you see in our code. async marks a function as returning a Promise, and await pauses that function until a Promise settles. Because Puppeteer’s API is built on Promises, the most readable way to write our code is to use this async/await syntax for most commands.
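To see what the keywords buy us, here is a small side-by-side sketch of the same two steps written as a Promise chain and with async/await. To keep it runnable without a browser, fakePage is a made-up stand-in whose methods return Promises the way Puppeteer's page methods do; it is not a real Puppeteer page.

```javascript
// fakePage is a hypothetical stand-in with Promise-returning methods
const fakePage = {
  log: [],
  goto(url) {
    // Resolve after a short delay, like a real navigation would
    return new Promise((resolve) => {
      setTimeout(() => {
        this.log.push("goto " + url);
        resolve();
      }, 10);
    });
  },
  title() {
    return Promise.resolve("Puppeteer");
  },
};

// Promise chain style
function titleWithThen(page, url) {
  return page.goto(url).then(() => page.title());
}

// async/await style: same steps, reads top to bottom
async function titleWithAwait(page, url) {
  await page.goto(url);
  return page.title();
}

titleWithThen(fakePage, "https://pptr.dev")
  .then((title) => console.log("then:", title))
  .then(() => titleWithAwait(fakePage, "https://pptr.dev"))
  .then((title) => console.log("await:", title));
```

Both functions return a Promise for the title; the second one simply hides the chaining behind the await keyword.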

2) No chaining available yet

Due to some design decisions and the Promise-based nature of the library mentioned in the point above, there is currently no support for what we can call method chaining. With this capability our code could become much more fluent to read and follow. Picture something like:

await page.$("input[type='search']").click().type("metrics").submit();

I cannot vouch for them, but I think there are some third-party library solutions you can try. If you want to dig into the current state and the possible external solutions, you can start by taking a look at one relevant GitHub issue.
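If you are curious how such chaining could be layered on top of Promise-returning methods, here is a minimal sketch. The Chain class is entirely hypothetical and not part of Puppeteer or any library I know of: each call queues an asynchronous step and returns the same object, and a then method makes the whole chain awaitable. The fakeInput object stands in for an element handle with click() and type() methods.

```javascript
// Hypothetical chainable wrapper; nothing here is part of Puppeteer's API.
class Chain {
  constructor(target) {
    this.target = target;
    this.queue = Promise.resolve(); // Steps run one after another
  }
  click() {
    this.queue = this.queue.then(() => this.target.click());
    return this;
  }
  type(text) {
    this.queue = this.queue.then(() => this.target.type(text));
    return this;
  }
  // Makes the chain awaitable: `await chain` waits for all queued steps
  then(resolve, reject) {
    return this.queue.then(resolve, reject);
  }
}

// Stand-in for an element handle, so the sketch runs without a browser
const fakeInput = {
  actions: [],
  async click() { this.actions.push("click"); },
  async type(text) { this.actions.push("type " + text); },
};

(async () => {
  await new Chain(fakeInput).click().type("metrics");
  console.log(fakeInput.actions); // The two actions, in the order they were chained
})();
```

With a real page you would first await page.$("input[type='search']") and pass the resulting handle to the wrapper, since the lookup itself is asynchronous.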

Closing

You just got through a super fast introduction on how to set up Puppeteer and code a simple scenario for an autocomplete search. From here on out you are on your own, except for all the recipes that will come on The Home of Web Automation.

My suggestion would be to start experimenting with your own use case and, as a bedtime story, go over the detailed API documentation on GitHub. You will almost certainly find a couple of surprising things you did not expect to be able to do with the native commands.

Cross posted from The Home of Web Automation

Photo from Kevin Ku at Pexels