Eric is a senior staff engineer at Google working with the Chrome team on web projects like Puppeteer, headless Chrome, Lighthouse, Polymer, and web components. He's the author of "Using the HTML5 Filesystem API," and has led frontend projects like the Google I/O web app, Google's Santa tracker, chromestatus.com, and html5rocks.com. Prior to Google, Eric worked as a software engineer at the University of Michigan where he designed rich web applications and APIs for the university's 19 libraries.
About the talk
The headless browser revolution has arrived! Headless browsers are powerful tools that all developers can adopt in their workflow. This session will showcase examples of the amazing things that Chrome can do without a UI: write programs to control the browser; test a site; automate UI tasks; integrate into a CI system; setup A/B perf monitoring; prerender a client-side app for SEO; and more. The focus will be on using Puppeteer, Google's Node library for controlling headless Chrome.
00:57 Headless Chrome
03:20 Headless unlocks…
08:36 Pro tip
11:40 Pre-rendering dynamic pages
19:35 Stylesheets vs inline critical styles
27:25 Test a Chrome extension
31:50 Puppeteer as a service
My name is Eric Bidelman. I'm a web developer; basically, I'm an engineer who works on the Chrome team. I also work in developer relations, which means I help developers build the latest and greatest web experiences and adopt new APIs. Lately I've been focused on testing, automation, headless Chrome, and Puppeteer. I think it's a really exciting space, the fact that we have headless Chrome and now Puppeteer. Feel free to hit me up with questions at @ebidel on Twitter if you want to talk to me after the presentation. It's really important for me to get this out of the way: this
talk is not about testing. I think you should all test your apps; don't get me wrong, it's very important. You can certainly use headless Chrome to do end-to-end testing, smoke tests, UI tests, whatever, but I want to stick to the other side of things, the automation side. This is something that I realized a couple of weeks ago: headless Chrome can really be a front end for your web app's front end. This was kind of an aha moment for me, a double rainbow moment. Once I started working with headless Chrome and baking it into my workflow as a developer, it
actually makes my life easier. I can automate things. I can put headless Chrome on a server and do really interesting things with that, and we'll talk a little bit about how to do that. We'll cover the really cool and powerful things you can do with headless Chrome, and get that kind of nomenclature out of the way. We'll introduce Puppeteer, which is the Node library that we built to work with headless Chrome. Along the way, we'll see 10 interesting use-case-driven demos that I've built and want to share with you, and we'll talk about Puppeteer's API for doing some of those things. So that's today's
agenda. This is something that I'm going to refer to a couple of times throughout the presentation. This is the Pyramid of Puppeteer. It's basically the architecture of where all these things fit together. At the very bottom is headless Chrome. This is just the browser. Normally, when you click on Chrome, there's this window that launches. The user can input a URL. There's a UI, a menu. The page is interactive, so you can type in the page and click around. You can even open DevTools, tweak styles, and modify the page in real time, and DevTools, of course, has
many, many more features. But with headless Chrome, there's none of that. Something is happening, right? Chrome is running, you can see it in the taskbar there, but there's literally no UI. Headless Chrome is Chrome without chrome, so to speak. It's headless; it's Chrome without a UI, so there's nothing to interact with. How is this then useful to us? We'll talk about that. If you want to launch Chrome in headless mode, it's just a one-line command-line flag: --headless launches Chrome without a UI. Simple. But we need to combine this with something else, which is the most
important flag, the --remote-debugging-port flag, and you can pass any port number you want here. Once you combine these two flags, Chrome opens in a kind of special mode: it opens this remote debugging port, and then we can tap into the DevTools protocol programmatically using that port. That's where things get really awesome. So what does headless Chrome actually unlock for us? One of the most exciting things, I think, is the ability to test the latest and greatest web platform features, things like ES6 modules, service workers, and streams. All this goodness
is coming our way, and we can finally write and test those apps because we have this up-to-date rendering engine that follows us as the web evolves. The other thing it unlocks is all of the really awesome functionality that you're used to using in DevTools: things like network throttling, device emulation, and code coverage, all these really, really powerful features. We can now tap into that stuff programmatically, write automation scripts and tests for these things, and leverage the work that has already been done for us. So headless Chrome has a lot of interesting things
that you can do, and I do encourage you to check out this article. It's about a year old at this point, but it's a really good article that I wrote a little while ago, and it's still relevant. You can do really interesting things with headless Chrome without ever having to write any code, which is kind of cool. You can launch it from the command line. You can take screenshots from the command line. You can print to PDF, just create a PDF of the page, and do some other interesting things. So if you want to know more about headless Chrome, check that out. So headless Chrome is a thing; headless
browsers are a thing. So what can you actually do with this stuff? Well, let's go back to the Pyramid of Puppeteer. We've got the browser features, all the ES6 stuff; all that is at the bottom level. On top of that is the Chrome DevTools protocol. There's a huge layer here that Chrome DevTools itself uses to communicate with your page and change the page, a whole API surface that we can tap into. These two are kind of like the yin and yang for each other. So I have to rank these as one of the greatest duos of all time: with headless Chrome and DevTools, you can really go on some awesome
adventures. Of course, you've got Han and Chewie, you've got PB&J, you've got Sonic and Tails, but headless Chrome and DevTools: awesome, awesome duo. The DevTools protocol itself is pretty straightforward. It's complex, there's a lot you can do with it, but it's basically just a JSON-based WebSocket API. If you notice, I open a WebSocket to localhost:9222, which is that remote debugging port you saw in the previous couple of slides, and then you basically just do message passing. In this example, I'm getting the page's title. I'm just evaluating this document.title expression inside
of the page using the Runtime.evaluate method of the DevTools protocol. In the Protocol Monitor panel in DevTools, you can see these requests and responses fly by. Any time you tweak a style or do something in DevTools, you actually see the traffic for it, so you can learn the API as you see this stuff happen. Let's go back to the Pyramid of Puppeteer. We've got the browser, we've got the DevTools protocol, all this cool stuff that we're going to tap into, and on top of that is where Puppeteer comes in. Puppeteer is a library that we launched last year, right around the time
headless Chrome came out in Chrome 59. You can get it from npm. The reason we created it was that there weren't a lot of good options for working with headless Chrome at that point in time, and we wanted to highlight the DevTools protocol and make sure people know how to use it, by making a high-level API for some of the really powerful things you can do. We actually use a lot of modern features. You'll see a lot of async/await and promises in my code samples today, and that's because of the async nature of everything: what's happening with WebSockets, Node talking
to Chrome, all that stuff is asynchronous, so promises lend themselves very nicely to that. You can use Node 6, so if you're not on a later version of Node, you can still use Puppeteer; you don't have to transpile or anything like that. We wanted to create a zero-configuration kind of setup for you. When you pull down Puppeteer from npm, we actually download Chromium with Puppeteer. That's because it's kind of hard to find Chrome, install it, and launch it on something like a CI system; there are just a lot of issues sometimes. We wanted to make it easy and just bundle a version of Chrome that is guaranteed
to work with the version of Puppeteer that you install. We wanted high-level APIs, and you'll see a bunch of examples of that, and we wanted to create a canonical reference for the DevTools protocol. So that's why we created Puppeteer. Let's look at a little bit of code, a little Puppeteer. One of the most common things people do is just take a screenshot of a web page. In order to do that, we'll call puppeteer.launch(), which returns a promise that resolves to a browser instance we can use to interact with headless Chrome. So Chrome is launched, we've got a browser instance, and then we'll just create a new
page. This is going to open a new tab in Chrome. You're not going to see it because it's headless Chrome, but it's opening about:blank. Once that promise resolves, we can navigate to the URL that we want to take a screenshot of by calling page.goto(), which actually waits for the page's load event to fire before it resolves. Then we can just use Puppeteer's API to take a screenshot. It has a bunch of options: you can take a full-page screenshot, a screenshot of a portion of the page, or even a screenshot of a DOM element. You can see it's pretty easy. It's very high-level. You don't need to
deal with buffers or responses or anything like that. You just pass it the file you want to create, and you get a PNG file. And last but not least, you just close the browser when your script is done. That cleans up and shuts down Chrome. So all in all, it's like four or five lines of code to do all this stuff: launch headless Chrome, find it on various platforms, open a new page, navigate to a page, wait for its load event, take a screenshot, and close the browser. This is what I mean by high-level APIs: these are very easy things to accomplish with Puppeteer's API. The
pro tip: there's headless Chrome, and this is what we use by default. When you call puppeteer.launch(), you're not going to see an actual browser window. But there's actually headful Chrome too. So, headless Chrome and headful Chrome: if you include this flag, the headless: false flag, it's going to actually launch Chrome, and you're going to see it. This is really handy if you're debugging scripts and you have no idea what's going on because you can't see anything. Throw this on to actually see Puppeteer click around and navigate pages. It's kind of cool to see the stuff in real time. Headless Chrome is at the bottom, with all the web...
... to do this for you. I wanted to actually put my money where my mouth is and build an app, to see if this was actually a viable solution. So I built this DevWeb Firehose, I call it. It's basically a content aggregator for my team. We bring in all the blog posts, the sample code, everything we do; it ends up here. And it's a real app. It's a client-side app powered by ES modules and some new stuff like Firestore and Firebase Auth, and there's a JSON API so you can query the data. Its back end is written in Node and runs Puppeteer and headless Chrome to actually do some server-side rendering, to get a
good first meaningful paint. So this is the app. Let's see how we built it. This is the main page of the app, and it's a basic client-side app. It's got a container that gets filled with a list of JSON posts. So I've got my container here. I make a fetch request to get the list of JSON posts and then call this magic render posts method, which renders the posts into this container. All that thing does is build a string, and it literally just sets innerHTML to the content. You could use the DOM APIs or whatever, but that's all it does. So our goal is to take an empty page...
... basically just call that server-side rendering method. It just loads up the index.html file, the client-side app, and we get the server-side rendered, pre-rendered version of that. It's going to go through headless Chrome, render in the browser, and then we just send that HTML as the final response to the user. That's server-side rendering using headless Chrome. You're probably wondering: is this fast? Is this actually a viable solution? I did do a little bit of measuring of this, because I was also very curious about it. If you slow the connection down to be like a mobile device and slow the CPU down on a...
... the entire page to load. The only thing we care about is this list of posts, right? We only care about that markup as headless Chrome renders it. So if we go back to that server-side rendering method, we're launching Chrome and we're actually waiting for all network requests to be done; that's what that networkidle0 is. We don't really care about our analytics library loading, we don't care about images loading or other wasteful things. We only care about when that markup is available. So we can change waitUntil here to domcontentloaded. That will immediately resolve this promise when my DOM has
been loaded. And we can tack on one more Puppeteer API, which is page.waitForSelector(). What this is going to do is wait for this element to be in the DOM and visible. So we're waiting for that container. Now we're tailoring this server-side rendering method to our app, but we're actually speeding up the prerendering process by doing that, because we're not waiting for the entire page to load. Number two is to cache pre-rendered results. It's kind of an obvious one, and it speeds things up quite a bit. It's the same method as before, but we'll just wrap it in a cache. Any time somebody comes
in for the first time, we'll fire up headless Chrome, do the prerendering, and then store the result in a cache. Any subsequent request just gets served from that cache. It's in memory, so you would want to do something more persistent here, but this just goes to show you that it's very easy to pay that penalty only once for the prerender. Number three is to prevent rehydration. So what the heck is rehydration? If we go back to our main page, you have the container that gets populated by the JSON posts. Think about what's happening here. The user is going to visit this
page in the browser, and Chrome is going to do its thing: it's going to render this on the client. But headless Chrome is also doing that on the server, so it's kind of wasteful; we're doing that twice. The way I dealt with this was basically just to look for the element that gets server-side rendered. I check to see if that posts container has been added to the DOM. If it's there at page load, I know that I've been server-side rendered, and I don't have to go through the hassle of fetching the posts and rendering them again. So that's another optimization you can do. Number four...
... what it's going to do is give us a way to intercept all network requests before Chrome ever makes them. So we can listen for request events, and inside of this, I basically just set up a whitelist. If you're one of these requests, like scripts or XHRs or fetches, things that can generate markup, we'll allow you to go through; we'll continue the request. But if you're not, if you're a stylesheet for instance, we'll just abort the request. So this is another cool way that we're speeding up the prerendering process on the fly. If you want to know more about prerendering, headless Chrome, Puppeteer,
all that good stuff I just talked about, there is an article that I wrote a couple of weeks back. It's got more optimizations and more discussion in there. Please give me your feedback. I think it's a really cool approach, because I didn't have to change any of the code in the app. I just, again, tacked on headless Chrome, and I got a lot of stuff for free. So I'm curious to know your thoughts. The number two awesome thing you can do with Puppeteer and headless Chrome is actually verify that lazy loading is paying off. Lazy loading is a good thing; you should all do it. But sometimes, you know...
... behind a user gesture. They have to click this navigation element, and that's the thing that actually dynamically loads this bundle. So you can use a script like this, combined with the code coverage API, to determine whether lazy loading is paying off: do A/B testing, do some measurements, and use Puppeteer to your advantage there. There's a cool npm package worth checking out if you're familiar with Istanbul, which generates these amazing HTML reports. You can take Puppeteer's code coverage, run it through this thing, and get the exact same Istanbul HTML reports, which is really nice.
So check that out. Number three is A/B testing. There's that word testing again, but this is more like live-modifying your page without having to change the code of your page. I want to measure whether it's faster to inline styles versus just having my stylesheets be linked stylesheets, which is the common thing people do. Is it going to pay off if I inline my styles? Normally, what you would do is basically ship two different versions of your app and measure that; you'd make code changes to measure that. But we don't have to make code changes. We can live-change the page on the fly. So we'll use network
interception again, but this time, instead of listening for network requests, we'll listen for the responses. For any stylesheet response I get (I check the resource type), I'm just going to stash the CSS text, the content of the file, inside of a map for later. Then we'll navigate to the URL that we want to actually measure this on, just using page.goto(). I'm using a new method, page.$$eval(). This is kind of like a jQuery API where you can pass it a CSS selector. In this case, I'm grabbing all the stylesheets on the page, and my callback is going to get injected into the page.
It's not run inside of Node; it's actually injected into the page. So in here you can run anything the browser supports, right? DOM APIs, the URL constructor, web platform features. What this code does is basically just replace all the link tags with a style tag and inject the CSS content from the files inside of that style tag. So I'm just replacing the stylesheets with the equivalent style tag on the fly, and that's actually what gets served up in this case. You can run it on a server; you could write a script to do a side-by-side comparison. And we haven't changed the page to
do this. We just use Puppeteer to live-modify the requests that are made. So that's A/B testing. Number four is to catch potential issues with the Google crawler. A couple of weeks back, I built this DevWeb Firehose app. And I realized, after I pushed it to production and hit the render as Google button in the webmaster tools, that my app actually doesn't render correctly in Googlebot, because it runs a super old version of Chrome, Chrome 41. Chrome 41 doesn't have CSS custom properties or all these cool new features I was using, so I was kind of hosed...
... all the APIs that get used and all the CSS that gets used. And then you can correlate that with caniuse data for Chrome 41. So that's what this script does. You can run this script on any URL, and it'll tell you the features you're using that aren't available in Chrome 41. So chromestatus uses, what does it use? It uses web components, it uses CSS containment, it uses link rel=preload. None of that stuff is available in the Google search bot. So this is kind of a cool early warning signal for you to determine whether your app might not run correctly in Google Search, and so you can
maybe load a polyfill where you didn't think to load one before, just by getting a list of features used by the page. Number five: create custom PDFs. A lot of people like to create PDFs of their web pages. I don't really understand it, but a lot of people do, and we have an API for it. If you joined us at the web sandbox over here this year, you can actually go up to the big Lighthouse and put in a URL, and what happens is Puppeteer spawns up three different tools. It runs WebPageTest, it runs Lighthouse, and it runs PageSpeed Insights, all at once. And then eventually what happens is you get this
overall kind of report, this PDF of each one of those results from each of the tools, and we're just generating that PDF using headless Chrome and Puppeteer. So we create a new page, and instead of navigating to a page, we're just going to construct one on the fly by calling page.setContent(). We're building an HTML page just by giving it a string. Then we'll set a viewport, because we want the page to be big; we don't want it to be mobile size. So we'll use the viewport and emulation APIs that DevTools has to create a big page. And then, last but not least, similar to...
... that uses the Web Speech synthesis API to read that text file back to us. So it's a good example of combining Node and Puppeteer with some of these newer web platform features; we can take advantage of both worlds. "Hi, my name is Puppeteer. I clicked the speak button on this page to start talking to you. I'm able to speak using the browser's Web Speech API and the message injected into the page from Node. TL;DR: the rise of the machines has begun." You saw Chrome open there, and that's because audio is not supported by headless Chrome, so I have to use that headless: false flag in this case. What I'm doing is using
page, and what I'm doing here, other than just being silly, is creating a global variable in the page called text to speech and just setting it to the content of that file that I read. That's how the message gets into the page. Then what I do is read the page, just the HTML file. Instead of starting a web server, I just navigate to the data URL version of it, so I'm opening the page on the fly. And then the last thing I do is click that speak button using page.$(), again kind of a jQuery-like API where you give it a selector for an element
on the page, and we just call the click method on it, and that's what kicks off this reading of the text. The number seven awesome thing you can do with headless Chrome is test Chrome extensions. I don't know how people tested their Chrome extensions before, but you can certainly test your Chrome extensions using Puppeteer, which is kind of cool. I've got a real example. I'm going to run the Lighthouse Chrome extension's real unit tests. They decided to use Puppeteer because they would ship code every once in a while and the extension would just break, and they wanted to fix that. So I'm going to actually run
a test. What this is going to do is use Puppeteer to launch a tab; it's going to go to Paul Irish's blog, and Lighthouse is actually going to get started. Inside of the butter bar at the top, you can see that Lighthouse is debugging this page and that Chrome is being automated by Puppeteer. So this is happening: Lighthouse is running, and it does what Lighthouse normally does, which is reload the page and give you a report. And eventually all the tests pass, which is really cool. I know that was a lot, very fast. So how do we do that? How are they actually testing their...
... happen to have this method called runLighthouseInExtension. That's actually what kicks off running Lighthouse inside the Chrome extension. So that's how you test a Chrome extension. Another thing you can do is crawl a single-page application. Well, maybe you want to visualize your app. Maybe you don't know all the URLs in your single-page app. Maybe you want to create a sweet D3 visualization of all the pages in your single-page app. Maybe you want to create a sitemap on the fly. You can totally do that using Puppeteer and the APIs that we have. We discover all the links on a page
just by using page.$$eval() to grab all the anchors on a page. Again, this is going to get run inside of the page. We just look at all the anchors: are they the same origin as the page, are they part of our app, and are they not the page we're actually viewing, so we don't, you know, render ourselves? We return the unique set, and then we run this recursively. That's basically the way I created that D3 visualization. And you can do more than just a list of links; you could do something like this with Santa Tracker, which is a very visual single-page
application. As you visit each link, you can take a screenshot or generate a PDF or what have you, and then visualize your app in a different way. Number nine is one of my favorites: verify that a service worker is actually caching all of your app. Every time I use service worker, I always leave something out. I always forget to cache something in the cache, which means somebody comes back to my page and ultimately my entire app doesn't work offline; they get a 404 because an image is broken or something. So we can verify that that's not going to be the case using some of
Puppeteer's APIs. So we'll navigate to a page that uses offline caching, and then we call page.evaluate(). We wait for the service worker in the page to be installed and ready; that's what this page.evaluate() call does. The next thing we do is basically just look at all network requests. For any request that happens on the page, we'll use network interception and stash the URL, to keep a list of the URLs that the network is requesting. After that, we reload the page. We want to know what's coming from the network and what gets served from the service worker cache, and we want to be able to determine
that. So we just reload the page, because at that point the service worker has been installed and has cached its resources, and then we can check and see where things come from. That's what this last line does: it basically loops through all of the requests that get made by the page, checks their responses, and determines whether they come from the service worker or from the network. Here's an example of the script. If you run this on a URL, you can see that on chromestatus everything is cached, which is great. You get a little green check for all the URLs that are
being returned by the service worker, the ones that are going to work offline, and anything that doesn't, like these analytics requests, has a little red X. That was a choice I made, not to cache the analytics requests, but you can see everything else gets cached offline. So that's a cool thing. There are many more things you can do with all this stuff. The last cool thing I have time for is to procrastinate. I didn't really have a good demo of the keyboard APIs in Puppeteer, and we have some stuff you can do with touch emulation. What this is going to do is basically just open the Google Pac-Man doodle...
... tools. The first one is Puppeteer as a service: the notion that you can run headless Chrome, the browser, as a web service. So I just put a couple of handlers right in the cloud. With this first one here, you pass it a URL and it takes a screenshot, and it does all that stuff in the background. So you can think about baking headless Chrome, the browser, into your web service. So that's Puppeteer as a service. We have an awesome GoogleChromeLabs GitHub repository that has all the demos I showed you today, as well as some other ones that we didn't talk about,
a lot of useful stuff there. If you want to see anything else implemented, let me know; I'll create a demo for it. This is a cool site that we put together to just try out Puppeteer, called try-puppeteer.appspot.com. You can go in and prototype ideas, play with the code, run our official demos, and see the results. You don't even have to install anything to work with Puppeteer. Plenty of stuff. We covered a lot of things headless Chrome can do, and of course there's a bunch of stuff we didn't cover that it can also do. Things like server-side rendering, prerendering your apps, offline
testing, offline verification, A/B testing, making the Google search bot happy, creating PDFs. Hopefully you've realized that automation, you know, is a thing. It's not just about testing your app; it's actually about making your app more productive and yourself, the developer, more productive. Headless Chrome is a front end for your front end. I'm on GitHub and Twitter if you want to converse with me after the show, and I'll leave this one up here, which has a great list of things that you can now take a screenshot of. So thanks, everyone, for sticking around. Really appreciate you coming.