Glen Shires is a software engineer who leads the development of the Google Assistant SDK. Previously he led development of the Google Cloud Speech API and the Chrome Web Speech API. He also chaired the W3C Speech API group and has been granted 15 patents. Prior to joining Google in 2010, he led development teams for voice and speech products at Intel, Picazo and General Magic. He earned a B.S. and M.S. in Electrical and Computer Engineering from the University of Wisconsin - Madison.
About the talk
The Google Assistant SDK for devices lets developers embed the Google Assistant into their custom hardware. Come learn more about all the new features that have been added to the SDK so developers can create anything from a fun weekend project to state-of-the-art commercial hardware.
I'm Chris Ramsdell, lead product manager for the Google Assistant SDK for devices. I'm Glen Shires, tech lead for the Google Assistant SDK. We want to talk to you about our software development kit that allows you to embed the Google Assistant into hardware devices, and how to use that SDK to extend the Assistant to work with the features and functionality of your devices. So just to calibrate a little bit: last year at Google I/O, we launched the initial version of our Google Assistant SDK for devices, which does just what I said: it allows you to embed
the Assistant experience in the devices you're building, whether you're a prototyper, a maker, or a commercial OEM. This is the technology used to actually get those features and functionality into your hardware. To frame the conversation a little bit, because it's an ecosystem and there's a platform and there are several pieces involved, let's talk a little, before we go deeper, about how this relates to the other pieces of technology that we're building. At the core, in the middle here, is the Assistant service. This is the artificial intelligence and machine
learning, the AI and the ML, that's actually powering the virtual assistant that's inside your home, hopefully, if you have a Google Home or a Google Mini or a JBL speaker, whatever it might be. That's the core that we've been building for years and years and years. It's basically the evolution of search and other functionality we've been building at Google. Then we talk about ways of extending it. One way of extending it is bringing third-party cloud services to the Assistant. That's Actions on Google, and there are several sessions about that going on throughout I/O, and that's
how you bring those services to the Assistant. So you can do things like order a pizza or book a ride, and that's how you can extend to the things that we're not necessarily doing. We might provide calendaring and things like that, but other people are providing those services, again ordering a pizza, ordering a ride. Then the other way of extending the Assistant is bringing the Assistant experience to hardware devices, and that's where we come in and what we want to talk about today. That's using the Google Assistant SDK for devices. So, again in framing, let's talk a little bit
about what people are using this SDK for. We'll talk first about commercial OEMs. Recently, at the Consumer Electronics Show in January, one was an integration with the LG TV, so we brought the Assistant experience inside the TV. You talk to it by pushing the button on the remote to invoke the Assistant on the TV, and then it helps you out with everything from "What's my day look like?" to finding media inside the TV. We'll go deeper into what the SDK is, but the TV was actually using our cloud-based API, and what LG did was write a very thin
client on top of webOS. webOS is their operating system that's running on those TVs, and they were able to do an over-the-air (FOTA) update to get onto the TVs and reach millions of end users of their TVs. So it was nice because it gave them the ability to take new technology, the Google Assistant, and actually work with in-market devices that they already had. Then we also announced back in February an integration with Nest cameras, on the Nest Cam IQ. What's nice about this is that, with the Nest Cams, we were finding that households that have bought a camera
actually end up buying multiple cameras, and that starts to give you this ubiquitous experience in your house where you're just walking through, you ask for some help, and your virtual assistant actually helps you out wherever you are, when you need that help. So it starts to push us in the direction of going from "my living room has a Google Home or a JBL speaker or something like that" to "I actually have the Assistant throughout my house." Beyond commercial OEMs, we've also had some fun experiments with our friends over at
Deeplocal. Deeplocal is a creative agency out of Pittsburgh that we've worked with several times over the past year and a half to build some really cool experiments. If you were at Google I/O last year, maybe you saw the mocktail mixer that was out at the front of I/O, or maybe you saw it on YouTube. Then at our October hardware event, they built a pop-up donut shop, which was really cool. You would walk up to it and engage with the Assistant and it would give you donuts, but then at random it would actually give you a Google Mini, which kind of looks like a donut without a
hole in the middle, and then they brought that through many states in America. At the Consumer Electronics Show in January, they did a giant gumball machine, which was a huge hit. I think at one point in time there was about a two-hour-long line of people coming up and actually interacting with this thing. And then over in the developer sandbox, the Assistant developer sandbox, there's a poster maker, where you can go and interact with the Assistant and generate a unique poster that you can then take home with you. So, not commercially licensed, but still a lot of fun, and it's really fun to innovate
with that team. It's also really fun to innovate with the larger maker community, which has been awesome. This has been a very long tail of developers just taking our software development kit and using it in ways that we would have never thought of. I've highlighted the YouTube links if you want to check them out, but we've got people building retro Google Home devices that are kind of beautiful. We had a candy dispenser. We've had a ton of robots, actually, which we want to talk a little bit about as well. And then we had one maker actually embed the Assistant
inside of macOS to bring it to the actual laptop experience. So, a lot of fun there. That's a bit of an overview and kind of framing, but I also want to just jump right into a demo to let you see what is actually happening, and let Glen take over and give you a demo. Thanks, Chris. As Chris was saying, you can build the Google Assistant into all sorts of types of devices, and there are also starter kits that you can get. For example, there's one for Android Things, and we also have one called the AIY Voice Kit that's available at several retailers
and you'll see the URL where you can see the retailers and get that. What is this cardboard box? If you look closely, you can see two microphones, and there's of course a speaker. So you've got the microphones up here on top, and we've got the speaker here. I've got a big battery here, or you can just plug it into the wall or whatever you'd like to do. And inside this cardboard, and the whole thing comes inside the cardboard, is a small little computer called a Raspberry Pi.
How long is the Golden State Bridge? Golden Gate Bridge has a length of 8,981 feet. So it's doing better than I am in terms of speaking. How long is that in meters? 1 foot equals 0.305 meters. So that's the Assistant inside an embedded device. Turn on hot word. Accepting hot word. So now I don't have to trigger it, and I can simply say: Hey Google, pick a random number from 1 to 100. OK, 73. Hey Google, how tall is Mount Kilimanjaro? Mount Kilimanjaro is 19,341 feet tall. Hey Google, turn off hot word. Sorry, I'm not sure how to help.
So let me show you how that works. This type of box, as I said, comes from a kit with everything you need, and it connects to the Google service via Wi-Fi. There are two different types of software you can run. The SDK actually supports two ways that you can run it. One is a way to run it on almost any platform, any operating system, any programming language: you just run all your code directly on here, and we have sample code that does exactly
that. In this case, I'm actually running some Python sample code on the box that implements the entire client in that Python code. So you've got the entire sample. The other way, which is what you use when you say "Hey Google, turn off hot word," and I guess I'm pressing my luck, is the Assistant library, which runs directly on the client. We provide the entire client library, which includes things like hotword detection ("OK Google") as well as echo cancellation and a number of other nice things like timers and alarms, and that runs on either Linux or Android Things.
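To make the library-based mode a little more concrete, here is a minimal sketch of the pattern it implies: the library owns the audio and hotword handling, and application code only reacts to the events it cares about. Everything in it is an assumption for illustration. `StubAssistant`, the event names, and `run` are hypothetical stand-ins, not the real client library API.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterator, List, Tuple

# (event name, event arguments); the event names are hypothetical.
Event = Tuple[str, dict]


@dataclass
class StubAssistant:
    """Stand-in for a client library object; NOT the real SDK class."""
    scripted_events: List[Event]
    mic_muted: bool = False

    def set_mic_mute(self, muted: bool) -> None:
        # Record the requested microphone state.
        self.mic_muted = muted

    def start(self) -> Iterator[Event]:
        # A real library would block here on hotword detection and audio;
        # this stub simply replays a fixed script of events.
        yield from self.scripted_events


def run(assistant: StubAssistant,
        handlers: Dict[str, Callable[[dict], None]]) -> List[str]:
    """Dispatch each event to its handler; unhandled events are ignored."""
    seen = []
    for name, args in assistant.start():
        seen.append(name)
        handler = handlers.get(name)
        if handler is not None:
            handler(args)
    return seen


transcripts: List[str] = []
assistant = StubAssistant(scripted_events=[
    ("conversation_started", {}),
    ("speech_recognized", {"text": "pick a random number from 1 to 100"}),
    ("conversation_finished", {}),
])
seen = run(assistant, {
    "speech_recognized": lambda args: transcripts.append(args["text"]),
})
print(seen)
print(transcripts)
```

Only one handler is registered here, which mirrors the point that an application can handle an event if it wants to and ignore the rest.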
A little bit about when you're running the library: it's actually quite simple to use. You can see there are simple functions. You can call start, and you can mute the microphone. You can also, rather than starting it with a hot word, start the interaction programmatically. And then the next slide here shows you some of the events that come out; on the following slide, we see a lot of different events that your code can handle if you'd like to handle them, or there's no need to. I
don't think I mentioned: there are two microphones, and you may notice that on Google Home, too, there are only two microphones. Google Home does a wonderful job with background noise, with people speaking to it from a good distance away. The way that we do that is technically called neural beamforming. It's very similar to the way people have two ears and are very good at picking speech out of noise. What we've done is we've used machine learning and run this on the server to get a very robust,
noise-robust far-field experience, and what that means is the client side needs minimal processing power, so we can really keep the clients low-cost. Chris? Thank you. Glen was talking about this, so let's go a little bit deeper into what the SDK actually provides. At the highest level, there is a cloud-based API, exposed over the gRPC protocol. gRPC actually uses HTTP/2 to go back and forth and give you streaming support, which is important when you're doing audio, because you want it to be fast and low latency. The benefit of that API is that it's available from
just about any platform. So again, like the LG TV example that I gave: they had webOS, right? And we've got a number of partners coming with their own platforms that are already out in the market, and they want to know how to bring those platforms to the Assistant. So you can create a very thin client that communicates with that cloud API. Out of the box, we provide what are called gRPC bindings. Those are thin clients that are built on Python, Node.js, and C++, and they run on Linux, on Android Things, or on whatever platform you actually have. So that
API is really good for push-button, push-to-talk support. So when Glen was with the box and he pushed the button, that's actually invoking a thin client that's talking to our API. If you want to have an experience more like that Nest Cam, where it's hands-off, because those cameras are typically mounted above you, on your ceilings and whatnot, we call those far-field or hands-free experiences, and they use technologies like wake word, or hot word, which you may have heard of: "OK Google." To get that experience, we have client libraries that are built for Linux, specifically Linux
3.18 and above, that give you that hot word support and echo cancellation, as Glen mentioned, and samples and tools that allow you to embed the Assistant, debug it, and test it as well. Beyond that, we have hardware kits. Glen mentioned the AIY kits over here. AIY is a twist on DIY: artificial intelligence done by you. And then there's also one based on the i.MX, and we're actually bringing more and more chips and SoMs to the market for developers to get up and running with the Assistant.
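As a rough illustration of why streaming matters for the push-to-talk cloud API described above, the sketch below turns captured audio into a sequence of small messages that a thin client could hand to a streaming call as they become available. The message layout (one configuration message followed by audio chunks), the field names, the 16 kHz 16-bit mono format, and the 100 ms chunk size are all assumptions made for this example, and plain dictionaries stand in for the generated gRPC message types.

```python
from typing import Iterator

SAMPLE_RATE_HZ = 16000      # assumed capture rate
BYTES_PER_SAMPLE = 2        # assumed 16-bit mono samples
CHUNK_BYTES = SAMPLE_RATE_HZ * BYTES_PER_SAMPLE // 10  # 100 ms of audio


def request_stream(audio: bytes, language: str = "en-US",
                   chunk_bytes: int = CHUNK_BYTES) -> Iterator[dict]:
    """Yield one config message, then the audio in fixed-size chunks."""
    # The first message carries only configuration (hypothetical field names).
    yield {"config": {"language_code": language,
                      "sample_rate_hertz": SAMPLE_RATE_HZ}}
    # Every later message carries only a slice of audio, so the server can
    # begin recognizing speech before the user has finished talking.
    for start in range(0, len(audio), chunk_bytes):
        yield {"audio_in": audio[start:start + chunk_bytes]}


# One second of silence stands in for microphone input.
one_second = bytes(SAMPLE_RATE_HZ * BYTES_PER_SAMPLE)
messages = list(request_stream(one_second))
print(len(messages))  # 1 config message + 10 audio chunks
```

Because the client only slices and forwards audio, this kind of loop fits in almost any language or platform, which is the point of the thin-client approach.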
So with all of that framing, our goal, again, is really to bring that ubiquitous experience to everybody, and we're not going to build all the hardware out there, nor are we going to build all the experiences. We've done a fairly good job with speakers and whatnot, but there are appliances, there's auto, there are things that are in your bathroom. And so why Glen and I get up in the morning is actually to come and figure out what those user experiences are that you want to bring to market with partners, whether they be prototypers, makers, or commercial
OEMs, and then what technology we actually have to build to make that happen. One way to categorize it is to think about your day. It's a little bit trivial, but it gives you an idea that we're trying to get a holistic experience, from streamlining your morning, when you want to wake up and have your coffee made, or you want to stream NPR news to figure out what's going on, or maybe you don't, and you just read the news; then to when you've actually moved from your house to on-the-go: I forgot to set the security camera,
I left the garage open, or, if you're coming home from the grocery store, you want to preheat the oven to 350 because you have lasagna that you need to put in there when you get there. And then finally, helping you relax in the evening. So everything from, hey, I have kids, and so no more screen time for the kids: turn off the Wi-Fi in the kids' room, or turn off the Wi-Fi in the house, or dim the lights because we want to watch a movie or watch TV. We're just trying to figure out what those user experiences are that actually add value to you, and then figuring out
the technology behind it. So when it comes to integration paths: if you're actually building hardware, there are two paths that we have coming into the Assistant, and, it's a little marketing-speak, we call them "works with" and "Assistant built-in." If anybody has a Philips Hue light or a Nest Thermostat, those are "works with" devices, so they can be controlled by any other device that has the Assistant embedded inside of it. If you want more information on that, tomorrow at 11:30 on Stage 5 they're going to talk a bit more about
how "works with" works and how you can integrate with it. Today we're talking about built-in, the second path, where you're actually interested in embedding the Assistant in hardware. So it's kind of controller versus controlled: you're building a controller, a device that can control other devices as well and interact with the Assistant service for knowledge and things like that. Let's talk a little about the developer benefits of the Assistant and the Assistant SDK. First of all, minimum hardware requirements. As I mentioned, if you're doing push-to-talk scenarios
and you want to integrate with our cloud API, there's very little that's needed on the client side. In fact, it's all up to you: whatever you're running on your client, you can keep running. It's in effect like making a simple REST call to our service to integrate. Beyond that, though, if you actually want to integrate and have hotword detection, echo cancellation, and things like that, so you can have that hands-free experience, then we still have minimal hardware requirements. As Glen mentioned, we don't require a massive mic array: two mics and
you're good to go. We can actually use neural beamforming to figure out what you're saying and to do proper hotword detection and grammar detection. From a RAM perspective, it's only a 256 MB RAM requirement on the device to get up and running, and then a one-core ARMv7 processor. So we're truly trying to shrink that down, and over time we'll start looking at things like RTOSes and microcontrollers as we move into the appliance space. We have built-in hotword support, so you don't have to provide your own hot word model or anything like that. You simply download our
library, put it on an embedded Linux-based device, and you're off to the races. You've got "OK Google," and everything will pick up. As I had mentioned, the library will take care of the rest: it will take care of capturing the audio, transmitting it to us in real time, and then streaming the response back down. Google is a global company. We know that we need to continue to flesh out our language and locale story. We've done a great job since last year moving into 14 different languages and locales, which you can see up on
this slide right here. But we want over time to expand this map to get into other countries, because we know that people who are building with this, again whether you're a prototyper, a maker, or a commercial OEM, need to meet their customers where they actually are and where their users actually are, and so we're going to continue to put momentum behind this. For when you're a commercial OEM, I wanted to show what it looks like, as we've learned over the past 12 months working with LG and working with Nest how to go from prototype to
commercialization. So if you're in that space and you're trying to build a commercial device, I wanted to give you some insight into how it's working right now. We're still in the early stages, working with a few commercial OEMs. Our goal is to be more immersive and go deep with them to figure out what the right experiences are for their end users, so we can build the foundation on which we can start building more voice technology. The path here, and I'm hand-waving a few details, is that you start prototyping, using the Assistant SDK to build an idea, to build a concept. You submit that
to us for review, and we'll iterate on the device itself, how it fits into the larger ecosystem, and what the end-user experience is that you're trying to bring to market. If it's all good, we end up assigning an account manager, a technical account manager, to you to help facilitate that and move forward. Beyond that, you go into certification, in terms of: is the voice recognition actually working on the device? Does the marketing meet the marketing guidelines? Is the branding correct? Are we all in good shape? And then step 5: launch, have a party, and be good to go.
This is kind of the path that we're taking right now, and it's what we're going to focus on over the course of the next few months, and then we're looking towards really scaling it up in 2019. Next, let's talk about new features, and we're going to go back and forth a little bit on demos. Since last year, we've been hard at work and we've added a couple of things. First of all, we've added visualization support to the SDK. So now you can enable your device to say it's a display-enabled device that can handle visualizations, and you
get things back like knowledge queries, sports scores, weather, and personal photos. I'll let Glen show you. First off, I want to show that we have a new developer tool. What you're seeing on screen right there is my Chromebook, and with this, rather than using an embedded device, you can actually use your laptop with Chrome to test out the SDK, get your application running, and test out different parameters. For example, we support, as Chris was mentioning, several
different languages, and you can set the parameters and then just test it out. For example, I can say: What's the weather in San Francisco? Currently in San Francisco, it's 62 degrees Fahrenheit and partly cloudy. Today it'll be cloudy with a forecasted high of 62 and a low of 53. I could say those suggestions, or, if I just want to click on one, I can find out the weather for this weekend. In San Francisco Friday, it'll be mostly sunny with a high of 70 and a low of 59 degrees
Fahrenheit. Saturday and Sunday, it'll be cloudy with lows in the mid-50s. Highs will be in the low 70s Saturday, then in the mid-60s on Sunday. There we go. Of course, this can do things that Google Home can do, such as search. I can say: Who is Larry Page? According to Wikipedia, Lawrence Edward Page is an American computer scientist and internet entrepreneur who co-founded Google with Sergey Brin. And because this is a developer tool, I can also, for example, look at the request I made. This is actually a JSON request that shows the different
parameters that I sent up, in addition to the audio, and I can see the responses that I got back from the server. You can see the transcript as it was forming as I was speaking. It's showing the transcript; later on, it's showing the HTML coming out; and you can see the audio is actually being streamed back as well. In addition to search results, I can also do personal results. For example: Show me my photos. This is what I found in your Google Photos. Sorry, it wasn't on the right screen when I did that. Here's what I found
in your Google Photos. So there we go. And of course I can scroll through, see the different photos of a whitewater rafting trip we did recently, and zoom in full screen. So that shows what we can do with this developer tool as well as the visual output. Let me show you how that works. Let me bring up the slide to show you how this is working. What this is doing is using the service API. As we said, that can run with any code on any platform. So this is running a
is building. So one of the things we lacked for a long time in the SDK was notifications, to let different services push out updates to devices. In this case, I have a trivial example: "Hey Google, ring the dinner bell" will ring a dinner bell on all of your devices. It also helps us out with things like OTA updates, over-the-air updates: when we want to update a language package on a device, for example, it's very important that we have that push notification. And so now we're happy to have added this to the SDK. We're also making forays into
music. We're starting with news and podcast support. So now you can actually access those news feeds, NPR News for example, or your favorite podcast, Radiolab; I happen to be a This American Life fan. So now you can build third-party devices that have news and podcast support built into them, and I think Glen is going to show that to us. Again, I'll be using the AIY cardboard box, and I will simply say: Play the news. Added bring sounds a file and
Live from NPR News in Washington, I'm Windsor Johnston. President. The next thing we wanted to show is notifications. One thing that you can do with Google Home, and you can also do with other embedded devices, is have one device talk to other devices. So for example, you could broadcast things, or you can say something like: Ring the dinner bell. OK, broadcasting now. So it's broadcasting from one device to the other device. It's dinner time. And so if you want to call your kids for dinner, all
the devices in your house can say it's time to come down. Exactly, exactly. So let me show you how that works. Notifications don't necessarily have to be between two cardboard boxes or two embedded devices. They can actually be between any two Google Assistants that are logged into the same account. So I could use my phone to ring the dinner bell on an embedded device. And so that's notifications. OK, so, rounding things out here, one of the features that I've been really excited about is what we're calling device
actions. When we initially launched the Google Assistant SDK, the feedback from the community was: this is great, this is awesome, I can build a Google Home clone. Now how do I make it do custom things? And it was nice, because that part of the community just got it; they understood what we were doing and where we should take it as well. And so this was one of our answers to that request, which is: OK, cool, we let you embed the Assistant, and now let's let you extend it to control that device. And this breaks down into two ways of
doing this. We call them built-in device actions and custom device actions. Just bear with me for a second; I'll go a little deeper into these. Built-in actions are built on top of grammars, things you can say to a device, that Google curates. So, a lot of the home automation: if you have a Nest device or a Philips Hue or a WeMo or whatever it may be, "turn on," "turn off," "turn down the temperature," "make it hotter." These are all grammars, again things you can say to a device, that we curate, and they're not static, they're dynamic. We can change them
over time. We can internationalize them on your behalf. So if you're building a device, you can leverage our built-in actions and know that the grammars that we have there will continue to grow. The story I like to tell folks is about when we were doing home automation. When we rolled out to the UK, we didn't see nearly the traction that we saw in the US when it comes to lighting, and we didn't know why. It turns out that a lot of people in the UK want to "pop on" and "pop off" the light. Not everybody, but there was a segment of the population that did, and "pop on" and "pop off" was not something
that we had known about. So we were able to do our due diligence and research and then change it, and we changed it on the back end, and none of our lighting partners had to do anything. It just magically started working for those UK customers. Should we have figured it out first? Maybe, but that's debatable. So that's one of the benefits of going the built-in route. Now, all that said, and while those devices and grammars and traits will evolve over time, again, we're not going to build every device, so we're not going to understand everything that you want to do on that
device. And so for that, we offer custom actions, where you as the developer, you as a device manufacturer, can provide the grammars and the command mappings to us. What you say is: these are the 10 things that people can say to this device model, and these are the structured intents, the actual commands, that should come back down to the device. And then you do the device drivers on there to actually, you know, do a dance, do the Macarena, whatever it may be on your device. It's kind of an escape hatch right now, so that you can have the flexibility and customization that you need to
have when building on your hardware. Thank you, Chris. Let me turn on my Bluetooth and turn on this robot, and I'll ask my favorite device: Connect to robot. Sorry, I can't help with that yet. Let me try again. Connect to robot. People who are natural body. OK, let me try this one more time. OK, let me show you how this works, and then I'll give that a shot in just a second. If we can switch to the slide. Thank you. This is again using the Google service, but we are implementing a custom device action, so I can say things like: How can I talk to robot?
And what happens is the Google service will understand my speech and send back a command to this AIY box. At that point, it will receive JSON that it can parse, and if the command is to connect to the robot, it sends a Bluetooth command to connect to the robot. It's using the same library that we've used in the past; we've added a little bit of code on top of it to implement the custom actions. Connect to robot. I can just keep talking. I'll try it one more time. OK? Connect to robot. Very nice.
OK. Set color red. So it can send the Bluetooth commands to do that. Robot, get up. So it's a self-balancing robot. Go forward. Forward. Turn left. Don't fall off the table. Turn right. Robot, turn right. And there we go. So you can see that we can control a robot with custom device actions. You can build an appliance or anything where you can set your own grammar and then parse the commands and have them do whatever you'd like them to do. Awesome. Good job.
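The device-side half of that demo, receiving JSON, parsing it, and turning each command into something the robot does, can be sketched roughly as follows. The payload shape, the `com.example.robot.*` command names, and the `FakeRobot` class are assumptions for illustration only; the talk does not show the actual JSON, and a real device would send Bluetooth commands where this stub just records calls.

```python
import json
from typing import Dict, List


def extract_commands(response_json: str) -> List[dict]:
    """Flatten the assumed payload into a list of {command, params} dicts."""
    payload = json.loads(response_json)
    commands = []
    for item in payload.get("commands", []):
        for execution in item.get("execution", []):
            commands.append({"command": execution["command"],
                             "params": execution.get("params", {})})
    return commands


class FakeRobot:
    """Stand-in for the Bluetooth robot link; it only records requests."""

    def __init__(self) -> None:
        self.log: List[str] = []

    def set_color(self, color: str) -> None:
        self.log.append(f"color:{color}")

    def turn(self, direction: str) -> None:
        self.log.append(f"turn:{direction}")


def dispatch(commands: List[dict], robot: FakeRobot) -> List[str]:
    """Run each known command on the robot; return the names of unknown ones."""
    handlers = {
        "com.example.robot.SetColor": lambda p: robot.set_color(p["color"]),
        "com.example.robot.Turn": lambda p: robot.turn(p["direction"]),
    }
    unknown = []
    for entry in commands:
        handler = handlers.get(entry["command"])
        if handler is None:
            unknown.append(entry["command"])
        else:
            handler(entry["params"])
    return unknown


# Hypothetical response, as if the user had said three things in a row.
sample = json.dumps({"commands": [{"execution": [
    {"command": "com.example.robot.SetColor", "params": {"color": "red"}},
    {"command": "com.example.robot.Turn", "params": {"direction": "right"}},
    {"command": "com.example.robot.Dance"},
]}]})

robot = FakeRobot()
unknown = dispatch(extract_commands(sample), robot)
print(robot.log)
print(unknown)
```

Keeping the command-to-handler mapping in one table means adding a new phrase on the service side only requires one new entry on the device.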
So yeah, let me show you very quickly what we're sending down in terms of custom device actions. First of all, you define these using a lot of the tools that you use for regular Assistant actions, such as the Actions on Google tools and also Dialogflow. What those will generate, in this case, is a JSON file that you can install onto your device. And when your device is talking directly to the Assistant, it's not like you have to say "open my robot app and tell it to turn right"; you can
simply say "turn right." So the first thing you want to define is the intents, the grammar: what you would say to make something happen. In this case, for example, when I said "set color red," here's the intent that would allow me to say "set color red," "set color to red," "robot color red." So there's a variety of different ways that you can say things. And then on the next slide you'll see what the response is: the fulfillment, the text to speech, in this case saying "Setting robot LED to red," and then the execution, where I can actually parse these parameters
and I can see that I'm setting the color to red. So that is the way that we define custom device actions. We simply parse that and then translate that into the commands to control the robot. Cool. Thanks. All right, so this is the slide where I tell you what I just told you, the recap. We're striving, again, for that ubiquitous Assistant experience in your life to help you out, and we know that it's fueled by a healthy ecosystem of developers. That's all of you in this room and everybody
watching on YouTube right now. Our goal, with Glen and the team that's helping us out, is to provide that software development kit, those tools and technologies, to help you build and embed the Assistant in the hardware devices that you're building. So with that, I think, if this works, you might have one more trick up your sleeve. Yes, we do. Do a dance. Robot is getting down on the dance floor. Thank you very much.