Duration 40:29

ML Kit: Machine Learning SDK for mobile developers

Wei Chai
Principal Engineer and Tech Lead Manager at Google
+ 2 speakers
2018 Google I/O
May 9, 2018, Mountain View, USA

About speakers

Wei Chai
Principal Engineer and Tech Lead Manager at Google
Brahim Elbouchikhi
Product Manager at Google
Sachin Kotwani
Lead Product Manager at Google

Wei is a senior staff engineer and engineering manager on the Google Play and Android personalization team. Her main focus is personalized app recommendations for the Play Store and on-device machine learning (platform and applications) for Android. She is passionate about applying machine learning to real-world problems, whether in the cloud or on a mobile phone. Wei joined Google eight years ago after several years as a researcher in speech recognition, financial time series, and search ranking. She earned her P…


Brahim Elbouchikhi is a group product manager on the Android team. On Android, Brahim is responsible for developer- and consumer-facing machine learning products, including camera and developer SDKs. Prior to Android, Brahim led Daydream's software team. Brahim was also a founding product manager of the Google Play store, where he led monetization, search, and discovery. Brahim holds an MBA from the Stanford Graduate School of Business and a BS in Computer Science and Engineering from UCLA. Brahim has also wo…


Sachin is a product manager with a special passion for making software development easy and fun. He has worked on several teams at Google, including Google Cloud, Play, and now Firebase. Before joining product management he worked as a strategy & ops manager in Google's sales organization, and prior to Google he worked in finance at Amazon. He holds an MBA from Carnegie Mellon University, and dual bachelor's degrees in Business Management and Computer Science from the University of Missouri - Columbia.


About the talk

Machine learning (ML) on Firebase allows you to harness the power of ML without needing to be an expert in it. Leverage powerful but simple-to-use image recognition capabilities across a set of on-device and Cloud-based APIs, or for the more adventurous, upload your own custom on-device model.

Transcript

Hello everyone. My name is Brahim Elbouchikhi, and I'm a product manager on ML Kit. We as a team, and many of us are actually right here, are very excited to tell you about ML Kit. You've heard about it from Dave in the main keynote, and you've probably even looked at our documentation, but in this session we'll tell you a bit more, including some of the behind-the-scenes work we've been doing. So let me get started. I think it's important to look back a couple of years at what's been happening in machine learning, and I tried to quantify it the best way I could. Search interest is probably the most reliable source of data here, and it shows that over the past eight years there has been a 50x increase in interest in deep learning. Deep learning is not a new technology; it's been around since the 70s. What's changed is that we can finally deliver on its promise: we now have enough compute, memory, and power on these devices to run the models that have been developed over quite some time.

So I wanted to give you, as a product manager, a very simplified view of machine learning, a 101. Deep learning in particular uses layers that try to mimic how the brain functions: it activates different neurons based on the specific input it's getting, so it can ultimately arrive at an answer, whether this is a dog or a cat, or a hot dog or not. In this example, the output is a dog. What we had before were rule-based engines, where you had to configure the rules yourself: if this and this and this, then it's a dog. That simply does not scale. Deep learning has allowed us to tackle many more use cases and solve many more problems than we could before with rules alone.

In particular, over the past seven years or so, machines' ability to perceive the world around them has become incredibly good. In 2011, we had a 26% error rate in identifying the animal on the right as a cheetah. As of late, it's less than 3%, which is better than what a human can do, and that's pretty awesome. The fact that a machine can now perceive the world in that way opens up a lot of use cases. But of course, as a team, our mission is about bringing machine learning to mobile devices and mobile apps. So when we started researching this product, we went out and talked to many developers, internally at Google as well as externally, to understand what they're doing with machine learning on devices and how it works today.

I'm going to tell you a bit about that. One of the first teams we talked to works on Google Translate. Of course, you all know Google Translate because it's such a delightful and incredibly useful experience. Translate strings together multiple types of deep learning models and technologies to deliver that experience: it does on-screen character recognition to extract the text it's looking at, then the actual translation itself, and ultimately it can do text-to-speech to speak the result back to the user. We think that when you bring these pieces together, using machine learning in multiple places within a single experience where it's relevant, really cool things happen.

The next example is an app called Yousician. Yousician is a music learning app: it lets you play an actual analog instrument while the device listens and tries to interpret how well you're doing, just by listening to the specific notes you're playing as you play them. It's time-stamping them, it's doing noise cancellation, and finally it's personalizing the learning experience. All of this is done with an on-device machine learning model and a custom runtime to run the inference on device as efficiently as possible; this predated TensorFlow Lite and the other tools we now have that could have helped with that process.

The other folks we talked to, among a few others, were at Evernote. Evernote built a feature called Evernote Collect, and the insight behind it is that we now capture so much of our information visually: we take screenshots of things we care about, pictures of receipts, pictures of whiteboards after meetings, and then ask someone to transcribe them. Evernote tries to avoid that: it extracts the text and tags the image with that relevant text, so you can search for it, find it, and make it more useful. Isn't that super cool?

The overall theme we heard through many conversations was that machine learning is doable, but it's really hard, and it's hard for three specific reasons. The first is acquiring sufficient data, in both the quantity and the quality you need. Think about it: say you're training an OCR model. You can label the data for your own language; you could build a training set and label what the data says yourself, because you understand the language. But with a global audience, with users all over the world, how do you create an OCR model that actually works for all those languages? That's really hard. Or even harder: if you have a music learning app, you need to hire world-class musicians to record the perfect note so you can train against it, and that's expensive. The second is developing models that are optimized for mobile inference, and that has many dimensions: battery life, compute, and the size of the model. What I've learned in machine learning is that these are always trade-offs; you can improve on one and regress on another. It's just a really hard challenge to solve.

And the third is that good monitoring and experimentation are essential to machine learning; you can't do one without the other, yet there aren't very many tools today to help you with that. So we set out with ML Kit to address all of these issues. Of course, it's the beginning of a long road and journey, but we think we have some exciting stuff for you today.

I want to start by talking about our machine learning stack. At the very bottom of it, on the Android side, is the Android Neural Networks API, which we launched with Android 8.1. It's essentially a hardware acceleration interface, and since launch we've been working with hardware vendors to build drivers for the Neural Networks API. On Huawei P20 series devices, we're seeing a 10x improvement in inference latency with Inception V3. What's cool, if you don't know Inception V3, is that it's a really large model; it wasn't built for mobile devices at all, it was built for server-side inference. The fact that we can run that kind of model at that performance, quite efficiently, on a mobile device that's not connected to a power plant, is super exciting. It means we have a lot more headroom to do more with machine learning on a device. Of course, another dimension of this is models that are inherently built for mobile from the ground up, built to be highly efficient, and that work continues. As these two things pair up, we think there's going to be a lot of cool stuff happening here, so we're continuing to invest on the OS side.

The next layer is TensorFlow Lite, which was announced at I/O last year and shipped around November. It's a lightweight machine learning runtime and set of tools and libraries. It works on both mobile devices and embedded devices, and it was built from the ground up, as the name says, to be lightweight. I'm not going to steal the team's thunder: we have a session fully dedicated to TensorFlow Lite tomorrow, and I highly recommend going to see it if you're at all interested in on-device machine learning.
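(As a rough illustration of how these two layers fit together, here is a minimal sketch of running a bundled TensorFlow Lite model on Android with the NNAPI option switched on. This example is not from the talk: the file name, tensor shapes, and fallback behavior are assumptions, and the code only shows the general shape of the TensorFlow Lite Android interpreter API.)

    import org.tensorflow.lite.Interpreter
    import java.io.File

    // Load a bundled .tflite model and ask TensorFlow Lite to delegate work to the
    // Android Neural Networks API where a hardware driver is available.
    fun runTfLiteWithNnapi(modelFile: File, input: Array<FloatArray>): Array<FloatArray> {
        val options = Interpreter.Options().setUseNNAPI(true)  // assumed to fall back to CPU if no driver exists
        val interpreter = Interpreter(modelFile, options)
        val output = Array(1) { FloatArray(1001) }              // placeholder output shape for a classifier
        interpreter.run(input, output)                          // single-input, single-output inference
        interpreter.close()
        return output
    }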

Then we get to the application layer. This is where we looked around and said: there isn't a really easy way to access machine learning technologies at this layer. You either have to interface directly with the runtime and build your own models, or you really have to build your own stack as well. That's where ML Kit comes into the picture. ML Kit is in beta as of yesterday, so you can all go use it today, and it's essentially Google's machine learning SDK for mobile. Our aim is to bring Google's fifteen-plus years of experience in machine learning, and the technology we've developed along the way, to mobile developers through this SDK. So let me tell you more about it, and let me show you the stack again, now with ML Kit on top.

The first thing that's really important is that ML Kit works on both iOS and Android. That was really important to us, because when we talk to developers, they don't think about Android machine learning and iOS machine learning separately; they think about machine learning for their platforms, period. So it was important that we have a consistent SDK for both, and in fact every one of our features is available on both Android and iOS. We offer two broad buckets of features. The first is what we call base APIs, which are backed by Google models and satisfy your needs as a developer with no machine learning expertise involved. The second is a set of features that help you use your own custom trained models; I'll tell you a lot more about that in a bit.

ML Kit also offers both on-device and cloud-based APIs. On-device APIs give you real-time and offline capability, and they're free of charge, but they have limited accuracy in comparison to the cloud. We wanted to give you a consistent interface for the cloud APIs as well, because in many cases you do need that level of precision and scope, and we'll talk about the distinctions between the two in a little bit. Finally, ML Kit is deeply integrated into Firebase, and this was another really important point for us. We aim to make machine learning unexceptional: we don't want it to be special, we want it to be just another tool, like Analytics or Crashlytics or Performance Monitoring or Cloud Storage. Just as you use any of those parts of Firebase, we wanted machine learning to be right there. This also means it works well with the other features of Firebase, and we'll go into more detail about that in a few minutes.

So that's the high level about ML Kit. Which base APIs do we support today? First, text recognition, which is available both in the cloud and on device. The second is image labeling. Next are barcode scanning, face detection, and landmark recognition. The first four are all available on device, meaning you can use them for free, in real time and offline. We also have two super cool features coming up soon: one is a high-density face contour feature, with over a hundred points in real time, and the other is a smart reply API.
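(To make the base APIs concrete, here is a minimal sketch of calling on-device text recognition from a Bitmap on Android. The class names follow the beta-era ML Kit for Firebase SDK and may have changed since, so treat them as an assumption and check the current documentation.)

    import android.graphics.Bitmap
    import com.google.firebase.ml.vision.FirebaseVision
    import com.google.firebase.ml.vision.common.FirebaseVisionImage

    // On-device text recognition: runs offline and is free of charge.
    fun recognizeText(bitmap: Bitmap) {
        val image = FirebaseVisionImage.fromBitmap(bitmap)
        val detector = FirebaseVision.getInstance().visionTextDetector   // beta-era API name
        detector.detectInImage(image)
            .addOnSuccessListener { result ->
                for (block in result.blocks) {
                    println("Found text: ${block.text}")                 // one entry per detected text block
                }
            }
            .addOnFailureListener { e -> println("Text recognition failed: $e") }
    }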

The smart reply API is a good example of how things at Google work together. Think of Android P: there's a feature where you can now insert responses to messages directly from the notification shade, so ML Kit's smart reply API could be useful for populating those chips. It's the same technology used in Wear OS, Android Messages, and so on. That's really cool.

But of course, sometimes you simply need to build a custom model. If you're trying to detect a particular type of flower, it's hard to build a generic model for that; at the very least, you'd have to build one large enough to detect every flower, every dog, every species of everything. So sometimes you have to use custom models alongside the base APIs, and we wanted to help with that as well. The first feature we have here is dynamic model downloads. What this means is that you can upload your model to the Firebase console and have it served to your users dynamically; you don't have to bundle it into your APK. This has a bunch of benefits. First, it reduces your APK size: you no longer have to put that 5 or 10 megabyte model into the APK and take the hit in the app stores. Second, it lets you decouple the ML release process from the traditional software release process. We've learned that these are typically slightly different teams; the people building your models are probably not the same people building your core software experience, so the ability to deploy each at different times matters.

A really cool benefit of this is that you can now do A/B testing on different models with literally a single line of code, using Firebase Remote Config. This is the coolest part for me. If you were to do this today, or rather as of a couple of days ago before we launched, you'd have to bundle both models into your app, you'd be stuck with those two models for the duration of the app's lifecycle, and you'd have to upload all the metrics back yourself and do all of that work. This makes it trivial, and given how important it is to experiment in machine learning, we think it's a real game-changer for your ability to use machine learning models.

And finally, we talked about the optimization challenge of building models that are made for mobile. We're excited that we'll have a conversion feature coming soon that lets you convert and compress full TensorFlow models into lightweight TensorFlow Lite models; Wei is going to get on stage in a bit and tell you all about the magic and the technology behind that compression flow. So that's ML Kit, Google's machine learning SDK, available on Android and iOS. I want to take a moment, as always, to thank our partners.

We worked with everyone at these partners, and many more, to launch ML Kit. They worked through so many bugs and so many challenges and gave us so much feedback; the product wouldn't be where it is today without that help, so we want to thank them a lot. In particular, I want to highlight a couple of them. We worked with PicsArt, and PicsArt uses ML Kit to deploy a custom model for their magic effects experience. What's cool about PicsArt is that they're using ML Kit on both Android and iOS. We also worked with Intuit, and as you know, U.S. tax day is around April, so they were really pressed for time to get their feature out, yet we worked with them to integrate ML Kit in record time, which was super awesome as well.

All right, so before I hand it over to Sachin to tell you more about how ML Kit works, I wanted to make kind of a commitment to you: we're going to knock on the doors of Google's research teams, every single one of them, and ask them to bring their technologies to you as part of ML Kit. We're going to focus on vision, speech, and text models, and we'll also continue to make using custom models as easy as possible. With that, Sachin will come up and tell you more about how ML Kit works.

Thanks, Brahim. Hi everyone, my name is Sachin Kotwani, and I'm a product manager on Firebase. I was practicing my presentation at home with my three-year-old, and every time I finished she would say "again, again." I'm not sure if she was trying to tell me that I need to practice more or if she really enjoyed the content; I guess we'll find out. When we set out to build ML Kit, we had two main objectives. The first was to build something that's powerful and useful, and Brahim talked about that. The second was to make it fun and easy to use, and I'll show you a bit more about that.

If you've used Firebase before, you're probably already familiar with products like Cloud Storage, Realtime Database, Firestore, Remote Config, Crashlytics, Analytics, A/B Testing, and more. Now there's a new addition to the family, starting this week with our launch. If you head to the Firebase console, you'll see ML Kit in the left nav; clicking on it takes you to the main screen with the base APIs I just introduced. As you can see, they're mostly vision-focused for now, but we intend to add to them over time.

Let's look at one specific use case. To determine the content of an image, what themes or things are in it, I would use the image labeling API. As you can see, there are two icons here, which indicates that this API can run both on device and in the cloud. On device it's free and low latency, because everything runs on the phone, and it supports roughly 400-plus labels. If you need something more powerful, something that will give you higher-accuracy results, you would use the cloud-based API. That one is free for the first 1,000 API calls per month and paid after that, but it supports over 10,000 labels. Let's look at an example: if you feed this image to the on-device API, you get labels like fun, infrastructure, neon, person, and sky; if you feed it to the cloud API, you get ferris wheel, amusement park, and night. You can see that's more accurate, right? Okay. So remember I told you that it wasn't just powerful but also easy to use?

You should hold me to that. If I want to implement this API on iOS, I just include these three libraries in my Podfile; similarly, on Android, I add these three libraries to my build.gradle. Next, for on-device image labeling on iOS, I instantiate the label detector, call detect on it, get the results back in a handler, and iterate over the extracted labels. On Android the pattern is very similar: you instantiate the detector, call detect on an image, and on success handle the entities, as in the highlighted boxes in gray over there. That's on device. If I want to do the same thing but call the cloud API instead, not much changes; it's just a few class names. The pattern is the same: instantiate the detector, detect an image, and on success handle the extracted entities. Pretty cool.
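(Here is roughly what that on-device pattern looks like in Kotlin on Android. The class names are from the beta-era ML Kit for Firebase SDK, and the cloud variant only swaps in the cloud label detector, so treat the exact names as an assumption rather than the definitive API.)

    import android.graphics.Bitmap
    import com.google.firebase.ml.vision.FirebaseVision
    import com.google.firebase.ml.vision.common.FirebaseVisionImage

    // On-device image labeling: free, low latency, roughly 400+ labels.
    fun labelImage(bitmap: Bitmap) {
        val image = FirebaseVisionImage.fromBitmap(bitmap)
        val detector = FirebaseVision.getInstance().visionLabelDetector   // cloud variant: visionCloudLabelDetector
        detector.detectInImage(image)
            .addOnSuccessListener { labels ->
                for (label in labels) {
                    println("${label.label}: ${label.confidence}")        // e.g. "ferris wheel: 0.93" from the cloud API
                }
            }
            .addOnFailureListener { e -> println("Labeling failed: $e") }
    }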

All right, demo time. I was warned not to do a demo, but you know, my wife says I don't listen, so here's me not listening. Let's see if this works.

Image labeling is good for things like tagging photos, when you want to know the content of a still picture, but it's usually more fun to show a live demo with a live stream. So I'm going to point the camera at this toy car: vehicle, tire, bumper; you can see it's actually picking out all the pieces there. Let me now point it at the crowd: it sees chairs. Okay, now let's switch to face detection and point it at my face. As you can see, it finds a left eye and a right eye, and the numbers next to them are how open they are, so you can tell that I'm awake. Happiness is detected from the smile, so look at how that changes. And this works with multiple people too; you shouldn't just trust me, you should ask me to prove it, so I need a few volunteers here. You can see multiple faces detected, eyes open for everyone, smiles for everyone. Okay, I have a couple more things to show.
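(For reference, a rough sketch of the face detection call this demo exercises, with classification enabled so each face reports smile and eye-open probabilities. As before, the class names follow the beta SDK and are an assumption.)

    import android.graphics.Bitmap
    import com.google.firebase.ml.vision.FirebaseVision
    import com.google.firebase.ml.vision.common.FirebaseVisionImage
    import com.google.firebase.ml.vision.face.FirebaseVisionFaceDetectorOptions

    // Face detection with classification turned on, so each face reports the
    // smiling and eye-open probabilities shown in the demo.
    fun detectFaces(bitmap: Bitmap) {
        val options = FirebaseVisionFaceDetectorOptions.Builder()
            .setClassificationType(FirebaseVisionFaceDetectorOptions.ALL_CLASSIFICATIONS)
            .build()
        val detector = FirebaseVision.getInstance().getVisionFaceDetector(options)
        detector.detectInImage(FirebaseVisionImage.fromBitmap(bitmap))
            .addOnSuccessListener { faces ->
                for (face in faces) {
                    println("smiling=${face.smilingProbability} " +
                            "leftEyeOpen=${face.leftEyeOpenProbability} " +
                            "rightEyeOpen=${face.rightEyeOpenProbability}")
                }
            }
            .addOnFailureListener { e -> println("Face detection failed: $e") }
    }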

Lose It, as Brahim mentioned, is one of our partners, and they worked on this really cool feature. Let's say I'm logging what I had for breakfast. Normally you could select foods that are already in the application, or you could enter them manually, but let's say I want to enter a new food. Apparently this is not considered food, so I don't think it's in the application, but I'm going to try it. So like I said, you could enter it manually, but let's see what happens here: the detector was faster than I expected, so let me do that one more time. It recognizes the label on the package, and here's all the information: the calories, the saturated fats. Do you want one more? This is something that's not available yet, but I think it's pretty cool: this is our face contours demo. It detects over a hundred points, processing them at 60 frames per second, and you can see my lips, my eyes, the nose contour. This will be coming pretty soon; there will be a sign-up link if you're interested, and we look forward to getting it into your hands so you can play with it. All right, thank you.

So that was pretty cool, and hopefully you find those base APIs useful, but there are obviously use cases that are very specific to your application. Maybe you want to detect different types of flowers, or, like Yousician, maybe you want to extract a musical note from sound. For that you would want to use your own custom model, and there are three benefits to bringing custom models to ML Kit. The first is that the ML Kit SDK provides an API layer that interacts with your TensorFlow Lite model, so you can call it much as you would the base APIs. The second, which Brahim referred to earlier, is that you can upload your model to the console: you can still bundle your model with your application if you choose, but if it's big and you want to reduce the install size, you can leave it in the cloud and download it dynamically, so the initial install is smaller. And the third is that, because the model lives in the cloud, you can switch it dynamically; you don't have to submit a new APK or bundle to the App Store or the Play Store. Here's a quick snippet showing how you would load the model and refer to it. I called my model my_model_v1, so I would put this snippet in my app and it would retrieve that model from the cloud.
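(Sketched in Kotlin, registering and loading a hosted custom model looked roughly like this in the beta SDK. The model name my_model_v1 comes from the example above, but the exact class names and builder methods are assumptions based on the beta-era API, so check the current documentation.)

    import com.google.firebase.ml.custom.FirebaseModelDownloadConditions
    import com.google.firebase.ml.custom.FirebaseModelInterpreter
    import com.google.firebase.ml.custom.FirebaseModelManager
    import com.google.firebase.ml.custom.FirebaseModelOptions
    import com.google.firebase.ml.custom.model.FirebaseCloudModelSource

    // Register a model hosted in the Firebase console so it is downloaded
    // dynamically instead of being bundled into the APK.
    fun loadHostedModel(): FirebaseModelInterpreter? {
        val conditions = FirebaseModelDownloadConditions.Builder()
            .requireWifi()                                    // only fetch or update the model on Wi-Fi
            .build()
        val cloudSource = FirebaseCloudModelSource.Builder("my_model_v1")
            .enableModelUpdates(true)                         // pick up new versions published later
            .setInitialDownloadConditions(conditions)
            .setUpdatesDownloadConditions(conditions)
            .build()
        FirebaseModelManager.getInstance().registerCloudModelSource(cloudSource)

        val options = FirebaseModelOptions.Builder()
            .setCloudModelName("my_model_v1")
            .build()
        return FirebaseModelInterpreter.getInstance(options)  // run inference via FirebaseModelInputs
    }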

Now, let's step back. Remember I mentioned that there are a lot of other Firebase products that are very useful, and one of my favorites is Remote Config, which allows you to dynamically switch values inside your app. It's typically used for things like switching the color of a background or changing call-to-action strings, and it's really useful for that sort of thing, but it turns out it's also very useful for ML. So here's what I did: I created three different target populations, one for people who speak English, one for people who speak Spanish, and then a default, and I'm targeting a different model at each of those populations. Once you do that, instead of hard-coding a model name like the one here, you just swap that static string for a call to Remote Config, and then every device, depending on which population it belongs to, gets its respective model.
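(A minimal sketch of that Remote Config swap, assuming a parameter called ml_model_name, which is a made-up key, has been given a different value per audience in the console, and that fetching and activating the config happens elsewhere in the app.)

    import com.google.firebase.ml.custom.FirebaseModelInterpreter
    import com.google.firebase.ml.custom.FirebaseModelOptions
    import com.google.firebase.remoteconfig.FirebaseRemoteConfig

    // Instead of hard-coding "my_model_v1", read the model name from Remote Config,
    // so each audience (English, Spanish, default) gets the model targeted at it.
    fun interpreterFromRemoteConfig(): FirebaseModelInterpreter? {
        val modelName = FirebaseRemoteConfig.getInstance().getString("ml_model_name")
        val options = FirebaseModelOptions.Builder()
            .setCloudModelName(modelName)   // assumes the model was registered as in the previous snippet
            .build()
        return FirebaseModelInterpreter.getInstance(options)
    }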

This is just a very simple example; you can also layer on A/B testing and analytics so you can try out different models, see which one performs best, and choose that one. Experimentation, as Brahim said earlier, is important in machine learning. All right, before I wrap up and hand it over to Wei, I want to talk about the compression and conversion of TensorFlow models, because what you need to run on device is a TensorFlow Lite model.

We have a feature for that, and it's coming soon. As I said, you upload your TensorFlow model along with training data, and once it's done processing you get a set of TensorFlow Lite models to choose from. As you can see, these are compressed and have different inference latencies and different sizes, so you can take the one that suits your needs. This is currently only for image classification models, but we look forward to adding more in the future. Once you choose a model, you publish it, and then it's available just like any other custom model that you would upload on your own. I know I made this seem super easy, with a beautiful UI and just three steps, but it's actually really hard to do; it's an active area of research, and it's almost like magic. To tell you more about that magic, I'd like to invite our resident wizard, Wei, to come out on stage.

Hi, my name is Wei, and my team is actually right here. My team is a mix of machine learning experts and mobile developers, and it's been a lot of fun to be part of it and build something we all believe can be useful, for example model compression. Now I'd like to go deeper into the technology behind the magic, but first let me explain why we want to support model compression. When you run machine learning on the cloud versus on mobile, one big difference is that the mobile environment has very limited computational resources. This makes the model size and the inference speed extremely critical. With today's hardware limits, most mobile applications require very small models, ideally less than a couple of megabytes. On the other hand, if we look at the model architectures proposed by researchers for, say, image classification, they tend to go much deeper and larger to obtain higher accuracy, sometimes reaching hundreds of megabytes for certain applications. After talking to a lot of mobile developers, we realized that making machine learning models small and efficient enough to fit on mobile phones is one of the big pain points.

With ML Kit, we'd like to address this issue by providing model compression tooling and support. A model compression service takes a large pre-trained model as input and automatically generates models that are smaller in size, more power-efficient, and faster at inference, with minimal loss in accuracy. This is an active machine learning research area, and our compression service is based on the Learn2Compress technology developed by Google Research, which combines various state-of-the-art model compression techniques.

For example, one technique is pruning, which reduces the model size by removing the least contributing weights and operations from the model. We found that for certain on-device use cases, pruning can further reduce the model size by up to 2x without too much drop in accuracy. Another method is quantization, which reduces the number of bits used for the model weights and activations. For example, using 8-bit fixed point for weights and activations instead of floats can make the model run much faster, use lower power, and reduce the model size by 4x; with TensorFlow Lite, switching from MobileNet to a quantized MobileNet can speed up inference by 2x or more on a Pixel phone. The third technique trains a compact model, called a student model, with knowledge distilled from a large model, the teacher model. The student learns not only from the ground-truth labels but also from the teacher. Typically the student models are very small, with many fewer layers, and use more efficient operations for the benefit of inference speed. For image classification, for example, the student model can be chosen from MobileNet, NASNet, or other state-of-the-art architectures that are compact enough for mobile applications. We have further extended this distillation idea to simultaneously train the teacher model and multiple student models of different sizes in a single shot.

One thing to mention is that, very often, all of these techniques need a fine-tuning step for the best accuracy, so we need not only the original model for the compression process but also your training data. For ML Kit we will provide a cloud service for model compression. For now we only support image classification use cases, but we will expand to more. One reason we offer model compression as a cloud service is that this is still an active research area, with new techniques and new model architectures for mobile applications being invented very quickly; going from MobileNet V1 to V2 took less than one year, and our compression service will automatically incorporate the latest advances for you. Another reason is that the compression process typically takes quite a lot of computational resources, hours on GPUs, so we run the service on Google Cloud to use the computation power there.

What we need from developers is a pre-trained TensorFlow model, in SavedModel or checkpoint format, plus your training data in TensorFlow Example format for the fine-tuning step I just mentioned. What we generate is a set of models with different size and accuracy trade-offs for you to choose from. ML Kit runs on top of TensorFlow Lite, so all the generated models will already be in TensorFlow Lite format for you to download or serve through our model hosting service. With the ML Kit model compression service, we're aiming to compress a model to as much as 100 times smaller, depending on your use case and original model. To give you a real developer use case as an example: this is a fishing app. They already have a model to identify fish species, and that model currently runs in the cloud. With our model compression service, the original model provided by the developer, with roughly 92% top-3 accuracy, can be compressed into much smaller models with the different sizes and accuracies shown here. As you can see, in this particular case the accuracy of the generated models was even higher than the original model's; that's not always the case, but it's possible, and it's great.

To summarize, with ML Kit we'd like to make machine learning accessible to all mobile developers. To achieve that, we want to help with every single step of the machine learning workflow: not only how to use a model, but also how to build and optimize one. Now I'd like to conclude the talk with a summary of what we provide. We are launching in beta the base APIs for both iOS and Android, including text recognition, image labeling, barcode scanning, face detection, and landmark recognition. We're also supporting custom models with TensorFlow Lite model serving; please check out these features on the Firebase website. Meanwhile, we have a set of new features coming soon, including the high-density face contours API, the smart reply API, and the model compression and conversion service. We'll soon start to whitelist developers to try them out, so if you're interested, please use this link to sign up.

We are super excited about ML Kit and how it can help developers build cool machine learning features. We look forward to your feedback, and we're committed to making it great. Thanks for coming. If you have questions, we'll be available right after this talk at the Firebase Q&A area, and we also have related sessions and talks for you to check out. Finally, please leave your feedback about this session; we use it as fuel for the future. Thank you.
