About the talk
Presented by: Kangyi Zhang, Brijesh Krishnaswami, Joseph Paul Cohen, Brendan Duke
We will look at how you can use one of the pre-trained models the library provides, and we will delve deeper into the tech stack and roadmap. We will show you what the community is building with TensorFlow.js; there are some very impactful applications being built, and we would love to show you some examples, as well as some resources that you can start exploring. All right, so we have a lot of exciting content to cover, so let's get started. You may have heard an overview of the technology in the keynote this morning.
Okay, so this provides multiple starting points for your needs. You can use the library to directly run the off-the-shelf models it provides, and we will see a lot of these in a bit. You can use your existing Python TensorFlow model, with or without conversion depending on the platform you are running on. Or you can retrain an existing model with transfer learning and customize it to your dataset; this second starting point typically needs a smaller dataset for training.

TensorFlow.js runs on multiple platforms. Regular web apps and progressive web apps are covered; on mobile, it works inside mini-program platforms like WeChat; and we have just added first-class support for the React Native framework, so apps can seamlessly integrate TensorFlow.js. On the server side it runs in Node.js, and in addition you can target desktop applications using the Electron framework.

All right, so why would you run ML in the browser? We believe there are compelling reasons to run on a browser client, especially for model inference. Let's look at some of these reasons. First, there are no drivers and nothing to install: you include the library with a script tag or a package manager, and that's it. Second, the browser lets you utilize a variety of device inputs and sensors, such as the camera, microphone, and GPS, through standard web APIs and through a simplified set of tf.data APIs, and we are going to see some examples today. Third, TensorFlow.js lets you process data entirely on the client, which makes it a great choice for privacy-sensitive applications, and it avoids round-trip latency to the server.
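To make the sensor-access point concrete: before a camera frame reaches a model, its pixels have to be converted into normalized numeric input. TensorFlow.js handles this for you (tf.data and tf.browser.fromPixels), but a plain JavaScript sketch of the underlying idea, with a helper name of my own, looks like this:

```javascript
// Convert an RGBA pixel buffer (e.g. from canvas.getImageData().data)
// into grayscale floats normalized to [0, 1], the kind of tensor-ready
// input many image models expect. tf.browser.fromPixels / tf.data do
// the equivalent work internally; this helper just illustrates the idea.
function rgbaToNormalizedGray(rgba) {
  const out = new Float32Array(rgba.length / 4);
  for (let i = 0; i < out.length; i++) {
    const r = rgba[4 * i], g = rgba[4 * i + 1], b = rgba[4 * i + 2];
    // Standard luma weights, then scale 0-255 down to 0-1.
    out[i] = (0.299 * r + 0.587 * g + 0.114 * b) / 255;
  }
  return out;
}

// A 2-pixel frame: pure white and pure black (alpha ignored).
const frame = Uint8ClampedArray.from([255, 255, 255, 255, 0, 0, 0, 255]);
console.log(rgbaToNormalizedGray(frame)); // → Float32Array [1, 0]
```

In a real TF.js app you would not write this loop yourself; it is shown only to demystify what "preparing the data as tensors" means.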
Offline capability is also enabled, and these factors combine to make for a more fluid and interactive user experience. Running ML on the client also helps reduce server-side costs and simplifies your serving infrastructure. For example, no online ML serving that scales with increases in traffic is needed, because you are offloading all your compute to the client; you just host the ML model from a static file location, and that's it.

On the server side, there are also benefits to integrating TensorFlow.js into your Node.js environment. If you are using a Node-based serving stack, it lets you bring ML into the stack directly, as opposed to calling out to a Python-based stack, so you can unify your serving stack on Node.js. You can bring in not just the freely available off-the-shelf models but also your own custom models: models written in Python TensorFlow can be converted, and with an upcoming release you won't even need conversion to use them. And you can do all of this without sacrificing performance: you get CPU and GPU acceleration from the underlying TensorFlow C library, because that is what TensorFlow.js on Node uses, and we are also working on GPU acceleration via OpenGL, which removes the dependency on CUDA drivers. Effectively, you get performance similar to the Python library.

These attributes of the library enable a variety of use cases across a spectrum. On the client side of the spectrum are use cases that need high interactivity, like augmented reality applications and accessibility support. The server side of the spectrum gives you more traditional ML pipelines, that is, Enterprise-like use cases. In the middle, able to live on the server or on the client, are applications that do sentiment analysis, toxicity and abuse detection, conversational AI, and ML-assisted content authoring. So you get the flexibility of choosing where you want your ML to run: on the client, on the server, or either. And whatever your use case is, TensorFlow.js is production ready.

With that intro, I'd like to delve deeper into the ready-to-use models available in TensorFlow.js. Our collection of models has grown, and is growing, to address the use cases we just mentioned: image classification for classifying whole images, segmentation of objects and object boundaries, recognition of speech commands and common words, and text classification such as toxicity detection and sentiment of text. You can explore all of these models today on GitHub; you can use them by installing them with npm or by directly including them in your scripts.

This is the PoseNet model. It performs pose estimation by detecting 17 landmark points on the human body, and it supports both single-person and multi-person detection within an image. There are multiple versions of this model, some backed by MobileNet and some backed by ResNet, that provide options for balancing accuracy against model size and latency, depending on your needs. It enables use cases like gesture-based interaction, augmented reality, and animation, the kinds of things that need ML on the client. You can explore a demo of this particular model on the web.

Another human-centric model is the BodyPix model, which enables person segmentation in both single-person and multi-person scenarios. It identifies 24 body parts, such as left arm, right arm, torso, and left and right legs, and it also provides convenience APIs to segment and mark each body part in a different color, which is what you are seeing in this particular GIF. It can be used for things like removing people from an image or blurring the background, say, to protect privacy.

The COCO-SSD model enables object detection of multiple objects in an image, across the 90 classes defined in the COCO dataset. The nice thing is that it takes as input any browser-based image source, such as an image, video, or canvas element, and returns bounding boxes with the detected classes and confidences. Here is a sample image.
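As a concrete sketch of that detection API: the coco-ssd package resolves detect calls to an array of objects of the form {bbox: [x, y, width, height], class, score}. The browser calls below are shown as comments, and the filtering helper is my own illustration:

```javascript
// In the browser (sketch, not executed here):
//   const model = await cocoSsd.load();
//   const predictions = await model.detect(imageElement);
// predictions has the shape:
//   [{ bbox: [x, y, width, height], class: 'kite', score: 0.97 }, ...]

// Illustrative helper (mine): keep only confident detections and
// format them for display.
function confidentDetections(predictions, minScore = 0.5) {
  return predictions
    .filter((p) => p.score >= minScore)
    .map((p) => `${p.class} (${Math.round(p.score * 100)}%)`);
}

// Mock output standing in for model.detect() results:
const mockPredictions = [
  { bbox: [10, 20, 300, 150], class: 'kite', score: 0.97 },
  { bbox: [5, 5, 40, 40], class: 'bird', score: 0.21 },
];
console.log(confidentDetections(mockPredictions)); // → ['kite (97%)']
```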
You can see that the kite is detected with a high confidence score and a tight bounding box. What I'd like to show you here is how easy it is to use a model like this in your JavaScript app, without dealing with tensors, transformations, or layers. In your script you source the library and the model from a hosted location, or you can just npm install them. You load the model, you call the model's detect method on the image element you are trying to analyze, and that's it. What you get back is an array of objects with the bounding boxes and the classes that were detected. That's about five lines of code to end up leveraging a powerful ML model in the browser.

Text models are another useful set. TensorFlow.js has a toxicity detection model built on the Universal Sentence Encoder model. I want to show you a live example of this model in a web app. Give me a second. Okay, so here is a super simple web app that simply loads the model and lets me pass a few sentences to it, and what the model does is classify them on a few dimensions: does it signify an insult, is there toxicity, and so forth. So let's try an example. I'm going to type something I would think is toxic. That returns as toxic, and also returns as an insult. One thing I want to show you is that this model is context-based, not keyword-based: if you type the same word in a totally non-toxic context, you are going to see a different answer, not toxic and not an insult, and that's expected. This sort of model can be used on the client side in product reviews or in chat-type situations. Here is an example from a developer who integrated it into a chat app; it is able to detect a toxic message and flag it right before sending.

All right, another interesting model is facemesh. This model provides high-resolution tracking of facial features, detecting about 400 points on a person's face. We believe this model has great potential for real-world applications, for example detecting facial gestures and emotions, or supporting rich accessibility features. So I'd like to show you a couple of cool demos built using facemesh. You might have seen this in the keynote session this morning if you attended. This application, built by one of our partner teams at Google, is called LipSync. It is a game that tracks how well you are lip syncing a popular song, all in real time in the browser. Let's see the demo. Notice in particular how the display turns gray when the lip syncing doesn't match the lyrics, and how the score tracks whether the person is lip syncing correctly. This, as you can see, is the type of experience you can build entirely on the client using this library, and you can let your imagination go from there. The demo is available on the web, so you are welcome to try it and see how you do.

The next application I would like to highlight is a virtual makeup try-on app, again using facemesh. This is a mini app that the company ModiFace, a subsidiary of L'Oréal, has built for the WeChat platform. I would like to invite Brendan Duke from ModiFace to come on stage and show you what they have built.

Hi. So ModiFace is an augmented reality for beauty company, founded in 2007 in Toronto, Canada, and acquired just last year, one hundred percent, by L'Oréal. Today ModiFace collaborates with some twenty beauty brands, subsidiaries of L'Oréal such as Maybelline, and our technology is used by online retail giants such as Macy's, Sephora, and Amazon. Now I'm going to talk to you about how we made use of TensorFlow.js in our virtual try-on applications.

To introduce why we need a framework like TensorFlow.js, I'm going to use the WeChat mini program for makeup virtual try-on that we developed as an example, to showcase the kinds of challenges you run into when deploying real-time virtual try-on systems. First of all, we wanted to deploy our application on the client side, both for user privacy and to avoid the latency of doing a round trip to a back-end server every frame.
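A quick bit of arithmetic shows why that per-frame round trip is a problem. At a target frame rate, capture, inference, and rendering together get a fixed millisecond budget per frame; the helper below is my own illustration:

```javascript
// Milliseconds available per frame at a given frame rate.
const frameBudgetMs = (fps) => 1000 / fps;

// At 25 fps, each frame has a 40 ms budget for capture, inference,
// and rendering combined. Even an optimistic ~50 ms network round
// trip per frame would already blow the budget, which is why
// per-frame inference has to stay on the client.
console.log(frameBudgetMs(25)); // → 40
console.log(frameBudgetMs(60)); // roughly 16.7 ms at 60 fps
```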
Second, we needed GPU hardware acceleration. Third, WeChat mini programs have a tight limit on cumulative file size, so we needed a framework that is small and that allows us to deploy a small model, so everything loads as quickly as possible. Fourth, because our models have some custom operators, we needed a framework that is extensible with custom operators. And we also needed a framework that supports all the different mobile phone models that are supported by WeChat itself.

So now I'll tell you how TensorFlow.js fit the bill and was able to overcome these challenges in deploying our makeup virtual try-on. First of all, its WebGL backend is able to harness hardware acceleration, and it gives something like an order-of-magnitude speedup over browser-based CPU solutions such as WebAssembly, which right now is limited by the lack of SIMD and multi-threading; that definitely met our requirement. Second, the library is small and compact, coming in at about 700 kilobytes, and combined with our roughly one-megabyte model sizes we were able to fit everything within the WeChat mini program file-size limit. Third, built-in support for defining TensorFlow.js operators allowed us to extend it with the custom operators we needed for our face tracker. And finally, TensorFlow.js supports a wide variety of mobile phone models and has continuous support from the TensorFlow.js team. For these reasons we chose TensorFlow.js as the framework to deploy our makeup virtual try-on WeChat plugin.

So now I'd like to share with you some of our results. With the help of TensorFlow.js, we were able to successfully deploy to WeChat an easy-to-use, real-time system for AR virtual makeup try-on. The entire final solution fits into about 1.8 megabytes, including the code and our models, and on an iPhone XS, rendering and tracking together run at over 25 frames per second. So TensorFlow.js, coupled with our tiny CNNs, allowed us to deploy to web our fastest and smallest makeup virtual try-on system to date.

Now I'd like to briefly mention a few future research directions we have going on at ModiFace. We have already used TensorFlow.js to create web application demos for our makeup try-on, hair-color try-on, and nail-polish try-on. In particular, for the hair-color try-on we were able to achieve an order-of-magnitude speedup on a guided-filter operator that we use as a post-processing step, just by taking our WebAssembly implementation of the operator and porting it to TensorFlow.js, leading to a reduction of roughly 20 milliseconds in latency for the whole system. We also have a number of other research projects going on at ModiFace, such as style transfer, virtual aging simulation, and skin analysis, that we hope to use TensorFlow.js to deploy in the near future. Thank you everyone for listening.

Thank you, Brendan. Hi, my name is Kangyi, and I am a software engineer on the TensorFlow.js team. I want to walk you through the workflow of developing a TensorFlow.js application. We just saw the ModiFace lipstick virtual try-on, and I want to show you the details of building something similar: a sunglasses virtual try-on that uses augmented reality through the camera in a real web page.

For a sunglasses virtual try-on there are several components. First, we need a model that is trained to find the face. Second, the model needs to be loaded in the app to run. Third, the app needs to get video data from the camera, and some pre-processing is required so the video data is compatible with the model's input. And after the inference, we need some post-processing to use the model output.
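The post-processing component, turning predicted 3D key points into a placement for the glasses, is mostly geometry. Here is a hedged sketch; the helper, the choice of eye key points, and the scale factor are my own illustration rather than part of the facemesh API:

```javascript
// Given [x, y, z] key points for the two eyes (facemesh-style 3D
// landmarks), compute where to place a glasses sprite: its center,
// its width (scaled from the inter-eye distance), and its roll
// angle for head tilt. widthScale is an illustrative choice.
function glassesPlacement(leftEye, rightEye, widthScale = 2.2) {
  const [lx, ly] = leftEye;
  const [rx, ry] = rightEye;
  const dx = rx - lx;
  const dy = ry - ly;
  return {
    center: [(lx + rx) / 2, (ly + ry) / 2],
    width: Math.hypot(dx, dy) * widthScale,
    rollRadians: Math.atan2(dy, dx), // 0 when the eyes are level
  };
}

// Eyes level and 100 px apart, with a scale of 2:
console.log(glassesPlacement([100, 200, 0], [200, 200, 0], 2));
// → { center: [150, 200], width: 200, rollRadians: 0 }
```

In the real app, values like these would drive the position, scale, and rotation of the sunglasses object in the three.js scene.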
And the final component: the sunglasses need to be rendered based on the model output.

The first technical challenge is to get video data from the camera. TensorFlow.js provides the tf.data API, which enables developers to easily capture data from the webcam, the microphone, and from images, text, and CSV files. It also prepares the data as tensors, with customized configuration, so it is ready for the model. The second technical challenge is to detect the face in the camera feed and find key points on the face. Previously we saw the facemesh model, which is pre-trained to identify up to 400 facial key points in 3D coordinates; it is the right model for this task. The third technical challenge is to post-process the model output and render the sunglasses in the right place once we have the key points on the face. For this we use three.js, an open-source, cross-browser library used to create and display animated 3D graphics in a web app; its renderer is used to place the sunglasses onto the user's face.

So the full flow is: the tf.data API captures frames and converts them into tensors that can be consumed by the model; the facemesh model runs in the TensorFlow.js runtime to detect the face; we use the model output to position the sunglasses graphics in the right place; and finally we use three.js to render the sunglasses.

And now let's start coding. We start by loading the library and the facemesh model from our hosted scripts. We add a video element to hold the webcam feed and another container to hold the rendering output. Here is how we use the model. First, we load the model with an asynchronous call, and then we use the model to run inference on each frame; the inference output is a JSON object containing the facial key points in 3D coordinates. Next, we prepare the sunglasses image to be rendered on top of the camera video, displayed with three.js. We need a scene containing the sunglasses image, a camera viewing the scene, and a renderer, so that we can render the scene with the camera into the page. Finally, we create a loop through requestAnimationFrame: on each iteration it captures a frame from the camera, predicts the facial key points in the frame, and renders the sunglasses onto the video.

Let's see how the app finally looks. First, let me refresh the page so it reloads the model; it takes several seconds to load. Now you can see it shows the detected key points on my face, and it also renders the sunglasses.

This demo was built with a pre-trained model that we provide. If there is no pre-made model available for your task, we provide the Layers API, which is a Keras-compatible API for building models, and the lower-level Core (Ops) API if you need fine control of model architecture and execution. So let's see how to build and train a model from scratch with TensorFlow.js. The first step is to import the library. If you are working in Node.js, you can also use the tfjs-node library, which executes the TensorFlow operations using natively compiled C++ code, and if you are on a system that supports CUDA, you can import the tfjs-node-gpu library instead to get GPU acceleration for training or inference.

This is what a traditional convolutional model for image classification looks like, and as you can see it is very similar to Keras code in Python. We start by initiating a sequential model, in which the outputs of one layer flow as the inputs to the next layer. We then add a 2D convolutional layer and a max-pooling operation with their configuration, and finish the model definition by adding a flatten operation and a dense layer with the number of output classes. Once the model is defined, we compile the model to get it ready for training, providing a loss function and an optimizer. Then fit is the function that drives the training; it is an async function, so we await its result. Once the model is trained, we can save it. Here we save it to the browser's local storage, but we also support saving to a number of different destinations, such as a remote server.
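The layer shapes in a convolutional model like the one just described can be sanity-checked with the standard output-size formula. The concrete numbers below (a 28x28 input, a 5x5 valid convolution, 2x2 max pooling, 8 filters) are my illustrative assumptions, not figures from the talk:

```javascript
// Output side length of a 'valid' convolution or pooling step:
// floor((inputSize - windowSize) / stride) + 1.
const outSize = (input, window_, stride) =>
  Math.floor((input - window_) / stride) + 1;

// 28x28 input -> 5x5 conv, stride 1 -> 24x24 feature maps
const afterConv = outSize(28, 5, 1);
// 24x24 -> 2x2 max pool, stride 2 -> 12x12
const afterPool = outSize(afterConv, 2, 2);
// Units seen by the flatten layer, assuming 8 conv filters:
const flatUnits = afterPool * afterPool * 8;

console.log(afterConv, afterPool, flatUnits); // → 24 12 1152
```

Checking these numbers by hand is a quick way to confirm that the flatten layer feeds the dense output layer the size you expect before training starts.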
And finally, we can use model.predict to get results from the trained model.

Next, I want to show you the tech stack and upcoming features in TensorFlow.js. We provide three layers of API. The top layer is the pre-trained models, ready to use out of the box. In the middle we provide the Layers API, to easily build and train models. At the lower level we provide the Core (Ops) API, so users can have fine control over model architecture, or use it for linear algebra computation. On the client side, including browsers and mobile platforms, we use WebGL under the hood to accelerate operations; the library detects the availability of WebGL and automatically uses it. On the server side, in Node.js, we use the TensorFlow GPU and CPU C libraries under the hood, and we recently released support for a headless WebGL backend as well, which provides GPU acceleration without a dependency on CUDA. This is the core architecture of TensorFlow.js: we have multiple acceleration options to make machine learning operations fast on both the client and the server side. You can bring a Python Keras model and load it with the Layers API, or take a TensorFlow SavedModel and execute it with the Core API.

And this is the performance we see today. On the client side, on a laptop and on iPhone, the MobileNet inference time is comparable to TensorFlow Lite, and we are working hard to improve the performance on Android. On the server side, tfjs-node, which uses the TensorFlow C library, has inference time comparable to TensorFlow Python. We also just had an alpha release of React Native support.
You can use TensorFlow.js directly inside a React Native app, with WebGL acceleration, and load models in the same way as in the browser. Here is a React Native demo app performing style transfer on an image: first it takes the content picture, then the style image, and this is the final result.

And we are very excited to announce that Google Cloud AutoML now supports TensorFlow.js: you can train a custom model for image classification or object detection. All you need to do is upload the images and labels on the AutoML page, and it takes care of creating the best model for your training data and provides evaluation details. You can also choose whether you want higher accuracy or faster prediction. You can then export the model and use it in your application with the TensorFlow.js AutoML library, all through clicking in the Google Cloud Platform. And this is the sample code for using the exported model: we provide a model loader, and you can use it in the same way as all the other models we saw before.

As we mentioned, we are working on a WebAssembly backend, and it already shows impressive performance improvements. In the near future we will bring more pre-trained models based on real-world use cases, such as auto-reply and conversation understanding. We are bringing usability improvements to the server side, with native SavedModel execution without conversion, and we are developing new backends with WASM and WebGPU to improve performance in the browser. React Native support will have a full release soon.

The TensorFlow.js community is also building many inspiring applications using machine learning in the browser. Chester, a radiology assistant, analyzes chest X-rays and makes disease predictions inside a browser app. Let me invite Joseph Paul Cohen from Mila to come up on stage.

Great. So if we take a look at the traditional diagnostic pipeline, there is a certain area where physicians are already using web-based tools to help them make diagnostic decisions about a patient, for example for kidney donor risk or cardiovascular risk; these have been online as early as 2006. With the advances of deep learning in making diagnostic predictions from chest X-rays, the next step is to also put that online in a way that is usable by these doctors. You can imagine use cases for this in emergency rooms, where humans are time-limited: you want them to make fewer mistakes, especially when they are focused on one thing and not on something else that may be important but is not their immediate concern. There are rural hospitals all over the world that can access tools through the web but have no radiologist nearby to help make a decision; maybe the country does not even have remote resources to aid in making that decision. Using these tools could be the closest opportunity the physician has for a second opinion before they decide on a course of treatment. You can also imagine non-experts being able to triage cases for a physician, catching things like pneumonia or pneumothorax that should be brought to the attention of a physician immediately. Maybe there are two hundred cases waiting in the morning and six of them have really life-threatening findings; those should be looked at first. We can aid in this, as well as in identifying rare diseases, which is something we are still working on, but that is the kind of nice interaction this tool can offer.

So that is why we needed to put a chest X-ray tool in a browser. We could ship a desktop application, but that would take money, and we don't have any money; our university doesn't have any money for this, so we need to do this in a way that is free. We also can't pay for the computation of processing all these X-rays in some free web-based tool: we couldn't run a serving server that does the processing in a sustainable way, forever, without relying on donations. So instead we offload the compute to the user's device. Having physicians install software themselves from GitHub is probably not going to happen, so instead we deliver all the code in a web browser: absolutely no setup, and it runs on any device that has a web browser, essentially everything Chrome runs on, and it also works in Firefox and Safari. We are able to deliver this in a very elegant way without any setup at all. And we kind of have to give the tool away for free, because when we start charging money we go into the regulatory space.

Part of the reason we do this project in the first place is that physicians and radiologists are wary of these tools, because companies say they work really well when the performance is not a hundred percent. We should be really honest as researchers talking to physicians, and make sure they really know the extent of the power of these tools, so they can see how the tools can impact them; we want to bridge the gap and make sure people are not afraid of these tools. Getting these things in front of radiologists so they can just play with them is challenging; there is a lot of stuff in the way. The best way is to just give them a URL so that nothing stands in the way: no deployment hurdles, no red tape at the hospital, no money that needs to be paid to make this happen. That really enables this use case, and there is really no other way to do it, unless you get the doctor to sit down in your lab and you show them on your computer. So it is really a game changer. We can compute this in the browser in about one second once the model is loaded.

We also need to do out-of-distribution detection, which is an interesting challenge for matching the expectations of the physician and radiologist. We don't want to process images that are not frontal chest X-rays; we want to make sure only correct X-rays go through, so we maintain a certain level of accuracy. We do this with an autoencoder, also running in the browser.
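The autoencoder check described here boils down to thresholding reconstruction error: images like those the autoencoder was trained on (frontal chest X-rays) reconstruct well, and anything else reconstructs badly. A plain JavaScript sketch of that decision rule, with a threshold value that is purely illustrative:

```javascript
// Mean squared error between an input image and its autoencoder
// reconstruction (both as flat arrays of pixel values).
function mse(input, reconstruction) {
  let sum = 0;
  for (let i = 0; i < input.length; i++) {
    const d = input[i] - reconstruction[i];
    sum += d * d;
  }
  return sum / input.length;
}

// Accept the image only if the autoencoder reconstructs it well;
// a high error suggests it is out of distribution. The threshold
// here is illustrative, not the value the real app uses.
const inDistribution = (input, recon, threshold = 0.01) =>
  mse(input, recon) < threshold;

console.log(inDistribution([0.5, 0.6, 0.7], [0.51, 0.59, 0.7])); // → true
console.log(inDistribution([0.5, 0.6, 0.7], [0.9, 0.1, 0.2]));   // → false
```

In the real app the reconstruction itself comes from running the autoencoder with TensorFlow.js; only the final accept/reject comparison is this simple.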
To show a saliency map, we need to compute the gradients of the output with respect to the image pixels. We could ship two models, which would be the simplest way: one that predicts the actual pathology, and another that just computes the gradients with respect to the input image. But that is kind of annoying, so what we do instead is perform automatic differentiation in the browser to build a new graph which computes the gradients, which is kind of magical, and then we evaluate that graph to get the gradients. We do that with TensorFlow.js as well. Thank you.

Okay, thank you Joseph. Another example I want to show you, developed by the community, comes from a group at IBM, who gave a talk on this yesterday. They are developing a malaria parasite detection web app which runs an image classification model in the browser, so it is easy to deploy and can run offline in the field.

The library was launched last year, and this March we released version 1.0. As you can see here, adoption by the community has been impressive in downloads and usage. API documentation is provided, our code is fully open source and you can find it on GitHub, and if you have any questions you can email us at tensorflowjs@google.com. You can also try it out today. Thank you.