Mathieu is a software engineer working on the ART team. His specialties are garbage collection, memory management, and intermediate format optimization.
Calin Juravle is a member of the Android Runtime team, where he leads the effort on profile-guided optimizations. He earned a Master's degree from Utrecht University, where he graduated with honors.
About the talk
If you use the Java or Kotlin programming languages to develop on Android, Android Runtime (ART) is what ensures your code runs quickly and efficiently. Learn more about how ART makes it easier to write a great Android app with improvements in debugging and profiling, as well as in install and launch times. ART engineers will be on hand for a brief Q&A at the end.
Hello everyone. I'm Mathieu and this is my colleague Calin, and today we're going to go over what's new with the Android Runtime, also known as ART. So what is ART? ART is the software layer in between the application and the operating system. It provides a mechanism for executing Java language and Kotlin applications. To accomplish this, ART does two things. It executes dex files, the intermediate representation of Android applications, through a hybrid model
consisting of interpretation, just-in-time compilation, and profile-based ahead-of-time compilation. ART also does memory management for Android applications, with automatic reclamation through a garbage collector. This is a concurrent compacting garbage collector, so that applications have less jank. Now let's look at how ART has changed over the last few years. Over the years there have been many improvements to ART. In Nougat, we introduced profile-guided compilation to
improve application startup time, reduce memory usage, and reduce storage requirements. Also in Nougat, we added a JIT, just like Dalvik used to have. This was done to remove the need for the optimizing apps step, which was kind of a big problem during Android system updates. In Oreo, we added a new concurrent compacting garbage collector to reduce RAM requirements, have less jank, and accelerate allocations. As you can see here on the slide, this new garbage collector enabled a new
bump-pointer allocator that is 17 times faster than the allocator in Dalvik in KitKat. Now, we talked about what happened in the past, but what's new in Android P? First of all, there are new compiler optimizations to help accelerate the performance of Kotlin code on Android. This is especially important since Kotlin is a first-class programming language for Android development. Next up, we have memory and storage optimizations to help entry-level devices such as Android Go devices. This is important to help improve their
performance for the next billion users. And finally, we have cloud profiles. Device-collected profiles from the just-in-time compiler are uploaded and aggregated in the cloud to enable faster performance directly after installation of applications. Okay, so let's start with Kotlin. Last year, Kotlin was announced as a first-class, officially supported programming language for Android development, and then we began to investigate its performance. Why Kotlin, you may ask? Well, Kotlin is a safe, expressive, concise, object-oriented language
that is designed to be interoperable with the Java language. The reason ART focuses on optimizing Kotlin is so that developers can leverage all of these language features while still having fast and jank-free applications. Let's see how Kotlin optimizations are normally performed inside the Android Runtime. Usually optimizations are performed in an investigative manner, and there's an order of preference for fixing performance issues so that the largest number of Kotlin applications can actually benefit from the
optimization. The preferred option is fixing a performance issue inside kotlinc. kotlinc is the compiler developed by JetBrains. Google and JetBrains are, of course, working closely together on all kinds of optimizations and fixes for issues. If we fix a performance issue here, it can be deployed to the largest number of Kotlin applications. Alternatively, if that doesn't work, then we consider fixing the performance issue inside the bytecode converters. Fixing it in the bytecode converter will enable existing versions of the Android platform to get
the performance fix. And if that option doesn't work, the last option is to fix the performance issue in the Android Runtime, also known as ART. The reason we might not want to fix it in ART right away is that ART is updated as part of the Android platform, which means that not all devices will get the fix. Now, let's look at an example. One example of a Kotlin optimization is the parameter null check. As you can see here, this is a simple method. It just returns the length of a string, but the string is non-nullable.
So what this means is that the compiler inserts a null check into the function bytecode to actually verify that the string is not null, and to throw the corresponding exception if required. As implemented in the bytecode, the first step is loading the name of the parameter, and then invoking a separate function to do the actual null check. There's some extra overhead here, as you might see, because of the invocation: in the common case, you do the extra invocation that goes to the function that does the null check, and this function in turn, if required, calls the function that throws the actual parameter-is-null
exception. Checks like these are commonly required for Java language and Kotlin interoperability, because the Java language does not have a non-nullable property. Now, let's see how we can optimize this. If you look at the bytecode, one of the first things we can do is actually inline the method that does the null check into the caller. Inlining improves performance because there's one less invocation. And from here, you can see one other thing we can do: the name of the parameter is not actually required unless the argument is null. So from here, we can do
code sinking to move the loading of the parameter name inside of the conditional. Overall, these two optimizations help performance by removing one invocation and one loading of a string literal. Apart from this optimization, we also track Kotlin performance on various benchmarks. Other improvements here include improved auto-vectorization of loops, and also intrinsic methods that are specifically tailored for Kotlin code to help improve performance there. So the ART team is always working on improving this performance.
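The two steps described above, inlining the null check and sinking the load of the parameter name into the failing branch, can be sketched in Java source form. This is only an illustration of the shape of the transformation, not the actual bytecode or compiler output: the helper name `checkParameterIsNotNull`, the exception type, and the message text are assumptions made for the sketch.

```java
public class NullCheckSketch {
    // Hypothetical stand-in for the separate function that performs the null check.
    static void checkParameterIsNotNull(Object value, String paramName) {
        if (value == null) {
            throw new IllegalArgumentException(
                    "Parameter specified as non-null is null: " + paramName);
        }
    }

    // Before: the parameter name is loaded and the helper is invoked on every call,
    // even in the common case where the argument is not null.
    static int lengthBefore(String s) {
        checkParameterIsNotNull(s, "s");
        return s.length();
    }

    // After inlining and code sinking: no extra invocation in the common case,
    // and the string literal is only loaded on the failing path.
    static int lengthAfter(String s) {
        if (s == null) {
            String paramName = "s";
            throw new IllegalArgumentException(
                    "Parameter specified as non-null is null: " + paramName);
        }
        return s.length();
    }

    public static void main(String[] args) {
        System.out.println(lengthBefore("art"));
        System.out.println(lengthAfter("art"));
    }
}
```

Both versions behave the same for callers; the second simply does less work when the argument is valid, which is the common case.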
Okay, now that we're done with Kotlin, what about memory and storage improvements? Since ART is responsible for running Java language and Kotlin applications, it's also pretty important to make sure that programs don't use too much memory or take too much space on the device. There have been several improvements focusing on this area, including reducing the amount of space and memory required by dex files. Why are RAM and storage optimizations important? Well, recall that last year we introduced a new initiative called Android Go,
aiming at running the latest version of Android on entry-level devices. These devices typically have 1 GB of RAM and 8 GB or less of storage, so it's important to focus on optimizing these areas so that users can run and install more applications than they would otherwise be able to. Now, this isn't just for Android Go. Premium devices also benefit from optimizations in these two areas, but since they have more resources, it's normally to a lesser degree. Anyway, before we talk about
RAM and storage optimizations, let's do a little bit of a review of how applications work on your Android devices. An application normally comes in an application package kit, also known as an APK. Inside an APK there are usually one or more Dalvik executable files, also known as dex files, that contain instructions that are either interpreted or compiled to run your application. Since dex files are required to be quickly accessed during execution, they are mapped directly into memory during application startup so that ART can have quick access. This means that there is a
startup cost as well as a RAM cost proportional to the size of the dex file. Dex files are usually stored twice on the device. The first place they're stored is inside the application package kit. The second place they are stored is in an extracted form, so that ART can have faster access during application startup without needing to extract from the zip file each time. Now, let's take a closer look at the contents of dex files. What's in a dex file? There are several sections containing different types of data related to the application. But where is the space going in a dex
file? One way to find out is to calculate where the space is going for each dex file and average out the results. This chart here is for the top 99 most downloaded applications in the Play Store, and you can see that the largest section is the code item section, containing the dex instructions used by ART. The next largest section is the string data section, and this section contains the string literals loaded from code, method names, class names, and field names. Combined, these two sections are around 64% of the dex file, so they're pretty
important areas to optimize. Let's see if there's a way we can reduce the size of these sections. One new feature introduced in Android P is called CompactDex. The goal of CompactDex is simple: reduce the size of dex files to get memory and storage savings on the device. From the previous slide, we saw that some sections are larger than others, so it's important to focus on the large sections to get the most savings. The code items are more often deduplicated, and their headers are also shrunk to save space for each method,
especially inside of the application. Another thing worth noting about the string data is that large applications frequently ship multiple dex files in their APK because of dex format limitations. Specifically, the 64K method limit means that you can only have around 64,000 method references in a single dex file before you need to add another one to your application. And every time you add another dex file, this causes duplication, specifically of string data that could otherwise be stored only once. CompactDex shrinks this by providing deduplication across the dex files in the APK.
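To make the effect of cross-file deduplication concrete, here is a minimal sketch that compares the string bytes stored when every dex file keeps its own copy with the bytes stored when strings are kept once in a shared pool. It is not the CompactDex implementation: the class and method names are hypothetical, and strings are measured as plain UTF-8 bytes, which ignores the length prefixes and encoding details of the real dex format.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class SharedStringSketch {
    // Simplification: measure a string as its UTF-8 byte length.
    static int byteLength(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Each dex file stores its own string data, so a string used by
    // several files is stored once per file.
    static int separateSize(List<List<String>> dexFiles) {
        int total = 0;
        for (List<String> strings : dexFiles) {
            for (String s : new HashSet<>(strings)) {
                total += byteLength(s);
            }
        }
        return total;
    }

    // All dex files point into one shared section, so each distinct
    // string is stored only once.
    static int sharedSize(List<List<String>> dexFiles) {
        Set<String> shared = new HashSet<>();
        for (List<String> strings : dexFiles) {
            shared.addAll(strings);
        }
        int total = 0;
        for (String s : shared) {
            total += byteLength(s);
        }
        return total;
    }

    public static void main(String[] args) {
        List<List<String>> dexFiles = List.of(
                List.of("Ljava/lang/String;", "toString", "onCreate"),
                List.of("Ljava/lang/String;", "toString", "onResume"));
        System.out.println("separate: " + separateSize(dexFiles));
        System.out.println("shared:   " + sharedSize(dexFiles));
    }
}
```

The more dex files an APK ships, the more often common names such as framework type descriptors repeat, so the gap between the two totals tends to grow with the number of files.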
Now, let's go over the generation process. First, let's look at how dex files are processed on Android Oreo. The first step is run by dex2oat, the ahead-of-time compiler: the dex files are extracted from the APK and stored in a vdex container. The reason they are extracted, as I mentioned earlier, is so that they can be loaded more efficiently during application startup. One other thing worth noting here is the profile. The profile was introduced in Nougat and is essentially data about the application's execution, including what
methods are hot, and so compiled by the compiler, and what classes are loaded. In Oreo, we were already optimizing the dex files stored in the vdex container by applying layout optimizations, and we were also deciding which methods to compile based on the profile. Now, let's look at dex processing on Android P. In Android P, the ahead-of-time compiler now converts the dex files to a more efficient CompactDex representation inside of the vdex container. One new addition here is the introduction of a shared data section, specifically
so that data used by multiple dex files will be stored only once, so it's shared. One of the most commonly shared things here is the string data. This is how we can reduce the large string data section that we saw earlier. Finally, since the conversion is done automatically on device, all existing applications get the benefits of CompactDex without needing to recompile their APKs. Okay, so let's look at one example of how we actually shrink the dex code items. Apart from the instructions,
each code item has a 16-byte header, and most of the values in the header are usually small. So what we do here is shrink the fields in the header to four bits each, and then we have an optional preheader to extend them as required. The preheader is 0 bytes in most cases but can be up to 12 bytes in the worst case. Other than the preheader, we also shrink the instruction count. Since the average method is not going to be that large, we shrink this down to 11 bits instead of 32 bits and use the five remaining bits for flags that are
specific. Finally, we move the debug information into a separate, space-efficient table to help enable more deduplication of the code items. Overall, this optimization saves about 12 bytes per code item in the CompactDex file. And here are the results: for the top 99 most downloaded APKs, the average space required by the dex files on a device is around 11.6% smaller. And other than the storage savings, we also get memory savings, because the dex files are resident in memory during application
usage, or at least partially resident in memory. And one more thing here: let's go over the layout optimizations a little bit. Even though we had introduced JIT profiles in Android N, we did not have any layout optimizations back then. What this means is that the dex is kind of randomly ordered, disregarding usage patterns. In Android O, we added a type of layout optimization that groups together the methods used during application startup, and groups together the methods that are hot, meaning their code is frequently accessed during execution. This
seems like a pretty big win so far, so let's see what we did for Android P. In Android P, we have more flexible profile information, which enables us to put the methods that are used only during startup together. This helps reduce the amount of memory used, because the operating system can remove those pages from memory after startup. We also put the hot code together, since it's frequently accessed during execution. And finally, we put the code that is never touched at all during execution at the end, so that it is not loaded into memory
unless required. The reason that these layout optimizations are important is that they improve locality and reduce how many parts of the dex file are actually loaded into memory during application usage and startup. So if you improve the locality here, you can get startup benefits and a reduction in memory usage. And now, over to Calin for cloud profiles. Thanks, Mathieu. My name is Calin, and I'm here today to present how we plan to improve and scale up Android profiling.
However, before we start: profiling is a rather overloaded term. When we speak about profiling in today's presentation, we are talking about the profile information used for profile-guided optimization. We're going to see how we extend the on-device capabilities in order to drive performance right at install time. Before we jump into what's new and how it all works, let me briefly remind you how Android uses profile-guided optimization as part of a hybrid execution model. Hybrid means that the code being executed can
be in three different optimization states at the same time. The primary goal of this technique is to improve all key metrics of application performance. We're talking about faster application startup time, reduced memory footprint, a better user experience by providing less jank during usage, and better battery life, because we do heavy optimizations when the device is not in use rather than at use time. How does this work? It all starts when the Play Store installs the application. At first, we do a very, very light optimization and we have the application ready to go
for the user. At first launch, the application starts in interpretation mode. As the runtime executes the application code, it discovers the most frequently used methods, the most important methods to be optimized, and the JIT optimizes them. During this time, the JIT system also records what we call profile information. This profile information essentially encapsulates data about the methods that are being executed and about the classes that are being loaded. Every now and then, we dump this profile to disk so that we can reuse
it later. Once the device is not in use, in a state that we call idle maintenance mode, we're going to use that profile to drive profile-guided optimization. The result is an optimized app that will eventually replace the original state. Now, when the user relaunches the app, it will have a much better startup time and much better steady-state performance, and overall the battery will drain less. In this state, the application will be interpreted,
just-in-time compiled, or pre-optimized. Now, just how efficient is this technique? We gathered some data from the field for the Google Maps application. The left one presents data from a Marshmallow build: the startup time is pretty constant over time. It does not fluctuate. However, on the right-hand side, you can see that in Nougat the startup time drops over time. Eventually, it stabilizes at about 25% faster than it was at install time. And this is great news: the more the user uses the app, the more we can
optimize it, and over time the performance gets better and better. This is great, but we can do better, and we want to do better. We shouldn't need to wait for optimal performance. Our goal with cloud profiles is to deliver near-optimal performance right after install time, without having to wait for the application to be profiled. So, let's see how this is going to work. Let me introduce the idea of cloud profiles. It is based on two key
observations. The first one is that apps usually have many commonly used code paths that are shared between a multitude of users and devices, for example the classes loaded during startup, and that's valuable data for us to optimize on. Second, we know that most app developers roll out their apps incrementally, starting with alpha/beta channels or, for example, 1 to 2% of the user base. The key idea behind cloud profiles is to use this initial set of alpha/beta channel users to bootstrap performance for the rest of the users. So how does that work?
Once we have an initial set of devices, we're going to extract the profile information about your APK from those devices and upload it to Play. There we're going to combine everything, aggregate whatever comes in, and generate what we call a core application profile. This core profile will contain information that is relevant across all devices and executions, not just a single one. When a new device requests that application to be installed, we're going to
deliver this core profile alongside the main application APK to the device. Locally, that device will be able to use that data to perform profile-guided optimization right at install time, resulting in much better steady-state performance. Now, having the profiles in the cloud offers many more opportunities than directly influencing app performance with optimization. The core profile offers valuable data for developers to act upon. We believe there is a lot of information there that developers can use to tune their own
applications, and we are exploring how we can share this data later. Now, you can see in this workflow that to deliver such a thing we need support from both the Android platform and Play. In today's presentation, we're going to focus on the Android support. So what did we do in P to support this life cycle? We added new interfaces that allow us to extract the profile and bootstrap the information from the cloud. The functionality is available to system-level apps which acquire the necessary permissions, and
in our workflow, Play is just a consumer. The first API is about profile extraction, and it is exposed via a new platform manager called ArtManager. The second API is profile installation, and it is seamlessly integrated into the current installer session. What we did here is add a new kind of installation artifact that the platform understands. We call them dex metadata files. Essentially, in a similar way to APKs, the dex metadata files are archives which contain information on how the runtime can optimize the application. Initially, the dex metadata file will contain the core
profile that I mentioned earlier. At install time, Play will deliver these files, if they are available, to the device, where they go into the optimizer. It is worth mentioning that we'll offer support for Google Play Dynamic Delivery as well. So if you plan to split the functionality of your application into different APKs, all the APKs will have their own dex metadata files. So let's take a look at how everything fits together from the device's perspective. You remember that I presented this diagram in the beginning, showing how profiling works locally. Let's focus here
just on the profile file of an application. Once we manage to capture the profile file, we're going to upload this information to Play. Play, as I mentioned, aggregates this data with many other profiles and delivers a core profile to new users. The idea of the core profile is not to replace on-device profiling. It's only to bootstrap the profile-guided optimization. Instead of starting with a completely blank state about your application, we already know what the most commonly used code paths are, and
we will be able to start optimizing from there. So now we have essentially extended what was the pure on-device profile feedback loop. Now, I keep talking about this core profile, and I think it's important to dedicate a bit more attention to it. So let's see how we are going to build it. We already know that on a device, from one execution to the other, the profile aggregates quite well. It stabilizes pretty fast, so we do not optimize the application over and over and over again. It takes only a handful of optimization steps.
That is the data from one device, but how well does it work when you try to do it across devices? How many samples do you need in order to get a robust, reliable profile? We looked at our own Google applications and tried to figure that out. This plot represents the amount of information in the core profile relative to the total number of aggregations. The y-axis represents the amount of information; the actual measured value is not important there. What is important from this graph is that
from around 20 samples onward, the information in the profile reaches a plateau. This sends a very important message. It means that the alpha/beta channel users will provide us with enough data to build the core profile, and it means that the majority of the production users of your application will always have the best possible experience. So how do we actually aggregate the information? I mentioned before that in the profile you'll find information about classes and methods.
On device, this is roughly how it looks. We take all the executions that we have seen before and create the union of everything that was seen. In the application's profile, you have information about classes, about methods, about everything that was seen. In the cloud, however, we don't really want everything. We only want the commonly executed code. So what we do, instead of taking the union, is take a smart intersection, keeping only the information relevant to all executions.
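The difference between the on-device union and the cloud-side intersection can be sketched as follows. The talk does not spell out how the "smart intersection" is computed, so this sketch assumes one simple interpretation: keep an entry only if it appears in at least a given fraction of the uploaded profiles. The class name, method names, and the `minFraction` parameter are all hypothetical.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class ProfileAggregationSketch {
    // On-device style aggregation: keep everything that was ever seen.
    static Set<String> union(List<Set<String>> profiles) {
        Set<String> all = new TreeSet<>();
        for (Set<String> profile : profiles) {
            all.addAll(profile);
        }
        return all;
    }

    // Cloud style aggregation (assumed rule): keep only entries present in at
    // least minFraction of the profiles, which drops rarely seen outliers.
    static Set<String> coreProfile(List<Set<String>> profiles, double minFraction) {
        Map<String, Integer> counts = new HashMap<>();
        for (Set<String> profile : profiles) {
            for (String entry : profile) {
                counts.merge(entry, 1, Integer::sum);
            }
        }
        Set<String> core = new TreeSet<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            if (e.getValue() >= minFraction * profiles.size()) {
                core.add(e.getKey());
            }
        }
        return core;
    }

    public static void main(String[] args) {
        List<Set<String>> profiles = List.of(
                Set.of("Main.onCreate", "Main.render", "Settings.open"),
                Set.of("Main.onCreate", "Main.render", "Debug.dump"),
                Set.of("Main.onCreate", "Main.render"),
                Set.of("Main.onCreate", "Share.send"));
        System.out.println("union: " + union(profiles));
        System.out.println("core:  " + coreProfile(profiles, 0.75));
    }
}
```

With a threshold of 1.0 this degenerates into a strict intersection; a lower threshold tolerates a few devices that never ran a commonly used path.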
That means we throw out all the outliers. The result is what we call the core profile: it keeps only the most commonly seen samples, and this is what eventually gets delivered to the device. How well does this work? Let's take a look at some data captured from Google apps. We tested this across a variety of applications, and here are the results for some representative ones. You can find applications with relatively little Java code, for example Google Camera, or applications which
are much, much more Java-oriented, such as Google Maps. Google Camera, for example, gets a startup time improvement of about 12.6%, which is excellent, even though the application doesn't have a lot of Java code. For apps that are heavily Java-based, we can see that the optimization improves the startup time by about 28% or 43%. Across the board, you can see an average improvement of about 20%, and it obviously depends on what the application is doing, how much Java code is being used, and so on. Now, as I mentioned in the beginning, besides improving
application performance directly via profile-guided optimization, the profiles offer many more opportunities. I'm going to present a short case study and walk you through some important aspects that the profile can reveal about your application. To keep it short, I'm going to focus on a single question: how much of the code gets profiled? Let's take a look at some data. This again reflects the state of some Google apps during beta testing. We see that, on average, we profile about
14 to 15% of the code, and about 85% of the code remains unprofiled. When you look at the distribution, you can see, for example, that some apps have 5 to 10% of their code profiled, and some have 50% of the code profiled. And this is a rather intriguing result. The reason is that if the code is not profiled, it most likely means that it was not executed. Now, there can be good cases for that. I mean, the code might, for example, handle unexpected error code paths, right? We all want our apps to be reliable and robust, so the error handling must be there, and hopefully it never gets executed.
You may have backwards-compatibility code to support previous API levels, and so on. You might also have a lot of unnecessary code lying around, maybe from including libraries that you don't really use. Now, it's hard to break down the percentages for each of these categories, and there can be other reasons why we didn't profile the code, but the skewed distribution here is a strong indication that there is a lot of room for improvement for the APK. The code can be reorganized
or trimmed down for better efficiency. For example, Google Play introduced the Dynamic Delivery scheme, which may help you reduce the code that you ship by targeting features only to certain users, and that's something you might want to look at and take advantage of. So we believe that there is quite a bit of unnecessary code lying around, at least in our own case. Now, so far we have focused on the code that doesn't get profiled. Is there anything that you can extract out of the profiled code?
To understand that, let me talk a bit about the different categories of profiled code. When the application code is being profiled, the runtime will try to label it depending on its state. You have a label for the startup category, for the post-startup category, and for the hot category. The first two are pretty self-explanatory; the hot category is what the runtime deems to be the most important part of your code. It's important to keep in mind that these categories are not disjoint. For example, a method executed during startup can also be marked as hot, for example if you have a very heavy computation
method. Now, take the startup code: if you focus on that, you will be able to lower the startup time of your application, so that the first impression that users have of your application is very good. If you look at the post-startup code, it will help you, for example, lay out the application's dex bytecode better on devices. As for the hot code, this is the code that should get the most attention in your optimization efforts. It's the code that is most heavily optimized by the runtime. It
is marked that way because the runtime identified that it is very beneficial to invest time there. So if you want to, for example, try to improve the quality and performance of the app, this is where you should focus your effort first. Now, this raises an important question: how much of your application's code actually gets marked as hot? Because if everything is hot, then everything gets optimized, and the label is just not really useful. Let me show you the breakdown of the three categories. In this graph, you can see in the red column
the percentages for the profiled code. This is what I showed you earlier; it's here just for reference. The blue boxes show the percentages of the startup code, the post-startup code, and the hot code as relative percentages, so don't expect them to add up to 100%. Also, one piece of code can be in different categories at the same time. But you can see here that, on average, about 10% of the application dex bytecode is being marked as hot, and this indicates that when you focus on your app optimization, you can start with just a small part of your application code. Obviously, you should
spend time on all the other parts as well, but this is where you should start from. Let me go for a quick review of what we presented today and the main benefits. We started with Kotlin and described a few new compiler optimizations that we added that focus on Kotlin performance. We described briefly how we approach Kotlin optimizations, and that we first try to seek improvements in the Kotlin compiler. We moved on to memory and storage optimizations, and my colleague Mathieu introduced the concept of
CompactDex. This is a new dex format available just on device, and we spoke about its memory savings. And finally, I presented the idea of cloud profiles, and talked about how we can bootstrap profile-guided optimization using a small percentage of alpha/beta channel users in order to deliver important performance improvements right after install time for the majority of the production users. With this, I'd like to thank you for your attention and for being present. And I want to invite you all to the Android
Runtime office hours tomorrow, where we can answer any questions that you have about this presentation or about the runtime in general. We're going to be there at 5 in section A. Thank you so much.