About the talk
ProGuard keep rules are the superpower in reducing application size. Correctly specified, they allow tools to remove unneeded code and obfuscate applications. But what exactly do these rules mean? This session provides an answer by diving deep into what happens inside the compiler based on those rules.
Hello everyone. Thanks for coming. I'm Stephen. I'm a software engineer at Google who works on compilers and runtime systems, and most recently I've been working on R8, which is Google's new shrinker for Android applications. Now, if you attended yesterday's session on the new compilers in Android Studio, you already heard a fair bit about what R8 is and how you can use it, so I don't want to talk too much about what it actually is. I want to focus more on keep rules. This is the configuration language that ProGuard uses to specify the things you
need to keep in your application. When we built R8, we decided we wanted to use the same language because we wanted it to be a drop-in replacement for ProGuard, the idea being that you could just reuse your existing rules and try R8 really easily. Now, as you might imagine, building something that is compatible to that degree required us to understand quite deeply what ProGuard rules actually mean, and today I want to share some of that knowledge with you. I want to take you on a little journey to understand what it really takes, given an application, to come up with effective keep rules to shrink
it to its minimum. But before we go there, let me ask you all a question. Who of you has used ProGuard before in some way or shape? Wow, that's a lot of people. And who of you is using ProGuard today in a released production app on the Play Store? That's a little less, but you are doing a great job. Thank you for doing that, because our estimates show that only about a quarter of applications from the Play Store actually use keep rules. So there's big room for improvement, and I want to talk about why it actually matters to shrink application size. So first, for me it matters
because I helped build a shrinker. All right, I'm interested that somebody uses it. But it should also matter for you. There's been a lot of talk about the next billion users, entry-level Android devices, and how we can make that user experience better. What makes an entry-level Android device? One thing is that it's really resource-limited. If you think of these devices, they typically have less than 512 megabytes of RAM, or they might only have 4 gigabytes of storage, and quite often people that use these devices are in areas where they have very limited connectivity.
So for these users, size actually makes a daily difference, right? When they decide, of all the apps that they actually want to use, which ones they can afford to install, that might be a small subset. That's bad for them because it's a bad user experience, but it's also really bad for you as application developers, because it might be your app that doesn't make the cut. It might be your app that they actually want to use but can't use because they don't have the space. Now you might say, okay, that's not really my audience. Next billion users, entry-level: that's not where I see my
application. But even if you target really high-end devices, the fundamental truth is that smaller is always faster. For one, apps download faster because they're smaller; there's less to transfer. They also install a lot faster because there's less code to compile, and that's what takes the time when installing. And lastly, and that's what users see every day, they also start up much faster because there's less code to load. So whenever I talk to people about this and say, okay, you have to care, you have to make them smaller, an answer I get really, really often is: yeah,
but you know, hardware will fix this, right? Devices are becoming faster, there's more storage, connectivity is better. That might be true, but it's not a solution. I have a graph for you that shows the average size of installed apps on people's devices. As you can see, this has been growing steadily since the early days of Android. We started at about a megabyte per APK, and by now we are at a whopping 32 megabytes on average for installed applications. So clearly hardware is not going to fix this. We have to do something about it. We have to make it smaller.
But let's look at this. Where does this bloat come from? Why are apps so big? To get an idea, I thought I'd bring you a little example. So I built this app, which I called Simple Weather, and the stress really is on simple, because all this app does is take some statically predefined weather data and render it as a dynamic graph. Dynamic graph means that if you turn the device, it will render at a different size, and if you install it on a different device, it will adapt to the resolution. So the graph is truly dynamic; the data isn't. Really, really simple. There are two things that I found surprising when I built this
app. The positive thing was that it took me only 30 minutes. I just went on the internet to look for a graph library, I typed in about a hundred lines of code, and that was my little Simple Weather app. But there was also a negative surprise, and that was the size. I thought I had done something simple and small, but that's not so: this app was 2 megabytes of APK. On the device, just for rendering a graph, it was a 4-megabyte block of code, because there it is uncompressed. And again, you might say, okay, 4
megabytes, that's not really much; devices have lots of space. Let me put that in perspective for you. If you think back to the sixties, we flew to the moon with 60K, which is a 60th of the application size I had for my Simple Weather here, and no matter how you turn it, it's strictly more complicated to fly to the moon than to render a graph. So what has happened here? Why did we go from 60K for flying to the moon to 4 megabytes for rendering a graph? I think the reason is that we fundamentally changed how we do software. So back then, how did they write the software for Apollo?
It was a dedicated team that wrote this code line by line. Everything was on purpose. Everything was handcrafted. It was really meticulously crafted to fit into the 60K and to do exactly one thing: fly to the moon. Fast forward to today. How did I build my app? I just used components. I just went to the web, found some components, stitched them together, and I had my app. There's a great advantage to this, right? It took me only 30 minutes. It was really easy. I was super productive. But there's also a big, big drawback: my application was really big. Let's look into the details.
Which components did I actually use? For that I brought you my build.gradle file, and this is essentially the default file that Android Studio will generate for you. All I've done is add two components. I highlighted them here so it's easier for you to see. The first one is Guava. Guava is Google's set of common components for software engineering in Java, and it has a lot of convenient classes. The thing I wanted was immutable collections, so I used an immutable collection. The other thing I use is Androidplot,
and it's just a library I found on the internet. There are probably lots more plotting libraries, but I spotted this one first. And Androidplot is really great. It has lots of support for bar graphs and all kinds of charts, but all I needed was a line graph. That's all I cared about. But let's look at what this means for size. How did these components impact the size of my application? To get an idea, I went to the APK Analyzer. If you haven't seen this before, it's really great. It's part of Android Studio, and you will find documentation on developer.android.com. It
gives you a deep insight into what contributes to the size of the APK. That is all kinds of things, but I'm only interested here in the actual code. So I highlighted that at the bottom there for you. There's this com.google package, which contributes 1.4 megabytes. So I'm paying 1.4 megabytes for an immutable collection. This might be a very extreme example, but there's something more realistic there as well, which is Androidplot. That's the second thing you see, and that takes 180K of APK size.
I know there will be something in there that I don't need, because my actual application, that's the eu part down there, is only 35K. That's the code I wrote. You might think 35K for a hundred lines of code is a bit much, and it is, because that also includes auto-generated code, most of which I actually don't need. This brings us to the question: how do I get away from this? You can clearly see that Guava and Androidplot take the majority of my APK size, and my own application is really small. How do I get to something that's more tailored, that's more like the thing that they did for Apollo? Of
course, you won't get something as crafted as that without making all the investment, but there must be some kind of middle ground, and that really is where a tool like ProGuard or R8 comes in, because one of the things it does is take your application and remove all the unused components. So the goal is to take an app that was built component-wise and make it into something as tailored as possible, and to remove all dead code. Now, I asked you, and you all said you have used this, so you will know it's really easy to enable, because all you have to do is head back to your
build.gradle file and flip this one flag. Android Studio even generates this for you: it will always say minifyEnabled false. You just flip it to true, and then you have a small app. I see some people are shaking their heads already, because of course that's not really the truth. So I did this for my application. I flipped the flag, and there were more than 200 build-time errors. Hooray. Okay, what are they trying to tell me? There are missing classes, classes I haven't even heard of.
What do I do? How do I fix this? Well, the first solution: you search on the internet. So I did, and I found this great piece of advice, which says just put -dontwarn * into your ProGuard configuration and everything will be fine. Now technically, this is correct: you put this in there and it will compile. But there's a problem. What this tells R8 is: no matter what happens, don't tell me about it. It doesn't only mask these benign errors; it will also mask all the errors you actually care about, where something went
wrong. How can we improve on this? To do that, we have to dive a bit deeper into how R8 works. I will talk about R8 because that's the tool I helped build, but most of this also applies to ProGuard because it solves the same problem. So what does R8 do? Two things. First, there's minification. Minification is the process where you take very long class names and replace them with very short class names instead. Some call this obfuscation, but it really doesn't obfuscate your code. It just makes it a little harder to read,
but it's nowhere near safe from reverse engineering. Minification, right, that's one thing, but I don't want to focus too much on that. The other thing it does is shrinking. Now, shrinking is a great name for this from a developer's point of view, because it takes your app and shrinks it into a smaller app. But if you want to understand what happens under the hood, it's actually the other way around, because what we're doing is tree growing: we take the entry points of your application and grow from there until we've seen everything that will be executed at runtime.
Let's look at an example here. I created this graphic, and in it a box is a class. That's all you have to care about for now. There's text in these boxes, but don't read it just yet. Another thing about this graphic is that everything on the right is library classes, that is, part of the Android system, and everything on the left is your application code. The first thing to realize here is that library classes are always live. There's a very practical reason for this: we can't shrink them away anyway. They're on the phone. They're part of the system.
But there's also a technical reason, because as a static analysis tool we don't know what these library classes actually do. The runtime might call into them at any point, there are many different libraries for different Android versions, and they might also change in the future. So from an analysis standpoint, we just have to assume that library classes are always live. Let's assume we start our app by calling the run method in the App class. The first thing we have to do for that is instantiate this App class,
which means we create an instance of the class App, and we also call its constructor. That leads to both of them being live, which means we can remove neither the class nor the constructor. So far so good. Next, we have to look at the code of the constructor to see what it will do at runtime. If you look at the code, you will see it creates a new instance of class A. So again, this makes class A live; we can no longer remove it. A doesn't actually declare a constructor, so there's no code to look at; it only has a default constructor that does
nothing interesting here. The other thing the constructor of the App class does is write the created instance to a field. It's interesting to note that this doesn't actually make that field live, because writing to a field is not observable. You have to actually read the field to know that the field exists. So now we've created an instance of the App class and executed the constructor. Next we want to call the run method, so the run method is live, and as with every other live method, we now have to look at it and see what the code actually does at runtime.
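As a rough illustration of the tree-growing idea in this walkthrough, here is a minimal worklist sketch. It is a toy model under stated assumptions, not how R8 is implemented: the names App, A, B and the string keys are hypothetical, and each method body is reduced to the list of items it makes live (a field write is simply left out, because only reads make a field live).

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy model of tree growing: start at the entry points and grow the live set. */
class TreeGrowingSketch {

    /** Hypothetical method bodies, reduced to the items each one makes live. */
    static final Map<String, List<String>> BODY = Map.of(
        // The App constructor instantiates A. It also writes a field,
        // but a write alone does not make the field live, so it is not listed.
        "App.<init>", List.of("A", "A.<init>"),
        // run() reads the field and then calls a method on the instance.
        "App.run", List.of("App.field", "A.method"),
        // Never referenced from live code, so this body is never scanned.
        "B.unused", List.of("B")
    );

    /** Returns everything reachable from the given entry points. */
    static Set<String> trace(List<String> entryPoints) {
        Set<String> live = new LinkedHashSet<>();
        Deque<String> worklist = new ArrayDeque<>(entryPoints);
        while (!worklist.isEmpty()) {
            String item = worklist.poll();
            if (!live.add(item)) {
                continue; // already seen
            }
            // Classes and fields have no body in this model; only methods do.
            worklist.addAll(BODY.getOrDefault(item, List.of()));
        }
        return live;
    }

    public static void main(String[] args) {
        System.out.println(trace(List.of("App", "App.<init>", "App.run")));
    }
}
```

Anything that never enters the live set, such as B in this toy model, is what a shrinker would be free to remove.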
If we take a look at run, you will see it first reads the field to retrieve the instance. And this is the moment where the field actually becomes live. Next, it calls a method on the retrieved instance, and that is where the method in class A becomes live. And now we would look at the code of that method to see what it does and what other effects it might have. So this is how the basic analysis flow works. Whenever you think about keep rules, you have to keep in mind that this is what the analysis engine does. Now, how does this relate to Android? How do we actually
know the entry points of an Android application? Well, that's found in the manifest file. This is a very simplified manifest file, and one thing it does is tell you that the activity has the class Graph as its implementation. Now neither R8 nor ProGuard understand manifest files, and they shouldn't have to, because there's a little tool that actually understands them, and that's called aapt. During a build, aapt will pre-process all your resources, and it generates corresponding keep rules for you. So let's
look at these keep rules. This is the first keep rule, so I will talk about it in detail. So what does it say? It says keep. That's the simplest form. It says: everything I mention, you have to keep. Don't touch it, don't shrink it, don't rename it. We want to keep a class, and then we have a fully qualified class name, which is my main class that came from the manifest. There's something more here, though. Just the class name would only keep the class; it wouldn't actually tell the system that this class is instantiated, which is a big difference. To tell R8 about that, we
also have to keep the constructors, and that is what the <init> line does. That <init> with the ellipsis as parameters tells R8 that we also need the constructors and that we will instantiate this class at runtime. So now R8 knows this class is live, and as I said before, it will look at the code. Let's go there. This is my class, and I removed all the function bodies. What you will see here is that it actually doesn't have a constructor. So our analysis ends right here: we've seen the class, there's no constructor, nothing to do. However,
anyone who has ever written an Android app knows that the actual meat happens in the onCreate method. That's where the actual configuration happens. So how does R8 know? The keep rule never told it to look at onCreate. The trick here is that Graph extends this AppCompatActivity class, and if you keep following this, you will see that it eventually extends Activity. Now Activity is a library class, and as I said before, library classes are always live. Hence, the onCreate
method is also live. And if onCreate in the library class is live, all its overrides in live subclasses also become live. That's another thing about the analysis you have to keep in mind if you want to understand how it actually works. So this is what makes onCreate live. Let's take a look. This is my onCreate method, and it looks like a standard onCreate method: first, some setup delegated to the superclass, and then this findViewById. This calls a library method, and what it does
at runtime is dynamically return an object from somewhere in your view, based on your layout. Again, R8 cannot really understand this, because this ID is defined in an XML file somewhere. This is my layout, and you can see it says there is this XYPlot class in my constraint layout and it has this ID, plot. How do we tell R8 about this? Again, aapt comes to the rescue and generates a keep rule. It looks similar, but what it tells R8 is that your layout uses this XYPlot class and that it will eventually be instantiated at runtime.
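For illustration, the two aapt-generated rules described here would look roughly like the fragment below. This is a sketch, not the output shown in the talk: the application package com.example.simpleweather is a hypothetical placeholder, and the exact comments aapt emits alongside each rule are omitted.

```proguard
# From the manifest: the activity class and its constructors
# (hypothetical package name; the real one comes from the app's manifest).
-keep class com.example.simpleweather.Graph { <init>(...); }

# From the layout XML: the view class referenced there will be
# instantiated at runtime when the layout is inflated.
-keep class com.androidplot.xy.XYPlot { <init>(...); }
```

In both rules, the class name alone keeps the class, and the <init>(...) member tells the shrinker that instances will be created.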
It's always the same principle. Another thing I want to highlight is that during this addSeries call, where we set up the plot, we use another identifier, which is this R.xml. R.xml is different: these are free-form XML files. You can just put them in the xml directory in your resources, and there are no requirements on their contents, so aapt cannot understand them. So if you use R.xml somewhere, you are responsible for all the keep rules that it may require. Just keep this in mind.
We'll come back to that later. So this is the basic analysis flow. Now you might ask: okay, 200 error messages, how does that relate? What went wrong there? This analysis looks reasonable, so why does it fail? The reason is that the analysis we do is different from what the VM does. One difference is annotations. The Android VM doesn't really care about annotations at all. They have no meaning at runtime unless you use reflection. So if an annotation class is missing, the Android VM will still just
execute the code, because it doesn't even look at them. In comparison, R8 has to understand annotation classes because they might be part of a keep rule. So R8 has to find these classes and has to understand the hierarchy, and if it can't, it will warn you about it. So a very common source of these warnings is missing annotation classes. The other thing is code that R8 just cannot understand. Here is an example, which is ClassValue. ClassValue is a concept from Java 7, but it's not available on the Android platform. So this code will actually fail at runtime if
executed; it will tell you this class is missing. Why does it still work? Because the creators of Guava use this nifty trick here to hide the missing class. What they do is load the class via reflection, and if that fails, they fall back to some alternative implementation. Now the Android VM will understand this: at runtime the code throws an exception, the exception gets caught, and the alternative is executed. But R8 cannot understand this code. It's just too complicated to analyze statically. And to fix these errors, you really just have to look at
all these examples and find where they came from. In the end, this all distills down to these five keep rules, or rather warning rules, that you have to add. The first three just disable warnings about certain annotations from the Checker Framework and Error Prone; those are static analysis frameworks that are not used at runtime at all. The bottom two are two classes that are not available on the Android platform, and they are typically used via some reflective wrappers to make this work at runtime. Adding these five rules, we get our application to compile.
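The slide with the five rules is not part of this transcript, so the fragment below is only a plausible reconstruction of their shape. Of the patterns shown, only java.lang.ClassValue is named in the talk; the annotation package patterns and sun.misc.Unsafe are assumptions for illustration.

```proguard
# Annotation-only dependencies from static analysis frameworks
# (package patterns are assumed, not taken from the talk).
-dontwarn org.checkerframework.**
-dontwarn afu.org.checkerframework.**
-dontwarn com.google.errorprone.annotations.**

# Classes that are absent on Android and only reached through
# reflective fallbacks (the second entry is an assumption).
-dontwarn java.lang.ClassValue
-dontwarn sun.misc.Unsafe
```

Unlike a blanket -dontwarn *, targeted rules like these silence only the warnings that have been investigated and leave all other diagnostics visible.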
That wasn't bad, right? We looked a bit at the code, put together five rules, and we have a small application. Unfortunately, the runtime behavior of my application has changed just ever so slightly, because now it crashes. And that's the other problem you typically see. Again, what happened here? I explained to you how the analysis works, and that looked fairly reasonable. The typical problem is reflection, because by its nature reflection is about using a dynamic runtime value to load a method or class.
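To make the point about dynamic values concrete, here is a minimal sketch of loading a class by a runtime name and falling back when it is missing, the same general shape as the trick described earlier for ClassValue. It is not Guava's actual code; the class and method names are hypothetical.

```java
/** Sketch of "try the class reflectively, fall back if it is missing". */
class ReflectiveFallbackSketch {

    /** Returns a description of which path was taken for the given class name. */
    static String loadOrFallback(String className) {
        try {
            // The class name is a runtime value: a static analyzer only sees a String.
            Class<?> clazz = Class.forName(className);
            return "using " + clazz.getName();
        } catch (ClassNotFoundException e) {
            // At runtime, the missing class simply triggers the alternative path.
            return "fallback implementation";
        }
    }

    public static void main(String[] args) {
        System.out.println(loadOrFallback("java.lang.String"));
        System.out.println(loadOrFallback("com.example.DoesNotExist"));
    }
}
```

A VM handles both calls without trouble, whereas a static tool cannot tell from the code alone which classes the String argument may name.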
And static analysis just can't understand this. Dynamic values are the enemy of static analysis. So how do we fix this? We have to somehow figure out how to tell R8 about these cases of reflection and make R8 understand them, and that's really what keep rules do. So what did I do? I went to the internet. Fortunately, the developers of Androidplot put up this rule on the internet, which says keep class, so okay, we are going to keep some classes: com.androidplot.**. What this means is that it tells R8 to essentially
not touch Androidplot at all. Again, this will fix the problem: the app will run again, but it will no longer shrink. That cannot be the point of using R8. So how can we improve on this? There's really no clever way of doing this other than going on a forensic investigation. We have to find out where all this reflection is happening and what we have to add to the R8 configuration to make R8 understand it. So where do we look? Where do we find evidence? The first place to look is the ADB log. So I
processed this a bit here so that it's easier to read, but essentially you go into Android Studio and use Logcat, or you can filter by process ID, and this is what you will see. There will be this log statement saying "styleable definition not found for a". As such, this is not really helpful, because I don't understand what it is trying to tell me, but it is really great because it gives me a place to look in the source code. So this is a big piece of advice: if you write these libraries and you do reflection, put in logging statements. It's not important that people actually understand
the message. It's much more important that people can find where the statement was generated. Because I have this logging statement there, I can now look at the code and see what it's doing. So here is Plot.java, and as you can see down there, it logs that the styleable definition was not found. So what is going wrong here? As you can see, this uses reflection. There's this styleable class, on which getField is called. The styleable class is a class object, and getField will get a field from that class, given the name styleableName. So this is an example of
reflection on a class. How is styleableName defined? Because that's clearly what is going wrong: it's trying to find a field called a, and that seems a strange field name. This is how styleableName is defined: it is defined by means of getClass and getName. So we have to understand what this does. What does getClass do? getClass is again a reflective invocation; it returns what in Java is called the current class. Now, we are in this Plot file, in the Plot class. The current class can be the Plot class itself, but it can also be any of its subclasses, because at runtime this method might
run in a different context depending on how the virtual dispatch works. And now we get the name of this class, and this seems to return a. Why did they name the class a? Well, they didn't. The problem is minification. R8 went ahead, saw this class, and thought: Plot is a long name, let's call it a. So to fix this, we have to prevent R8 from renaming this class or any of its subclasses. And this is the corresponding keep rule. So what does it say? It says keep for the Plot class. That
means R8 should keep the class: not rename it, not optimize it. But as I said, we have to keep all the subclasses as well, so we have to say keep class * extends the Androidplot Plot class. This will keep all subclasses of Plot and the Plot class itself. But is this actually what we want? If you think about it, getClass at runtime returns the class of the current object. Now, a class can only be the current class if it has actually been instantiated. If a class never gets created, there's no way of getting it via
getClass. So our keep rules do not actually have to keep extra classes. All we want is to prevent them from being renamed, and that's where modifiers come in. So here is a modifier for you. The rule still says keep, but it says allowshrinking. This tells R8: if you see this Plot class, you are allowed to remove it if nobody uses it, but if you keep it, don't rename it and don't optimize it. And this will fix our problem, because now the Plot class at runtime will actually still be called Plot, as will the other subclasses. So that was not too
bad. You look a bit at the code, you come up with the keep rule, and the application will run. Or not. It's still not working. We have to do more forensic work. What do we do? We go back to the ADB log, and the message has changed. We now see a different exception. Again, it's not really clear what this is trying to tell me, but I have an exception I can look for. So this is "error while parsing key linePaint.strokeWidth". Okay, why does this happen? How do we find out? We look at the code, and here is the corresponding method,
and I highlighted the reflective uses in there for you. You can see we take a class and get all its methods. Next, we compare the name of each method against some given name we're looking for. So there are two things that can go wrong here. We're getting the set of all methods, so we might have removed too many methods. And we compare the names of these methods, so we might have renamed them. So these are the two error conditions we have to check: whether we are removing and whether we are renaming something that we should not. I have to admit, it's kind of hard to figure out what this code really does
unless you are the library developer. The person who wrote this code initially knows perfectly clearly what it is doing, and at that point it would have been really easy to write these keep rules. So what does this do? Do you remember this? When we do this addSeries, we configure what the series is supposed to look like, and Androidplot has this really great feature where you can tell it to configure your graph based on some XML file. And this is what this XML file looks like. As you can see in there, you will find this linePaint.strokeWidth.
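A minimal sketch of the kind of reflective lookup described here may help. It is not Androidplot's code: LineStyle, findByName, and apply are hypothetical names, and the attribute handling is simplified to a single float property.

```java
import java.lang.reflect.Method;

/** Hypothetical stand-in for an object that is configured from XML attributes. */
class LineStyle {
    private float strokeWidth;
    public float getStrokeWidth() { return strokeWidth; }
    public void setStrokeWidth(float width) { this.strokeWidth = width; }
}

class ReflectiveConfiguratorSketch {

    /** Scans all public methods and compares their names against the wanted one. */
    static Method findByName(Class<?> clazz, String name) {
        for (Method m : clazz.getMethods()) {
            if (m.getName().equals(name)) {
                return m;
            }
        }
        // Reached if the method was removed (shrinking) or renamed (minification).
        throw new IllegalStateException("No method named " + name + " on " + clazz.getName());
    }

    /** Applies one attribute, such as strokeWidth, by calling the matching setter. */
    static void apply(Object target, String attribute, float value) {
        String setter = "set" + Character.toUpperCase(attribute.charAt(0)) + attribute.substring(1);
        try {
            findByName(target.getClass(), setter).invoke(target, value);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Error while applying " + attribute, e);
        }
    }

    public static void main(String[] args) {
        LineStyle style = new LineStyle();
        apply(style, "strokeWidth", 3f);
        System.out.println(style.getStrokeWidth());
    }
}
```

The setter name exists only as a string built at runtime, so nothing in the code tells a static analysis that setStrokeWidth is used. If the method is removed or renamed, findByName fails exactly as in the two error conditions above.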
What this library will do is take this XML file, go through all the attributes in there, and then call the corresponding getters and setters on the object it's trying to configure. So it would take a graph object, call the getLinePaint getter on it, and then set the strokeWidth property. This is a very standard pattern for configuring something at runtime, and it's a really great feature. But if R8 sees this, it can't understand it, because R8 can't make the connection between this XML file and the actual classes. Also, since this is free-form
XML, we don't have aapt to help us. Instead, we have to do this ourselves. And this is the corresponding keep rule. What do we need to do? We don't want to keep any extra classes, because what we're trying to do here is take an object that already exists at runtime and configure it by calling getters and setters. So that's why we use keepclassmembers. That doesn't keep any extra classes, but it tells R8: if you're already keeping a class in the com.androidplot package, also keep these members and don't rename them. Which members do we want to
keep? First of all, we want to keep getters. What do getters look like? They return some result, that's the three stars; their names start with get; and they typically have no arguments. So the first line in there will keep the getters. Similarly, we can keep setters: they don't return anything, their names start with set, and they take a single argument of some type. This rule will keep the getters and setters, so at runtime the configuration can just happen. So, one more keep rule. Do you think it will work now? Let's take a look.
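Putting the two Androidplot rules from this walkthrough side by side, they would look roughly as follows. This is a reconstruction from the spoken description rather than a copy of the slides; the fully qualified name com.androidplot.Plot is assumed from the library's package.

```proguard
# Keep the names of Plot and its subclasses for getClass().getName(),
# but still allow removal of classes that are never used.
-keep,allowshrinking class * extends com.androidplot.Plot

# For classes that are kept anyway, keep getters and setters
# so the XML-driven configuration can find them by name.
-keepclassmembers class com.androidplot.** {
    *** get*();
    void set*(***);
}
```

Neither rule forces an otherwise unused class into the app, which is what distinguishes them from the blanket rule that keeps everything in com.androidplot.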
And yes, we made it. We added a couple of keep rules, and now our application is actually running again. But was it worth it? This was a bit of an investment: we had to look at the code and figure out what it actually does. Was this journey worthwhile? Let's go back to the APK Analyzer. Here you can see the results. If you look at com.google, which is Guava, that went from 1.3 megabytes to just 8K. I know that's very extreme, because I'm essentially using a couple of classes from this huge collection. But also for Androidplot, which is maybe
more realistic, you can see that it went from 180K to about a hundred K. That saving is more than one entire Apollo. Also, if you look at my app, you can see that it went from 35K to 2K, and 2K is a lot closer to the hundred lines of code I actually wrote, because we removed all the unneeded auto-generated parts that the build system created for us. I created this graphic for you to make this a bit more visual: Guava all but disappeared, and my app also turned into this little sliver. So it was
a bit of a journey, but it really, really paid off. So what are the takeaway lessons here? I hope I was able to convince you that it actually makes sense to look into size. If you build an app, no matter what your target audience is, please invest in size. Please invest in creating keep rules. Please use ProGuard or R8 to shrink your app. But I also hope I've shown you some ideas for how to make this easier. The first thing to take away is that you should consider size early on, because while you're writing your code, it is really, really easy
to write your keep rules, because you still understand what that code is actually doing. For all the code examples we looked at, if you had just written them, it would be easy to understand why this goes wrong. Also, you should add structure to your code to ease describing reflective uses. Do you remember the getters and setters example we had, where I said: okay, for all classes in Androidplot, keep the getters and setters? These kinds of things are much simpler if you actually have some kind of interface that allows you to tie these up. So if you had an interface, say, a runtime-configured
object, you could just say in a keep rule: for every object that extends this runtime-configured interface, keep these getters and setters. That is independent of the actual application, so it can just be part of the library. And lastly, this sounds obvious, but it's really important: you should continuously test the optimized build. Again, the earlier you find regressions in your build, the easier it is to fix them, because you will still remember what you actually changed, and that will make it easy to come up with keep rules. If you are a library developer, you should really,
really carefully provide keep rules, because there's this multiplier: if you make precise keep rules, all the users of the library will benefit from it and a lot of apps will become smaller. Don't make this an afterthought; invest in keep rules while you build your library, while you design your library. Make sure people can find them, because that's typically what we all do: we search the internet for rules. So use the homepage, put them in a file, make them visible. And lastly, consider using consumer ProGuard files when you're shipping via the AAR system,
because this makes it completely transparent to your library users: when they enable ProGuard, they will automatically get your keep rules. Lastly, please give us feedback. We built R8, we've tested it, and we believe it's a drop-in replacement, but only you can actually find that out. So if you're using ProGuard today, give R8 a try and tell us how it works for you. If you're not using ProGuard, try out R8 and see how far you can get with shrinking and how good our diagnostics are. We really care about your feedback. And lastly, after this talk, you can also see