I am the CTO at CultureHQ. Our startup is on a journey to create meaningful connections between employees and provide companies with new analytical insights that will transform the workplace for the better. Before joining the CultureHQ team, I spent five years working at start-ups in the Boston area. During that time I gained a broad knowledge of multiple engineering disciplines from industry leaders.
About the talk
RailsConf 2019 - Pre-evaluation in Ruby by Kevin Deisz
Ruby is historically difficult to optimize due to features that improve flexibility and productivity at the cost of performance. Techniques like Ruby's new JIT compiler and deoptimization code help, but still are limited by techniques like monkey-patching and binding inspection.
Pre-evaluation is another optimization technique that works based on user-defined contracts and assumptions. Users can opt in to optimizations by limiting their use of Ruby's features and thereby allowing further compiler work.
In this talk we'll look at how pre-evaluation works, and what benefits it enables.
Okay. Okay. I think we can get started. So hi, thank you for coming to my TED Talk. This is Pre-evaluation in Ruby. My name is Kevin Deisz. I'm @kddeisz on the internet. You can follow me on Twitter. Please follow me on Twitter. I give terrible hot takes on new Ruby stuff. I work at CultureHQ. That is a small, three-person company in Boston. We're focused on improving culture in the workplace. If you're interested in improving the culture of your workplace, you should come talk to me. So first I want to start off with a warning.
This is a very technical talk, but that is not meant to scare you. The function of me here is to relay this information without completely blowing anything out of the water or alienating anyone. So if you are an absolute beginner, I hope that you will find value in this talk regardless of the technical content. You know, this is difficult stuff, so it should be somewhat difficult to understand, but if you leave without any value at the end, then I've officially failed. So if you have any questions, again, come talk to me afterward. First
we're going to talk about compilers, specifically Ruby's compiler and the various steps that it takes, and then we're going to talk about extending that compilation process and the value that we can derive from that extension. So when we talk about compilers, we typically talk about a couple of steps. First there's the lexical analysis step, then semantic analysis, then instruction generation for the virtual machine (that was new starting in 1.9), then various optimization passes, and we're going to talk a lot about that. So we're going to talk about lexical analysis.
What we mean is taking a sentence like this sentence: "Matz is nice so we are nice." I know there's supposed to be an "and" in there, but it fits nicer on a slide. So anyway, taking this kind of sentence and breaking it up into individual tokens that we can then apply a grammar to. So if we're going to look at this, we have to split it up. Here we have our individual tokens. This is just splitting on whitespace for now, and we need to understand what parts of speech these things are. Right? This is still the
process that other programming languages will go through, but for our purposes we're writing an English language compiler, which is what your brain is doing in your head. So, okay, we have nouns. We understand that, conceptually, "Matz" is a noun, "is" is the "to be" verb, "nice" is an adjective, "so" is a conjunction, and so on and so forth. We have a period at the end. So these are our tokens. Okay, great. So we've gotten this far, and this is a little bit like regex, and semantically we understand it. This is kind of what we're looking at: we're breaking things up into
patterns. Semantic analysis is the process of taking those kinds of tokens, applying a grammar, and deriving some kind of semantic meaning from those individual tokens. Take the verb and the adjective. We need a name for this section. We need new names, so we're going to call it a verb phrase, and the pattern is a "to be" verb followed by an adjective. And so there we go. Okay. Now we have a verb phrase. We have these two little trees and you can say, okay, that's a verb phrase, that's a verb phrase. Now we go and we add "Matz" and "we". We have nouns, so we
need a name for this kind of thing as well. And so we can go into our grammar and extend it and say, okay, let's have a noun and then a verb phrase, and that's going to be called a subject phrase. Now, here's a little disclaimer: I don't actually remember very much about grammar at all. I have no idea what is a subject and what is an object. I don't know any of those things, but this is what we're going to call it for this part. So this is a subject phrase. Great. We've got two subject phrases. So then we have a "so" in the middle. I looked this one up; turns out that is called a subordinating
conjunction. Also, someone came up to me after the last time I gave this talk and told me that's actually an adverb. I don't care. You're still going to get this. I just don't care; that's not the purpose of this talk. So thank you very much. And so we add it in, and we have a subordinating conjunction. Great. Okay. The last token we have is a period. We're going to say, okay, for the purposes of our grammar, a sentence is always a subject phrase, a subordinating conjunction (say that three times), another subject phrase, and then a period. Great, we have a sentence, and if you look at this, this is a tree. It's relatively abstract.
And it represents syntax. So we're going to, just off the top of my head, call that an abstract syntax tree. Amazing. Great, we have an AST, and we have derived this from plain text. We've gotten tokens. We've applied our grammar. And you'll see that this is very similar to what is already being done in a lot of places in the world, in our apps, in our code. This is an example from Racc, which is a recursive descent something something parser thing that Ruby uses, and this is an example of a calculator, right? We have
various expressions: an expression on the left side, an expression on the right side, and a plus in the middle, and that creates another expression by adding the values together. This kind of grammar is also used inside of Ruby, in parse.y. This is what is used to generate the parser that performs the semantic analysis for us to understand Ruby programs. It is also the part that does the arithmetic. Super great. So we've used our grammar to generate an abstract syntax tree. We need to walk that tree and generate instructions for the virtual machine. So we have our tree. And what we need to do
is all of those blue nodes, all those blue nodes that aren't the light blue, the dark blue nodes, all those dark blue nodes represent something that we're doing. We're doing something within our virtual machine to manipulate the state. So if we look at just the bottom, just the verb phrases, what we're really doing is pushing an attribute. This is all going to be very not scientific; it doesn't actually make any sense. But it's okay. We're pushing an attribute. That $2 is meant to represent the second item of that little
pattern. We're pushing an attribute onto the stack. Great, okay. For our subject phrases we're going to say, okay, we're going to set an attribute on that noun, but pop the attribute that we just pushed onto the stack. Okay. So we're saying for the verb and adjective "is nice" that we have "nice" as an adjective and we know we need to apply that to something. When we get up to the subject phrase, we pop that off and apply it to the noun, so we have "Matz" and "nice": Matz is nice. Great. Okay. Now we have our subordinating conjunction. I just love saying that word. This is basically
an if statement, right? If you really think about it, this is an if statement. This is saying the right side is only true if the left side is true. Great. So if statements in virtual machine instructions are often represented by jumps, right? You put a label somewhere, you jump, and you skip over some instructions if something doesn't resolve to true. So we're going to conditionally skip a couple of steps. Great. Finally, we have our period. And we're just going to say, okay, we're going to trace the execution of that
sentence and we're good to go. We have generated our machine instructions, and we're going to do that just by walking the tree. Okay, we start to eliminate things from the tree. This is all very not scientific at all. Totally fine. And we're going to pretend that this is like pseudocode for a virtual machine, so that you understand what I'm writing: a compiler for English. This is not real, but that's okay, because we have our instructions. Great. Let's optimize those instructions. Now, you might say that there are no optimizations that are available here. Technically, our
grammar could support us saying Matz is not nice. No, he's a nice guy. What we're really saying is that this is a constant part of the virtual machine, a constant part of the set of instructions. Although yes, technically the grammar could support us saying Matz is not nice, we know he is, and so we're just going to eliminate that and move it up. Great. We've eliminated dead code or unnecessary checks or whatever it is that you want to call it, and we have fewer lines to execute. Our code is
more efficient. Everyone is happy. "We are nice." That is what remains after you take away the left side of that statement. All right, great. So, let's look at Ruby. Let's look at what happens. Okay: if 5 + 3 is 8, then puts "hello world". You would think this is always true. And for the most part you are correct. If we go and look at the instructions that are generated for this, we see a whole bunch of stuff. Let's look at what this is doing. This is putting 5 on the stack and it's putting 3 onto the stack, and then it's,
you know, adding them together, and that's great. I can tell you right now that will always be 8. I'll also tell you I'm lying. So if you're not too familiar, 5 + 3 in Ruby breaks down to 5, send the method plus with whatever is on the right side. So let's call that 8 for now. If we go further along, we can also see opt_eq. Okay, 8 is always 8, so let's just put true. If you look at the code, that's what this breaks down to. And so with "if true" we really can always just get rid of that. It's
entirely just puts "hello world". Except when someone tweets something like this. And then someone writes this. And now we're eval-ing stuff off of Twitter. So, you know, we've really gone to the dark side. But you know, now what happens when you put 5 + 3? You get 2, all because I decided to tweet. The thing is, we have to go back, of course, because the grammar allows for writing that method. So okay, I guess we'll go back here. This is deoptimization, the fancy way of saying: okay, well, technically
this is this value, but if someone does something stupid then we need to go take care of it again. Damn. We have that instruction, but I don't want that instruction in there. That's a problem. So the real question I want people to be asking is: why would you ever do this? I just don't like you. Okay, so raise your hand if you have ever purposefully monkey-patched an arithmetic operator on a core class in a production application. Yeah, that's what I thought. The thing is, we are dumbing down
our compiler. We are, as a community, dumbing down our compiler to support this use case. We are not making this optimization because technically someone could be stupid and do that. So I wrote a gem called Preval. And Preval is a little gem that assumes you're not stupid. So, okay. What we're going to do is go through that same process wherein we derive semantic meaning from a string in Ruby, and then we're going to do a couple more things before we send it back to Ruby. So Ripper is a built-in library in Ruby; it ships with Ruby, so you all have it. You
can require ripper, and Ripper will give you the abstract syntax tree. That's great; the first two steps are taken care of for us. We get this. It's a bunch of arrays. We're going to do stuff to arrays. That's what we're doing today. So if we go into the Preval source, this is kind of a dumbed-down version. But basically what we're going to do is loop over all of the different types of nodes, and we're going to build our own nodes that we can then manipulate. We descend from SexpBuilder. S-expression builder has some basis in computer science. I don't know these things,
and then we loop over each of the events. We define a method that will handle them, and we create a whole bunch of nodes. And what we need is a way to go from the abstract syntax tree back to source. We have a way to go from source to abstract syntax tree; we need a way to go back to source. So we have this to_source method and we define this module, Format, which is just a little bit of code. It basically takes every single possible node type in Ruby and converts it back. This
is not new; this has been done before. It was done in the gem Sorcerer, which I found out about literally a couple hours ago. But this idea of transforming abstract syntax trees back into source is not new, right? People have been doing this for a long time for formatters, auto-formatters specifically. If you look at Prettier, and this is where I tell you that I'm incredibly biased because I wrote the Ruby plugin for Prettier, it does the same thing. It takes an abstract syntax
tree, converts it into an intermediate representation, and then prints it out. The thing is, this is not pretty Ruby. This is uglier Ruby. If you look at this, it's horrendous. It's actually really fun. Basically, everything is hash rockets, which makes me cry, and, you know, there are parentheses all over the place and odd spacing. You never want to look at it, but it doesn't matter because no one will ever actually see this part. Now, in order to really go into this process, what we're going to do is parse that source. Parser is just
the s-expression builder, really; it just stems from Ripper. And what we're going to do is just loop through all of the different visitors that we've built to go and manipulate that abstract syntax tree before we convert it back to source. And so all we have as a public API is Preval.process, which is processing a string. And again, we're just going through the same process that we did right at the beginning: taking a string of whatever, applying lexical analysis to get our tokens, applying semantic analysis to get our tree, manipulating the tree through optimizations, and passing it back to Ruby.
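The first three steps of that pipeline can be tried directly in MRI. What follows is a minimal sketch that uses only the standard library's Ripper and RubyVM::InstructionSequence; it is not Preval's code, and the variable names are made up for illustration.

```ruby
require "ripper"

source = "5 + 3"

# Lexical analysis: Ripper.lex returns one entry per token. Keep only the
# token type and its text, dropping the position (and lexer state, if any).
tokens = Ripper.lex(source).map { |token| [token[1], token[2]] }

# Semantic analysis: Ripper.sexp returns the tree as nested arrays.
tree = Ripper.sexp(source)

# Instruction generation: compile the same source and disassemble it
# (MRI only; the exact listing differs between Ruby versions).
instructions = RubyVM::InstructionSequence.compile(source).disasm

p tokens
p tree
puts instructions
```

The tree printed here is the "bunch of arrays" mentioned above: a :program node wrapping a :binary node with two :@int leaves.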
So if you're in the know, this is what we're really doing: we're visiting each of those nodes, and we can build a visitor that does something as simple as this. If the left side is an integer and the right side is an integer, we can just go ahead and do that. We're going to operate under the assumption that people aren't monkey-patching things they shouldn't be monkey-patching anyway. I did actually have someone raise their hand, like, two times ago that I gave this talk, and I was like, why would you do that? But like, I don't know, it was sad
and they were just ashamed, and you know what? I'm okay shaming those people. I really am. I just don't mind so much. Okay, so I'm going to do a quick demo. There is a small Rack application built into the gem. You can see it and everything. So if we just have something like "a = 1", it's just going to print out "a = 1". That's great. What's funny is that even if you do this, right, indentation is totally not understood. It doesn't matter; again, no one's actually looking at this. The thing is, technically, if you have an attr_reader,
which is a method that is named after the instance variable that it is returning, it is more efficient to call attr_reader than it is to explicitly define this method. So it's just going to do that. It's just going to do that automatically. Now technically, technically someone could have monkey-patched the attr_reader method. I don't care about those people. I just don't. Well, technically, if you change the value here, you can see it switch: it's not an attr_reader anymore. And if you set it equal to value, then it's going to
maintain it. But if it's a setter, it's going to turn it into an attr_writer. The left side is the code that you are writing, the right side is the code that Ruby sees, and this is more efficient. This is better, and this is doing it without a senior dev yelling at you in a PR. Even better, there are certain other things you can do, like while true. Technically loop do is more efficient. I don't actually know why; I don't actually care. And while false just goes away.
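To picture the "left side" and "right side" of the demo, here is a small before-and-after sketch. The class names Written and Rewritten and the count_to method are hypothetical, and the second class only approximates what a rewrite like this would hand to Ruby; it is not output captured from Preval.

```ruby
# The code as you might write it.
class Written
  def initialize(value)
    @value = value
  end

  # An explicit reader: returns the instance variable of the same name.
  def value
    @value
  end

  def count_to(limit)
    count = 0
    while true
      count += 1
      break if count >= limit
    end
    count
  end
end

# Roughly the code Ruby would see after the rewrite.
class Rewritten
  def initialize(value)
    @value = value
  end

  # Same behavior as the explicit reader above.
  attr_reader :value

  def count_to(limit)
    count = 0
    loop do
      count += 1
      break if count >= limit
    end
    count
  end
end

p Written.new(5).value == Rewritten.new(5).value
p Written.new(5).count_to(3) == Rewritten.new(5).count_to(3)
```

The two loop forms are not identical in every program: loop takes a block, so variables first assigned inside it are local to the block, and it rescues StopIteration. That is why count is assigned before the loop in both versions.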
Nothing's ever going to get executed, so it doesn't matter. You can do the same thing with until, and until true; wrapping your head around double negatives just hurts me. And obviously there are plenty of other things baked in. The real magic of this gem is how you use it. There's only one part of the public API that really does the magic, and that's process. You can call process, but if you happen to use Bootsnap, which you all are using if you've run rails new since Rails 5 or
if you just decided to use Bootsnap. Bootsnap is a gem that will actually look at the code that you're going to be executing and go ahead and generate those machine instructions and write them out to a file to speed up your boot time. Well, you can do terrible things to Bootsnap. You can monkey-patch their input_to_storage method and run stuff through Preval, checking if Preval is available, and instead of just compiling you can process the stuff before it actually does anything. All this to say, you're lying to Bootsnap, and Bootsnap doesn't
care. What is method privacy anyway? So, a little bad. So I did add one thing. Every one of these visitors you have to opt into. Technically, if you start running Preval on your production application today, then it will do absolutely nothing. You need to go and explicitly tell it to opt in to all these different things. And the reason you have to do this is because technically you could be doing something dumb. I lied earlier; I will support that, as dumb as it is. But if you go and enable all these things
then you are opting into that, that volatility. It's okay. So this is the current list of what Preval does. You don't actually need to be able to read all that. But basically it will take care of arithmetic expressions and arithmetic identities. It will take care of the attr accessors. It has some various things from the fasterer gem, which is a static analysis tool that will tell you when certain things are faster depending on Ruby version. And it will also do a couple things with loops. Did you know there's a for
loop in Ruby? Technically there is. Don't use it. Right. So really these optimizations break down into three different categories. There are optimizations that can be done now, and in my opinion they're very safe. If you monkey-patch the attr_reader method, I cannot help you. But there's instruction elimination: I don't really see a way to make while false ever execute anything, short of a C extension. I just don't see how that's possible. So these are things that could
be done today. Preval contains those. Then there are the kinds of optimizations that can be done with deoptimization. If you look at MJIT, the Ruby just-in-time compiler that came in 2.6, there are deoptimization techniques being used, and so it really is doing some of these things. A for loop replacement could be done with deoptimization, wherein it just gets replaced with each, and if someone monkey-patches each, then it falls back to for loops. Constant folding, as in 3 + 4, could be done with deoptimization. That actually is done with deoptimization with Ruby's JIT.
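Constant folding on the tree itself can be sketched in a few lines. The helper below, fold_constants, is hypothetical and is not Preval's implementation: it walks the nested arrays that Ripper.sexp returns and replaces integer-plus-integer nodes with a single integer node, under the same assumption the talk makes, that nobody has monkey-patched Integer#+.

```ruby
require "ripper"

# Hypothetical helper, not Preval's implementation: recursively walk a
# Ripper s-expression and fold `int + int` into a single integer node.
def fold_constants(node)
  return node unless node.is_a?(Array)

  # Fold the children first so nested additions collapse bottom-up.
  node = node.map { |child| fold_constants(child) }
  type, left, operator, right = node

  if type == :binary && operator == :+ &&
     left.is_a?(Array) && left[0] == :@int &&
     right.is_a?(Array) && right[0] == :@int
    # Assumes Integer#+ has not been monkey-patched.
    sum = Integer(left[1]) + Integer(right[1])
    [:@int, sum.to_s, left[2]]
  else
    node
  end
end

tree = Ripper.sexp("5 + 3 + 4")
folded = fold_constants(tree)

p tree
p folded
```

A deoptimizing compiler would keep the original instructions around and fall back to them if Integer#+ were ever redefined; this source-level sketch has no such fallback, which is the trade-off a user opts into.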
And then there are certain things that the compiler can never do, where we can operate under certain assumptions and then work with them. A compiler is never going to replace certain APIs with newer, faster APIs. It shouldn't do that. That would be very scary if a compiler did that, because that is inherently unsafe. It can't possibly have enough context about your application to know whether or not it's safe to make that optimization. But we as developers have the capacity to build optimizations like this for our own applications that speed up run time, that
reduce memory, and are good for applications, and that the compiler doesn't even have to see; it doesn't have to worry about it. So this is the question I get: but why? Why would you do this? And the obvious, and I mean obvious, answer is performance, right? But I'll tell you right now, I put this into production. It only temporarily shut down my service. And I did put it into production because I wanted to be able to say it's in production. Great. It didn't improve performance at all. That's not the virtue of this gem.
If we look at code style. Code style, as in, like, the source of most arguments. We start out with absolutely nothing: code just for laughs. Great. Then we have people getting angry at each other, leaving work early because of RuboCop. And then people are like, oh, we have a way we do code here, our way is the best way. So then we have senior devs enforcing code style in code review. What's funny about this progression that I'm running through is that I was at a company that went through it, and let me tell you, it just gets better as you go. Senior devs enforcing, that's terrible. Okay, the reason that's terrible: senior devs have to read
every single line of code that goes out. That's awful. No, don't do that. I don't want to have to review code. I want to be able to mentor and program. That's all I want to do. Okay, so the senior devs get together and decide a style guide is going to be developed. Okay, great. There's a Ruby style guide. It exists. It's on the repo for RuboCop. Fine. We have a style guide. We don't have to agree as a community, but the thing is, if we just go by the style guide, we eliminate a lot of arguments. Okay, good. Okay, the style guide is in place. That's great, but it's still being enforced. Then linters are
developed and run locally and hopefully in CI. That's all well and good. But, you know, the amount of time I have to sit and wait for bundle exec rubocop, the amount of time in my life, it pains me to think about that. It hurts. I run it locally and it makes me sad and, you know, as Marie Kondo taught me, it doesn't spark joy. And so the final step of this, in my understanding of the style progression, is auto-formatters and compilers. Fundamentally, if I express an idea
in speech, in English, "Matz is nice so we are nice," or if I say "how y'all doing?" In "how y'all doing" there is slang in there. Technically, according to the Oxford English Dictionary, I have not spoken correct English, yet somehow, in some magical way, all of your brains comprehend what I am getting across. Fundamentally, it doesn't matter how I express myself if (a) I haven't offended and (b) you understand what I'm saying. In 2019 our programming languages should be able to understand multiple
ways of speech and still come out on top. I don't want to have to sit and wait for a linter to tell me that, you know, loop do is very, very slightly more efficient than while true. I don't want to sit and wait for that. I don't care. The compiler should be able to handle that for me. I shouldn't have to think about the way I express myself as long as my idea is getting across. We're here. We haven't gotten, as a community, down to the bottom step. There are other languages that do get further than us. Especially if you look at the implementations of
Ruby, TruffleRuby has a lot of stuff based on these kinds of things. In MRI we have not quite gotten there. And I can show you, based on the .travis.yml in my repository. What is that? That is six different static analysis tools. That's horrendous, and I wrote it. I chose this. I chose this life. Why? I hate it. I hate it. Really. Okay, that one, that should be taken care of. The Ruby language should be able to take care of that. RuboCop: no, no, no, no. An auto-formatter should take care of the style rules. The performance rules should be taken care of by a compiler. Then the Rails one, same thing. Brakeman: the security vulnerabilities should be taken care of by different APIs being chosen by the compiler, or an auto-formatter changing the method entirely, or just an error being raised. Then fasterer, same thing. I just want to test my code and I just want to deploy my code. So if we look back at the style progression, let's think of a world where this is the reality:
auto-formatters and compilers take care of it. They just take care of that and we're all good. At the end of the day, what this means is that you are going to write your code, you're going to test your code, and then you're going to deploy your code. You're not going to run six different static analysis tools. It doesn't fundamentally matter the way in which you express your code. First of all, all the micro-optimizations that we've been obsessing over as a community, you know, those don't matter in your app. I'm sorry, but they just don't.
Fundamentally, most of the time when we're running a micro-optimization that we actually care about, it's within a framework. It's within Rails. There are hot spots in Rails that need to be micro-optimized. Very, very few applications actually need that level of optimization. More often than not, your process is more broken than your for loop. At the end of the day, I just want to write Ruby code. I don't want to have to obsess about the multiple steps that I have to take in order to get my code out, in order to somehow avoid a conversation with
another developer because they like parens. So yeah, Preval is ready for you. You can use it in production. I promise it won't take down your application. The nice thing about the approach is that there is zero runtime performance hit, right? This is all done at compile time. When Rails is compiling your code, there is a step between the time that it generates the abstract syntax tree and the time when it's executing that code. That time is before startup. Your application isn't slowed down by this; it isn't doing anything live in production.
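The compile-once idea can be illustrated with MRI's own instruction sequence API. This is only a rough sketch of the caching pattern, not Bootsnap's or Preval's code: the transform lambda is a hypothetical stand-in for a source-to-source pass, and a real cache would write the bytes to a file rather than keep them in a variable.

```ruby
# Hypothetical stand-in for a source-to-source pass; it only counts how
# often it runs and returns the source unchanged.
transform_runs = 0
transform = lambda do |source|
  transform_runs += 1
  source
end

source = "5 + 3"

# Compile step: transform the source once, compile it, and keep the
# serialized instructions (a real cache would write these bytes to a file).
cached = RubyVM::InstructionSequence.compile(transform.call(source)).to_binary

# Boot step: load and run the cached instructions. The transform is not
# involved here, so it adds nothing to the time spent running the code.
results = Array.new(3) { RubyVM::InstructionSequence.load_from_binary(cached).eval }

p results
p transform_runs
```

However many times the cached instructions are loaded and run, the transform has run exactly once, at compile time.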
This is just doing things locally. And with that, that's everything I've got. Thank you very much. I had like three cups of coffee, so I went really fast, turns out. So if anyone has any questions, I have plenty of time. Right, right. Yeah. So the question was: can the gem validate that its transformations are safe? And it's funny, I was about halfway through writing that validation code when I realized that Integer is monkey-patched by Rails. And I realized that won't work.
Turns out that Rails monkey-patches Integer for dates, adding and stuff like that. They monkey-patch a lot of stuff. So I'm just going to go ahead and run without validations for this, but I'll accept a PR. Is there a way to reformat the compiled code back into source? Yes, there is, except I hate that. The reason I hate that is because the entire virtue of this gem is to allow you to express Ruby code multiple ways without being forced into one way of expressing it. So, yes, you totally can, but honestly at that point I would just point you to
rubocop --fix or an auto-formatter or whatever. As for that, there is documentation: you just define a class that has an on_ plus the node type, and then you do whatever you like. You can really shoot yourself in the foot. Any other questions? Great. I will be available for any questions afterward. Thank you very much for coming.