RailsConf 2021
April 13, 2021, Online, USA
Profiling to make your Rails app faster - Gannon McGibbon

About the talk

As you grow your Rails app, is it starting to slow down? Let’s talk about how to identify slow code, speed it up, and verify positive change within your application. In this talk, you’ll learn about rack-mini-profiler, benchmark/ips, and performance optimization best practices.

About the speaker

Gannon McGibbon
Developer at Shopify

I am a software developer with a love for open source projects, elegant code, and learning new things. I also have experience with computer hardware, maintenance, and networking. I am passionate about creating stable, legible, and reusable code to solve problems.

Transcript

Hello, and welcome to my talk for RailsConf 2021. My name is Gannon, and this is Profiling to Make Your Rails App Faster. Let's start with a little bit about myself. I'm a Rails committer and a big fan of the Ruby open source community. I'm currently number 55 on the Rails contributors board; as of this recording, I have 246 commits on Rails. I also have a cat, and he's the reason I'm staying sane enough to give this talk in these socially distant times. You may hear him over the course of this

presentation. I work for a company called Shopify, on the Code Foundations team. One of our main focuses is to improve the developer experience on the Shopify monolith. If you haven't heard of the monolith, it's a really, really big Rails app, probably one of the largest in the world. The origin of this talk dates back to 2019, when I attended RubyKaigi in Japan. It was my first conference and tons of fun to attend. A keynote by Jeremy Evans talked about all the optimizations he made

to the Sequel and Roda gems. To be frank, it blew my mind. I won't spoil it for you if you haven't seen it, but it taught me so much about code optimization. After watching, I knew I wanted to learn more. Later that year, I went to RailsConf in Minneapolis and attended a talk by Nate Berkopec on the basics of benchmarking and profiling. Another amazing talk. I had previously known about profilers but didn't really understand how they worked, and this talk really put things into perspective

for me. A few months later, I was finally ready to show the world what I had learned. I wrote an article on how to write fast code for the Shopify engineering blog. It made it onto Hacker News and got a lot of reads. The following year, I wrote a follow-up article on how to fix slow code. Fast-forwarding to February of this year, my coworker Chris Salzberg started a project where he took a look at how slow the monolith was. He found many areas that needed improvement. 2020 was a big year for Shopify, and a lot of rushed development had taken its toll. Nobody had paid

much attention to keeping things fast, and it turns out you can introduce a lot of speed regressions over the course of a year. After seeing the success Chris was having fixing performance problems, I thought to myself: what about every other Rails app in the world? I'm sure many of them could use a deep dive on speed optimization. So I submitted a talk proposal about profiling Rails apps, and here we are. This talk follows the story of a Rails app, one that was built by a contractor for a small company. Taking a look at the app,

we've got views to index and show products, we've got a cart view where we can add and remove products, and finally we've got views to perform checkouts. Pretty simple, right? Just a quick side note: this is a talk about profiling, not writing style sheets. I'm not very good at making websites look nice, so please pretend it looks professional. One day, the company hires a new developer to start working on the app, and surprise, it's you. On your first day, your boss

tells you there's a problem with the website: it's slow, and you've got to fix it. So what do you do? Well, after some time on Google, you come up with some answers that look pretty reasonable. You could add some indexes to your database; that might help speed up queries. N+1 queries are bad, aren't they? We probably have lots of those. And you've been putting off upgrading to the next version of Ruby; that wouldn't hurt to do. Websites these days use a lot of JavaScript, so maybe we can use

more of that? The problem is, authors of performance optimization articles don't know what's actually wrong with your app. Many of them have recommendations that are generally good, things you should be doing, but that might not fix your actual problem. This can lead to premature optimization and unnecessary complexity in your app. What we want is a tool to help narrow down performance problems, not recommendations on Rails best practices. Eventually, you come across a suggestion to try profiling. That might be helpful.

But how can we apply it to our app? Well, if we look at Rails' default Gemfile, we'll see it recommends a profiler, rack-mini-profiler. That seems like a good lead. Reading up on the gem, we'll find it's middleware that displays a speed badge on HTML pages, and it can do database profiling, call stack profiling, and memory profiling. For those of you unfamiliar with middleware, it's essentially code that runs on the server before a request is routed to your application code. It runs

between you and your app code. For reference, Rails ships with a lot of middleware by default. One that you're probably familiar with is the DebugExceptions middleware; it's what renders development stack traces to your browser. It's an exciting time to be learning about rack-mini-profiler: if you're on Rails 6.1.3 or later, the profiler is included by default. But if not, you can always add it to your Gemfile.
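If you do need to add it yourself, the change is small. Here is a minimal sketch of the Gemfile entries used over the course of this talk (how you group them is up to you):

```ruby
# Gemfile
# rack-mini-profiler provides the speed badge and database profiling out of the box.
gem "rack-mini-profiler"

# These two gems back the memory and call stack profiling features covered later.
gem "memory_profiler"
gem "stackprof"
```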

If we go to our app, once the page loads you'll find the previously mentioned speed badge in the top left-hand corner of the page. It shows how long the page took to load and how many SQL requests were made. Now that we know what rack-mini-profiler is, let's talk about its features, starting with database profiling. If we click on the speed badge, we'll get a breakdown of partials rendered and SQL queries executed. Clicking on SQL queries will expand the exact queries that were executed, with stack traces and timings. You might find the overall concept familiar; in essence, it's the same information that Rails server logs give you. Here is the log for the same request.

You'll notice it lists SQL queries, memory allocation counts, and render timings for views. Speaking of memory allocation, let's talk about rack-mini-profiler's next feature: memory profiling. This feature requires another gem to work, so you need to add memory_profiler to your Gemfile. With memory_profiler bundled, we can reboot our app and visit the product index with the query parameter pp=profile-memory to get a report of how Ruby allocates memory for the request.

The report is fairly plain looking, but very detailed. It revolves around two concepts: allocated memory and retained memory. We can see that Ruby allocates a few megabytes of objects while building our view and retains a much smaller amount. Memory profiler can also be used on its own to profile arbitrary blocks of code; this is essentially what rack-mini-profiler is doing for us at the middleware level.
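As a rough sketch of that standalone usage (the block contents here are just an arbitrary example):

```ruby
require "memory_profiler"

# Profile an arbitrary block of code and print allocated vs. retained totals.
report = MemoryProfiler.report do
  10.times { "product-#{rand(1000)}".upcase }
end

report.pretty_print
```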

Some of you may be wondering what the difference is between retained and allocated memory. To put it simply, allocated memory is all the memory your computer takes to perform an operation. This could be responding to a web request, running bundle install, or executing a method. Retained memory is memory that remains allocated after the operation has completed. Let's look at an example. If we were to profile a simple object creation, we would find that it allocates one object and retains no objects. This is because the object lives within the context of the profiling block and gets cleaned up afterward. However, if we assign the object

to a constant, we suddenly have retained memory, because constants are global; they're something that lives beyond the scope of the profiling block. From this example, we can infer that retained memory is always less than or equal to allocated memory for any operation.
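Here is a hedged illustration of that difference using memory_profiler directly (the constant and strings are made up for the example):

```ruby
require "memory_profiler"

# The temporary string only lives inside the block, so it can be garbage
# collected afterward: allocated, but not retained.
MemoryProfiler.report { "hello" + "world" }.pretty_print

# Assigning to a constant keeps the string alive beyond the block, so the
# same allocation now also shows up as retained memory.
MemoryProfiler.report { GREETING = "hello" + "world" }.pretty_print
```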

Now that we understand memory profiling, let's talk about rack-mini-profiler's last feature: call stack profiling. Like memory profiling, call stack profiling requires another gem; this time it's stackprof. If we add stackprof to our Gemfile and use the query parameter pp=flamegraph, we should get a report detailing the call stacks Ruby ran through to respond to our request. When we do that, we're greeted by a graph with a lot going on. Before we get into what it all means, we should talk about where the data is coming from. stackprof collects call stacks. You're probably most familiar with call stacks from exceptions: when an exception is raised, Ruby prints a stack trace, which is a summary of the call stack that led to the error. These call stacks are gathered by

observing running code and taking snapshots at scheduled intervals. stackprof uses this data to paint a picture of what your program is doing. An important note to make is that stackprof is a sampling profiler, which means it doesn't exhaustively snapshot all call stacks for a given operation; the sample rate can be tweaked to show more or less data. As for what stackprof measures while taking snapshots, this is where the different profiling modes come in. The three modes are wall, CPU, and object.

Wall time is time as you and I know it, the kind you can read off a wall clock. Wall is the default profiling mode and the one you'll want to use 90% of the time. You've probably seen CPU time before in Activity Monitor or some other task management program; it essentially means the time your computer spends thinking about something. Object mode aims to solve the same problem as memory profiler, counting allocations. Typically you'll want to reach for memory profiler when measuring allocations because it's more detailed. Like memory profiler, stackprof can also be used on its own to profile arbitrary blocks of code; we'll use this a little more later.
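A small sketch of what that standalone usage might look like (the block body and output path are placeholders):

```ruby
require "stackprof"
require "json"

# Sample wall-clock time spent in the block and keep the raw stack samples.
profile = StackProf.run(mode: :wall, raw: true) do
  1_000.times { (1..500).map(&:to_s).join(",") }
end

# raw: true preserves enough detail for viewers like speedscope to rebuild a
# flame graph from the dump.
File.write("tmp/stackprof-wall.json", JSON.generate(profile))
```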

Now, let's talk about the graph we saw earlier. Cleaned up, a flame graph looks like this. Flame graphs are a standard way of viewing profiling data: on the x-axis we've got time, and on the y-axis we've got call stacks. Below the preview window, there's a larger, more interactive view of the graph. rack-mini-profiler generates flame graphs with

speedscope. Speedscope is a profile viewer written in TypeScript. It supports a variety of formats from different profilers across different languages. However, there are a few ways you can generate a flame graph. If you're familiar with rbspy, profiles collected with that tool look like this. rbspy uses the original Perl script made by the creator of flame graphs, Brendan Gregg, but it's the same concept more or less. Note that call stacks on the y-axis can also be inverted on some graphs; this is how a classic flame graph looks.

The look of a flame graph is dependent on the viewer you're using; flame graphs come in various shapes and sizes. A great feature of speedscope is the different ways you can choose to view the data: you can toggle between time order, left heavy, and sandwich modes. Time order is the standard view we've already seen, with time on the x-axis and call stacks on the y-axis. Left heavy is where things get interesting: time is still on the x-axis, but no longer in sequence. Similar call stacks are grouped, so you can easily

see combined timings for the slowest methods. For example, it's a little hard to see that garbage collection occurs multiple times in time order, but it's grouped into a single entry in left heavy. Sandwich mode starts with a sortable list of call stack methods that you can sort by self time and total time. Self time is how much time is spent inside a specific method, whereas total time is the combined time spent in that method and in any nested method calls.

If we click on a method in this view, we can see its position and total time as a flame graph. In our example, we can see Action Controller routing to our controller and rendering our view. While it doesn't take very long to actually call our controller, the total time for processing the request takes up the majority of the profile. Putting it all together, we can see rack-mini-profiler has a lot to offer: the speed badge for rendering summaries, memory profiling for object allocation counts, and call stack profiling for call stack analysis.

There are even more features you can access with the pp=help query parameter. Now, that's all great, but we haven't solved anything in our app yet. Now that we know the basics of rack-mini-profiler and friends, let's use them to solve some of our issues. After getting reports from your boss that customers have been complaining about slow checkouts, you have something to work off of. With your newfound profiling superpowers, you get to work. First off, on the checkout

page, we're going to want to add the flamegraph query parameter. We can do this by either injecting it into the view or with the web inspector of our favorite browser. When we submit the form, we can see a flame graph. In the preview at the top, we can see a few interesting things. There are several long plateaus in the graph, showing we're spending a lot of time doing just a few things. If we switch the view to left heavy, we can see that there are just two things taking the majority of our time. The first is capturing a checkout payment; the

second is sending a confirmation email. Since we're using a payment gateway and a remote mail server, we find ourselves with an interesting problem: both of these issues stem from talking to remote servers, and often we can't control bottlenecks outside our own system. The controller code that initiates these communications looks like this: when the order is created, we need to confirm it. To handle expensive or time-consuming operations like these, we can use Active Job, which

allows us to move this logic over to a job class. An order confirmation job encapsulates the payment capture and mailing work so we can treat it as a single entity. Jobs can be worked asynchronously in development or pushed to another worker process in production. In our controller, we can replace the previous code with a reference to our new job, telling Rails to perform it later in the background. The built-in job backends work in development by default, but in production it's best practice to use a queuing system; a good choice would be a gem like Sidekiq.
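A minimal sketch of what such a job might look like (the class, gateway, and mailer names are assumptions, not the app's actual code):

```ruby
# app/jobs/order_confirmation_job.rb
class OrderConfirmationJob < ApplicationJob
  queue_as :default

  def perform(order)
    PaymentGateway.capture(order)                # talk to the remote payment gateway
    OrderMailer.confirmation(order).deliver_now  # send the confirmation email
  end
end

# In the controller, enqueue the work instead of doing it inline:
#   OrderConfirmationJob.perform_later(order)
```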

If we spin up another profile, we can see our order confirmations are now being pushed to the background. This leaves our controller faster and our users happier. For more information on jobs, the Rails guide on Active Job is helpful. After successfully speeding up checkout, your boss is impressed and mentions it would be nice if we could load the products page faster, so you decide to investigate. If we bring up rack-mini-profiler again on the products index, we'll see a lot of spikes. Each spike appears to be a product partial rendering.

An easy answer to this would be to paginate our records, but sooner or later you'll hit an issue you can't paginate your way out of. Our view looks like this: we loop through all of our products and render a partial for each one. Here we can use caching. Caching allows us to do the work once and store the results for subsequent use. In this context, we can render a collection of product partials and cache them in one line.
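A sketch of that one-liner, assuming a conventional products/_product partial:

```erb
<%# app/views/products/index.html.erb %>
<%# Render the whole collection and cache each rendered partial. %>
<%= render partial: "products/product", collection: @products, cached: true %>
```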

Note that Rails doesn't normally use caching in development mode, so you'll need to run the rails dev:cache command to enable it; when you're done, running dev:cache again toggles the feature back off. After enabling caching in development mode and profiling again, we can see that our cache hits drive our response time down from about 600 ms to about 40 ms. That's about a 15x improvement. Even on a small Rails app with simple views, caching can make a huge difference. Caching is a rather complex topic, so I recommend consulting the Rails guide to see all of your options. A few

months go by, and you've built up your app quite a lot. The production site is working great, users aren't complaining, and life is good. Then one day you start to notice your app is taking a long time to start up, and tests that were once fast have started to crawl. What could be going on? Well, to better understand the problem, we need to know how development mode is different from production mode. Let's take a look at the different environments of Rails. This is one level deep in our app directory.

This is where our autoload paths live. Each of these folders is an autoload path, with the exception of folders that don't contain Ruby files, so assets, javascript, and views are ignored. In development, these paths are searched whenever your application code references an undefined constant; Rails will try to find and load the constant based on the files it can see in these paths. In production, on the other hand, these paths are eager loaded. This means that your autoload paths are iterated and required on boot. This slows down application startup time in favor of speeding up request time

for our users. In test mode, we can assume roughly the same behavior as development mode; Rails autoloads in the exact same way. This means we want to do as little work as possible at boot in development and test, mostly because we don't know why the app is being booted. It could be to do a checkout, to run a model test, or to open the Rails console, for example. Production is the complete opposite: we want to do as much work as possible up front to optimize for our users and for our infrastructure. Typically, our app will only be booted to handle web requests and perform jobs in production.
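The switch behind this behavior is the standard eager_load setting in each environment file, roughly:

```ruby
# config/environments/development.rb
Rails.application.configure do
  # Load constants lazily; nothing in the autoload paths is required until referenced.
  config.eager_load = false
end

# config/environments/production.rb
Rails.application.configure do
  # Walk the autoload paths and require everything up front at boot.
  config.eager_load = true
end
```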

These two ideals are constantly at odds with each other, which makes it really difficult for developers to account for both while developing features. Like I mentioned earlier, you can use stackprof on its own to profile any code, and with a little extra work we can even instrument our profiles to be opened with speedscope. Now, this is a little bit of a hack, but this code will profile any code you place between it and open the result in speedscope. Instead of stackprof's run method, start and stop can be used to profile code

that doesn't fit neatly into a block. The raw option and JSON.generate help output a format that speedscope can understand, and a system call here shows the profile file in speedscope. So if we start the profile in our application file, we can get a pulse on what our app is doing at startup. After we require gems and before our app class is defined is a good place to start; then we can stop once the application is fully initialized.
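Put together, the two halves might look something like this (the BOOT_PROFILE variable is from the talk; the app module name, output path, and npx invocation are assumptions):

```ruby
# config/application.rb
require_relative "boot"
require "rails/all"
Bundler.require(*Rails.groups)

if ENV["BOOT_PROFILE"]
  require "stackprof"
  require "json"
  # Start sampling before the application class is even defined.
  StackProf.start(mode: :wall, raw: true)
end

module Shop
  class Application < Rails::Application
    config.load_defaults 6.1

    # Runs once the application is fully initialized.
    config.after_initialize do
      if ENV["BOOT_PROFILE"]
        StackProf.stop
        path = Rails.root.join("tmp", "boot_profile.json")
        File.write(path, JSON.generate(StackProf.results))
        system("npx", "speedscope", path.to_s) # open the dump in speedscope
      end
    end
  end
end
```

Booting with something like BOOT_PROFILE=1 bin/rails server then pops the profile open in a browser once initialization finishes.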

Using these two halves, we can capture the entire boot process in our profile. Since we aren't leveraging rack-mini-profiler anymore, we need to use our own instance of speedscope; luckily, it can be downloaded via an npm package. After adding speedscope to our app, we're ready to start profiling. We can start our server with the boot profile environment variable to get a profile to open up in a browser. It looks something like this.

If you look closely, there's something called Spring in our call stacks. Most Rails apps use the Spring gem to start up faster. Spring boots your server once and then keeps it running in the background for future runs. This, unfortunately, can skew our profiling results. Due to the hidden nature of Spring, it can lead to a lot of confusing scenarios, and a lot of people on the internet dislike this gem and will tell you to remove it from your project. This, however, is bad advice: as your Rails app grows, Spring will save you

a lot of time in standard development flows, so please don't remove it. To bypass Spring for profiling, we can use the DISABLE_SPRING environment variable to get more accurate data. With Spring disabled, if we do a boot time profile again, we can see Spring is no longer in our call stacks. Now that we're profiling boot, we notice one day that our app's startup time has regressed by multiple seconds. Something needs to be done. Profiling boot, we can see a severe regression related to reading network buffers. If we look a little ways up the stack, we can see our shipping rates

initializer is the problem. It looks like something is hanging when downloading shipping rates. Let's take a closer look at the module to find out more. The download method creates a network client and makes a request for shipping rates, and this request is probably timing out. Our app needs to know how much to charge for shipping, but does it need to make this request on every boot? Probably not. There are a few ways to handle this scenario, but an easy way I can think

of would be to make a rake task. This allows us to treat shipping rate downloads as an isolated workflow: we can optionally run the task on production deployments, or manually in development. We may also want to increase the timeout for shipping rate downloads if we find the connection is constantly timing out. With shipping rate downloads taskified, we can see we've shaved nearly four seconds off our boot time. This is a huge performance win.
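A rough sketch of what such a task might look like (the task and module names are assumptions based on the example app):

```ruby
# lib/tasks/shipping_rates.rake
namespace :shipping_rates do
  desc "Download the latest shipping rates file"
  task download: :environment do
    # The module the initializer used to invoke at boot.
    ShippingRates.download
  end
end
```

Something like bin/rails shipping_rates:download can then run on deploy, or by hand in development.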

Wait, there's more. Let's take a closer look at how shipping rates are downloaded. Depending on the size of the file, we could be writing a rather large string. This is a good excuse to try out memory profiler: we can wrap the contents of our task like this and see what the report says. If we rerun the shipping rates task, we can see about 3.1 MB of allocated objects and 1.5 MB of retained objects. That's a little bit big. Instead, we can stream the content and build the shipping rates file line by line. Here we open the file in append mode and stream the HTTP request

gradually, which should hopefully cause fewer string allocations. Sure enough, we're down to about half the original allocations, at roughly 1.5 MB allocated and only a few bytes retained. The size of the file hasn't changed, but the number of times we build a representation of it in memory has. As a side note, the new code is also noticeably faster.
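A sketch of what the streaming version might look like (the URL, file path, and module layout are assumptions):

```ruby
require "net/http"

module ShippingRates
  RATES_URI  = URI("https://rates.example.com/shipping_rates.csv")
  RATES_PATH = "config/shipping_rates.csv"

  def self.download
    Net::HTTP.start(RATES_URI.host, RATES_URI.port, use_ssl: true) do |http|
      http.request(Net::HTTP::Get.new(RATES_URI)) do |response|
        # Open the file in append mode and write the body chunk by chunk
        # instead of buffering it all into one large string.
        File.open(RATES_PATH, "a") do |file|
          response.read_body { |chunk| file.write(chunk) }
        end
      end
    end
  end
end
```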

While debugging the last issue, you notice another issue: an initializer that was showing up in our profile. Why would that be? Taking another boot time profile, we can see a sizable chunk of time taken up by the tax service. We can also see that the code that triggered this load is the tax service initializer, so let's take a look at it. This is what the initializer looks like. The to_prepare callback safely loads code after boot and whenever the app needs to reload after a change. We need to find a way to keep the setup, but defer loading the class until we actually need to use it. I should mention that

Rails 6 reinvented code loading with a new gem called Zeitwerk, which replaced the classic autoloader and its few drawbacks. One of Zeitwerk's features is the ability to define load hooks for any constant it loads. We can use it to hook into any Zeitwerk-loaded code, including classes from external engines. Using a Zeitwerk on_load callback, we can defer initialization until the class is referenced. This saves a lot of time in code loading and also any upfront costs of initialization.
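A minimal sketch of the deferred initializer, assuming a Zeitwerk version with on_load support (the configure call is a stand-in for whatever setup the initializer actually does):

```ruby
# config/initializers/tax_service.rb
# Run the setup only when TaxService is actually loaded, instead of at boot.
Rails.autoloaders.main.on_load("TaxService") do |klass, _abspath|
  klass.configure(region: ENV["TAX_REGION"])
end
```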

After updating the initializer to use an on_load callback, we can see we've shaved off about half a second. It becomes really important to be conscious of which constants you're referencing in large codebases. Now most of our boot time is taken up by Rails itself, which is good, but you notice another place where code loading is slowing boot down; this time it's in the app's monkey patches. This is what our patches look like. Each of these constants is autoloaded, which means Ruby will wait to load them until they're referenced. We need

to wrap the first patch in a callback that waits for Active Storage to eventually load. As we did before, we can replace the to_prepare callback with a Zeitwerk on_load hook. However, this won't work for Action Controller and Active Record; the problem is that these classes don't use Zeitwerk. It turns out autoloading and reloading constants can be somewhat confusing, and it's a big reason why Rails is regarded as a magic framework. If we take a look at the guide for autoloading constants, we can see

mention of an ActiveSupport on_load hook. ActiveSupport on_load hooks look like this: you can hook onto a load event by name, and the block will be evaluated when the hook is called. Taking a closer look at Rails, we can see these load hooks strewn throughout the framework, typically defined at the bottom of core class files. As you can see by this grep, there are a lot of hooks to choose from. An important note to make here is that autoloading isn't a feature exclusive to Zeitwerk or Rails.

Ruby allows you to autoload any code with the standard library's autoload method; Zeitwerk actually uses Ruby's autoload method under the hood. The difference is that Zeitwerk automates the process of autoloading by defining load paths and using file naming conventions. With ActiveSupport on_load hooks in place, we can see we're able to defer Active Record and Action Controller loading. You may notice a key difference in style between these callbacks: Zeitwerk callbacks aren't evaluated in the context of the class, whereas ActiveSupport on_load callbacks are.
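A sketch of both styles side by side (the patch module names are assumptions):

```ruby
# config/initializers/patches.rb

# Active Storage models are autoloaded, so a Zeitwerk hook works here.
# Note the block receives the class; it isn't evaluated inside it.
Rails.autoloaders.main.on_load("ActiveStorage::Blob") do |klass, _abspath|
  klass.include(BlobPatch)
end

# Action Controller and Active Record are required by the framework rather
# than autoloaded, so we use ActiveSupport load hooks. These blocks are
# evaluated in the context of the loading class, so self is the class itself.
ActiveSupport.on_load(:active_record) do
  include ActiveRecordPatch
end

ActiveSupport.on_load(:action_controller) do
  include ActionControllerPatch
end
```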

Taking a post-change profile, we can verify we're down about 150 ms. Not a huge win, but these sorts of code loading issues can really add up. With the app speedy in both development and production, both you and your users are happy. Life is good. Until one day, you receive a complaint about big shopping carts being slow to load. No problem, you think to yourself; you're quite good at fixing speed regressions at this point. Bring it on. You try and try, but you can't reproduce the problem locally, and at this point you start to sweat. Sometimes performance problems can be hard to track

down locally. In situations like these, profiling can be used on deployed production systems. Let's take a look at how to instrument production profiling with rack-mini-profiler. We can authorize profiling in production with the authorize_request method; it has no effect in development because profiling is always authorized there. To wire it into our application controller, we need to pair it with some kind of authentication method. If your app has the concept of administrators, this is pretty easy. Our app doesn't, so we're going to permit a specific static IP instead. Alternatively, you could use simple HTTP auth or some kind of identity management, whatever you have available.
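A sketch of the IP-based approach (the allow-list constant and address are assumptions):

```ruby
# app/controllers/application_controller.rb
class ApplicationController < ActionController::Base
  ALLOWED_PROFILING_IPS = ["203.0.113.10"].freeze

  before_action :authorize_profiling

  private

  # Only allow rack-mini-profiler to run for requests from trusted addresses.
  def authorize_profiling
    if ALLOWED_PROFILING_IPS.include?(request.remote_ip)
      Rack::MiniProfiler.authorize_request
    end
  end
end
```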

After building up a big cart and profiling the page, you can see an issue stemming from the cart item model; specifically, it looks like the product association. Taking a look at our cart view, we render a cart item partial for each item, and in it we load the product.

This essentially means the page will execute a query for every item in our cart. More experienced developers will know this as an N+1 query. Now that we have a rough idea of the problem, we can start looking for a solution. If we search the Active Record querying guide, we'll find that the includes method will solve our problem. The cart endpoint in our controller looks like this. We can use the includes method on the model class to ensure the product association is loaded with the least number of queries.
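A sketch of the controller change (the controller name and cart lookup are assumptions):

```ruby
# app/controllers/carts_controller.rb
class CartsController < ApplicationController
  def show
    # Preload products alongside the cart items so the view doesn't issue
    # one query per item.
    @cart_items = CartItem.includes(:product).where(cart_id: session[:cart_id])
  end
end
```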

Switching back to development mode, we can see a change even on small carts. This is a good indicator that our fix will help with our production woes. But what if we wanted to prove our change doesn't buckle under large amounts of data? A good tool to leverage here is benchmarking. With benchmarking, we can easily measure performance changes between code paths or methods. As it turns out, Rails also has a tool to help us out here: the rails generate benchmark command does everything you need to start benchmarking your own code.

your own code? The Benchmark generator was added in real 6.1, but you can get the same effect by creating Benchmark scripts by hand. You'll notice when running the command add something to your junk file. This is The Benchmark IPS Jam. Benchmarking is an expensive topic worthy of its own talk. So I'll keep this brief. The IPS in Benchmark, IDs stands for iterations per second. The gym is actually runs the code blocks, you give it as many times as possible and counts. How many times the blocks were able to run?

If we open the generated script, it looks like this. We can see we're using the benchmark-ips gem to test two code blocks, named before and after. If we define a test cart with products and a method for querying, we should be able to accurately compare loading methods. Because this is just a test, we don't actually want to persist these records, and in these situations transactions are our friend. We can wrap our operations in a block and tell Active Record to roll back afterwards; this will revert any changes we make to our development database.
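A sketch in the spirit of the generated script (model names, record counts, and the script's location are assumptions):

```ruby
# frozen_string_literal: true

# Load the Rails environment so models are available; adjust the relative
# path to wherever the script lives.
require_relative "../../config/environment"
require "benchmark/ips"

ActiveRecord::Base.transaction do
  # Build a test cart with a pile of products.
  cart = Cart.create!
  100.times { |i| cart.cart_items.create!(product: Product.create!(name: "Product #{i}")) }

  Benchmark.ips do |x|
    x.report("before") { CartItem.where(cart: cart).map { |item| item.product.name } }
    x.report("after")  { CartItem.includes(:product).where(cart: cart).map { |item| item.product.name } }
    x.compare!
  end

  # Revert the test records so the development database is untouched.
  raise ActiveRecord::Rollback
end
```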

Sure enough, if we run the script, we'll see something like this: loading with includes is about 10 times faster, and if we increase the cart item count, the savings only get better. Eager loading queries is definitely worth the extra code. And with that, we reach the end of our story. We've got a lightning-fast Rails app, and we've learned a few important lessons. You can use rack-mini-profiler, memory_profiler, stackprof, and speedscope to find performance problems anywhere in your Rails app.

You should use Active Job to defer work from the request-response cycle. You should use caching to do expensive work once so you can reuse it later. Optimizations that make sense in production may not make sense in development or test. You should bypass Spring for more accurate boot time profiles. You should memory profile complex operations to try to minimize allocations. Be aware of the code you're loading, and use load callbacks to defer it where necessary. You should use production profiling to arrive at solutions faster.

And you can use benchmarking to assert speed differences between blocks of code. Many of the issues in this presentation were based off of real code. The app we worked on is available on GitHub, free to reference, including some bonus content. Check the description for the link, and for links to other web pages I referenced in this talk. I'll end with some thanks. Thank you to Ruby Central for allowing me to present this talk. Thank you to Shopify and my colleagues for supporting me through the making of this talk. And thank you to all

the maintainers of the great gems we talked about today. And thank you for watching. I hope you learned something, and I hope I've inspired you to try profiling your Rails app.
