About the talk
Monitoring your Postgres database is important. But it's actually not as easy as it sounds. Postgres itself does not expose enough information, or sometimes data is exposed that can be misinterpreted.
In this session we'll spend time looking at specific cases that are blank spots on the map right now, where you either don't know what Postgres is up to, or where you have to use system-level tools and other creative methods to get information.
Some examples of what we'll take a look at: the deceptive value of buffer hit counters, the multiple efforts to bring planning data to pg_stat_statements, and why you often have to resort to "perf" to identify performance bottlenecks.
You'll leave this session with more ideas on how to work around these shortcomings, as well as an understanding of where development effort is going to fix some of the issues.
Good morning, I'm Lucas, and today I'm going to be talking about Postgres monitoring. I wish I could give this talk in Ottawa, but unfortunately circumstances changed, so today I'll give you the full version of this talk. We'll talk about what is missing in Postgres monitoring: what is the functionality that should be there, that would make most people's lives easier, but isn't there today. Some of that actually got added in Postgres 13, so this is a mix of things that are there in 13 and things that are not there yet. The latter would be things that folks could work on if they are hacking on Postgres itself.
Now, a bit about myself. I'm Lucas. I've worked with Postgres for many years now, including in the context of running a cloud provider, specifically Azure Database for PostgreSQL, scaling out using Citus as an extension. Obviously with pganalyze we've seen many changes over the years. I hope to keep learning from the community about how to make Postgres better, and also to give back and explain to folks how, in my view, Postgres monitoring should evolve over time.
Now, to start, let me take a step back and say what the problems with Postgres monitoring are. If we look at Postgres monitoring as a whole, what are the things that are challenging today? I would say there are three main things that I've seen. First, it is often incomplete: we see one aspect of the system, but we don't see the complete system. Second, not everybody has access to the monitoring data. In an application development company, for example, it is not that easy to get full access and understand everything [unclear]. And last but not least, monitoring data contains sensitive information. For example, you can't just give the Postgres logs to everybody on the team, because there might be patient data in there.
Now, for today, we are going to focus on what is incomplete in Postgres monitoring. So really, this is about the things that are not there yet and that should be added, for example, to the Postgres statistics views. There are five main categories that I would like to talk about today. First of all, I'm going to talk about connections: connection handling and security. Then planning [unclear]. Then I'll look at query execution: we'll look at active queries, historic queries, how parallel query gets surfaced in monitoring data, and failures [unclear] such as timeouts.
Next we'll look at shared resources. These are things like locks, table and index access, as well as system metrics: CPU, I/O and memory. We'll also look at WAL activity, as Postgres 13 is surfacing that better. And last, we'll look at maintenance: we'll look at utility commands, we'll look at autovacuum, and we'll look at backups. These are either things that are there today that might need some more adjustments, or features that are new in Postgres 13. Let's get started. So, first of all, connection handling.
So on the connection handling side, most of you are familiar with pg_stat_activity, which is the core view. If you look at Postgres monitoring in general, pg_stat_activity has so much valuable information in such a small table. There are also a lot of important log events around connections being established and certain kinds of issues, so there is already a lot of data to look at. The thing that I've seen missing around connection handling [unclear] is really the actual latency not being obvious to users, and not being something you can really track on the server side. A good example: let's say I have my application in a certain data center and my database in another, maybe adjacent, data center. If I look at this from an application performance perspective, whether the connection latency between the application and the database is multiple milliseconds, or the execution time on the server is multiple seconds, they look the same to the application, which is just waiting for the results. So the challenge here is that if we don't explain this well, sometimes things get attributed to Postgres being slow when what is actually happening is that the network itself is the problem. This gets worse, for example, if you run [unclear] on your laptop and you say Postgres is slow, but really what's slow is your laptop connecting over a local broadband connection. Now, what's interesting here is that there are some commands in psql that actually give you this on the client side, like \timing, which gives me the information about how long the query takes end-to-end, including the client latency. However, it's not obvious [unclear]. It could be split up so that you can actually see the query planning and execution.
Right, let's talk about connection security for a moment. On the connection security side, you obviously have things like pg_hba.conf, SSL settings, GSSAPI for Kerberos, and the logs. What's really missing from a security perspective is an aggregate view of who has tried to access my database. I can get that information if I parse the log files and really make sure that every log event gets categorized correctly, but there isn't a good summary view. And if I have a lot of activity, it's easy to miss individual events. So something as simple as "this user has logged in this many times" [unclear], or "this pg_hba.conf line was matched", like somebody logging in with trust authentication, is pretty much invisible right now. And that is a problem if I run a security-sensitive system.
All right. Next, what about query planning? Planning obviously is very important, as every query gets planned at some point. For planning, EXPLAIN is one of the main Swiss Army knives. There are two good improvements here in Postgres 13. First, you can now see the buffers being used for planning as part of the EXPLAIN output. And then the thing I'm really excited about is that
pg_stat_statements now shows planning time. [unclear] Let's look at the buffers first. Here we can see an EXPLAIN ANALYZE output. The execution time was actually only 0.4 milliseconds, but the planning time was 45 milliseconds, so almost 100% of the actual query runtime was spent planning. That seems to have happened because the planner was accessing the disk, and the disk was slow. So just working out which indexes could be used for the query, or which tables [unclear], took that much time. If I ran it again it would be cached and the planning time would be lower, but this is very helpful for understanding where the slowness came from. And now, in pg_stat_statements, you can get the aggregate planning time across all statements. That makes it an easier decision which queries to focus on.
[unclear] The planner has to look at the indexes to understand [unclear] for the query. With 13, you can actually see this: in particular, the kinds of queries where planning is slow get surfaced. In this case [unclear] 45 milliseconds versus 0.4 milliseconds of execution time. So there is a really strong difference here that you wouldn't have seen before 13.
Now, I think one thing is still missing. With all the information we've now got on planning, we know how long it takes, which is really critical information that wasn't there before, but we still don't know what kind of plans get generated. Outside of looking at auto_explain and the samples that you can collect through auto_explain, it's really difficult to understand what kind of query plans
get created in your database on a summarized, aggregate basis. There have been a couple of efforts here, mostly extensions, to bring this to Postgres [unclear]. There was pg_stat_plans many years ago from 2ndQuadrant, which eventually [unclear]. It's not safe for production, so I couldn't use it [unclear]. There's also pg_store_plans, which is an effort that I believe was done at NTT in Japan [unclear], and which could be used again for looking at plan information on an aggregate basis. However, [unclear] for the last couple of years, so it's not something I would use in production. More recently there has been an interesting effort [unclear]: somebody wrote something called pg_stat_sql_plans, which again tries to give you aggregate plan information. But it's not something I would use in production today either. I think the summary here is that those are not production-ready, and there isn't something that's committed to core. And, to add to this, none of the cloud providers today offer any of these extensions, so if you run your database on a database-as-a-service provider, you couldn't use them anyway.
All right, the next bigger section here is on query execution. I spend a lot of time on query execution. This is really where we focus at pganalyze, on query performance tuning [unclear]. Now, if you look at active queries, pg_stat_activity is such a simple view, but it gives us a lot of data. It tells us when the transaction started running, when the query started running, and the wait events for queries that are active. The nice thing here with wait events is that we actually have a lot more now in Postgres 13. There are some new wait events that were added, and there has also been a renaming: wait events have different names now, which makes them much easier to understand.
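As a rough sketch of how the pg_stat_activity columns mentioned here might be consumed by a monitoring script, the following groups active backends by wait event and reports the longest-running query. The column names are those of pg_stat_activity; the `summarize` helper and the sample rows are hypothetical, and a real tool would fetch the rows through a database driver rather than hard-code them.

```python
from collections import Counter
from datetime import datetime, timedelta

# A query one might sample periodically; all columns exist in pg_stat_activity.
SAMPLE_SQL = """
SELECT pid, state, xact_start, query_start, wait_event_type, wait_event
FROM pg_stat_activity
"""


def summarize(rows, now):
    """Count active backends per wait event and find the longest-running query.

    rows: dicts keyed by the columns selected in SAMPLE_SQL (hypothetical
    sample data here, not real server output).
    """
    waits = Counter()
    longest = timedelta(0)
    for row in rows:
        if row["state"] != "active":
            continue
        # A NULL wait_event on an active backend means it is not waiting.
        event = row["wait_event"] or "(not waiting)"
        waits[(row["wait_event_type"], event)] += 1
        longest = max(longest, now - row["query_start"])
    return waits, longest


now = datetime(2020, 1, 1, 12, 0, 0)  # arbitrary timestamp for the sample
rows = [
    {"pid": 101, "state": "active", "xact_start": now - timedelta(minutes=5),
     "query_start": now - timedelta(seconds=90),
     "wait_event_type": None, "wait_event": None},
    {"pid": 102, "state": "active", "xact_start": now - timedelta(seconds=30),
     "query_start": now - timedelta(seconds=30),
     "wait_event_type": "IO", "wait_event": "DataFileRead"},
    {"pid": 103, "state": "idle", "xact_start": None,
     "query_start": now - timedelta(minutes=10),
     "wait_event_type": "Client", "wait_event": "ClientRead"},
]
waits, longest = summarize(rows, now)
```

Keying the counter on the (type, event) pair rather than the event name alone keeps the summary usable if event names change between Postgres versions, since the type column gives a coarser grouping to fall back on.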
The problem is, if you wrote a monitoring tool, or you are familiar with the names, with Postgres 13 you have to go back and check that each name is actually still the same. One thing that I really miss here is being able to understand: if something is not waiting on a wait event but it's active, what is going on? If you look at our list of wait events here, for example, it is empty except for that one connection that is waiting for a client. As an example, there is a pg_restore going on here. You can see that three or four parallel workers are running. They are all in the active state and they all have no wait event. So if I try to understand how I can improve on this, like should I maybe reorder the tables, or is it maybe the indexes that I've already created and should have created after the restore, it's impossible to tell. The way you can do this today, if you run on your own virtual machine like I'm doing here, is to use Linux perf. We can actually run a perf command, for example perf top or perf record, which is usually recommended, and which shows you, across your whole system, the functions that are executed, both in user space and in kernel space, so you get a top-down view of everything. Here we can see that it's obviously the Postgres COPY that is the busiest. Since we actually have a few details here, we can see that one of the things that is very busy is the [unclear] function as it's copying into Postgres. [unclear] conversions are one of the problems here. We probably can't fix this, but it does help us understand: maybe with a different table it would go faster, or [unclear].
Somewhat related to this is query progress tracking. [unclear] You're not just trying to understand what's happening right now, you're trying to understand: is this query going to run for five hours, or is it going to finish in the next minute? Especially in a data warehousing case, or [unclear], it's often very difficult to understand how long a query will keep running. There are efforts underway to do some of this as an extension. Cybertec, for example, recently released pg_show_plans, which [unclear]. I would say these are all fairly recent developments [unclear]. As with any extension, I would personally be careful running it in production. [unclear] It's a risky thing at this point, as it's pretty invasive in how it hooks into Postgres. Alright, let's talk about historic queries,
and this comes to one of my favorite extensions, which is pg_stat_statements. pg_stat_statements has been with us for a while, and I think it really did change the scenery in terms of what is available and what kind of data you can get. Usually I would use it in conjunction with [unclear] auto_explain. The number one thing that it should do better [unclear]: I think there is a disconnect between the people writing the database, creating the source, like the core code, and the folks actually using it. I think most of the hackers [unclear] think ORMs are not the best idea, like you should really be thinking about which SQL you're sending [unclear]. The problem is that that's not the reality. The reality is that there are so many different applications out there, so many different developers, and they use ORMs to write to the database; they are not [unclear] experts writing SQL [unclear]. But they still want to know what is slow about their database, like which indexes [unclear]. These things are still important today. One thing I see is that ORMs like to write complex statements. A common one is SELECT * FROM users WHERE id IN, and then you have a long, long list of IDs, like 1, 2, 3. That is something pg_stat_statements doesn't handle well, because each variant generates a different query ID.
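The IN-list behavior described above can be illustrated with a small sketch. It assumes normalized query texts of the kind pg_stat_statements stores, with `$1`, `$2` placeholders, and groups variants that differ only in the length of the IN list. `collapse_in_lists` and `group_by_collapsed_text` are hypothetical helpers for a monitoring script, not Postgres features, and the regular expression only handles simple lists of placeholders, integers and quoted strings.

```python
import re

# Matches IN ( ... ) where the list holds only placeholders ($1 or ?),
# integers, or single-quoted strings. Subqueries are deliberately not matched.
_VALUE = r"(?:\$\d+|\?|\d+|'[^']*')"
_IN_LIST = re.compile(
    r"\bIN\s*\(\s*" + _VALUE + r"(?:\s*,\s*" + _VALUE + r")*\s*\)",
    re.IGNORECASE,
)


def collapse_in_lists(query_text):
    """Replace every simple IN value list with the fixed text 'IN (...)'."""
    return _IN_LIST.sub("IN (...)", query_text)


def group_by_collapsed_text(rows):
    """Sum call counts per collapsed query text.

    rows: iterable of (query_text, calls) pairs (hypothetical sample data).
    """
    totals = {}
    for query_text, calls in rows:
        key = collapse_in_lists(query_text)
        totals[key] = totals.get(key, 0) + calls
    return totals


sample_rows = [
    ("SELECT * FROM users WHERE id IN ($1, $2)", 10),
    ("SELECT * FROM users WHERE id IN ($1, $2, $3)", 5),
    ("SELECT * FROM users WHERE id = $1", 7),
]
grouped = group_by_collapsed_text(sample_rows)
```

Grouping on the collapsed text happens after collection, so it does not free up any entries inside pg_stat_statements itself; it only makes the reporting side treat the variants as one query.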
And don't get me wrong, these might behave differently. The performance might be different, the same way that a query plan could be different even if [unclear]. But the problem is that the default setting gives you 5,000 entries. That means that if you have, let's say, [unclear] hundred different variations [unclear], suddenly 10% of your pg_stat_statements space is taken up by what is one query, just formatted differently or in different variants. I think that's a bad experience, because you often get into the situation where a very small but important statement is no longer visible.
What is something Postgres could improve on? Postgres could, for example, not handle an IN list the way it does right now: the variants could be grouped together, and more detailed analysis could be deferred to [unclear]. Another thing [unclear] is that linking pg_stat_statements output with other views or logs is difficult, because pg_stat_statements [unclear], and it also has a query ID which doesn't show up anywhere else in Postgres. Let's say you find a slow query using pg_stat_statements, and then you want to go and look at the Postgres logs and find all the auto_explain output for that query ID. Today, I can't do that. There is no way to do this, and I think this is something that absolutely needs to be added to Postgres. Previous efforts to add this haven't been fruitful so far, but I have hope that we'll get something
there for 14. And then, related to this: looking at what problem we're trying to solve, oftentimes application developers don't look at these problems in terms of statements. They're not thinking about a slow SQL statement. They're thinking about which customer had a bad experience, or which web request was slow. What it comes down to is that pg_stat_statements is really only differentiating based on query IDs, and not based on any other criteria. If you look at this from a [unclear] flow perspective: on one side I have a customer. That customer's web request gets sent to the [unclear] server, then we execute a SQL statement, and then [unclear]. Now, answering a question like "which customers were affected by slowness?", or "my site has been slow, was that a problem for my highest-paying customer?", that's an important question. Also, things like: I have my APM tool, for example, and I want to find out whether there is an EXPLAIN plan for this [unclear]. That's not really straightforward to do.
There are two things that I've been involved in to try to improve this. First, for Citus as an extension specifically, we built an extension to pg_stat_statements [unclear]. Citus is often used for multi-tenant use cases, where you are sharding your database by tenant ID or customer ID, and we actually now provide per-tenant statistics [unclear]. This helps you understand not just how much activity each customer has, it also helps you make the decision: should I move this really active customer to their own machine with dedicated resources, or just keep them on a multi-tenant architecture? Second, pointing at something that wasn't done by us, there is something called Marginalia. Marginalia was built by the Basecamp team for MySQL initially, but it can also be used for Postgres. It changes the Rails [unclear] ORM to always inject a custom comment into each SQL statement it runs, which says: this is the line of code this query comes from, this is the controller [unclear] in the Rails case, and this is the request ID. This means that I can now go and look for this request ID in my Postgres logs, and I will see all the auto_explain output [unclear].
And then, kind of funny: everybody seems to keep reinventing the wheel ever since Postgres had wait events added. The different cloud providers and monitoring providers, everybody keeps reinventing wait event sampling. Azure does it, for example, [unclear] as part of their monitoring product [unclear] also shows you this kind of [unclear], and everybody does it a little bit differently. I think it would benefit everybody if there was one way to do this, at least one way to get at the data. Let's not [unclear] the visualization, but the fact is that everybody here is sampling with their own code. I'm sure it's error-prone, and I'm sure
it does things wrong in some cases. So having a community-blessed version that does the sampling and aggregation, like showing wait events historically, would be very helpful.
All right, let's look at parallel query, which is important. [unclear] Postgres 13 has a nice improvement here: if you have a parallel query actively running, pg_stat_activity will show you which processes are related, through the leader_pid column. Now you can clearly say that this background worker is a parallel worker because of this leader process, or main connection. Second of all, in Postgres 13 you have a good number of improvements around how EXPLAIN shows parallel workers: [unclear] data for each worker, sort information for each worker, and fixes to the JSON output format. Now, the thing that's really missing with parallel query, and that I've found challenging myself when I [unclear] a system, is that it's hard to see on an aggregate basis whether my queries are actually using parallelism. [unclear] I just want to see, in aggregate, am I using parallel query or not, and that's really hard to tell today. It's also really hard to say whether I have configured my parallel workers correctly. If I have a lot of parallel activity, what I see happening is that parallel workers are going to be planned, but they're not going to be able to be executed because [unclear], and right now that's not very obvious.
Next, failures. There is a nice improvement here in 13 [unclear]. Until Postgres 13, if you use the extended query protocol [unclear], where you pass the variables separately, then because you're passing them separately the statement text itself has, like, $1 and $2, and it doesn't have the actual values. If you had log_min_duration_statement set to 100, the log would give you the parameters. However, in the error case it did not. Starting with Postgres 13, there is a new setting, log_parameter_max_length_on_error, that actually logs these parameters in the error case, which tells you that this error occurred with these parameters. This is also useful if you have a timeout. If I have a timeout set to, like, 10 seconds, and a query got killed, this helps me understand which customers were affected [unclear]. I think there are a few things left, not too much.
Next, we'll look a little bit at the shared resources in Postgres. There are different kinds of things that we could look at here. Let's start with locks, so
[unclear] there is obviously pg_locks as a view. It's been there for a while, and it's pretty straightforward in a sense. There are also many log events that you could monitor, which I often find more useful, as pg_locks is kind of fleeting information [unclear]. The part that is really missing with pg_locks is that there is no aggregate view: there is nothing that tells me that, in the past, locking has been an issue. So let's imagine pg_stat_statements had a lock wait time column. It would tell me that this particular statement has been waiting on locks a lot [unclear], so that I can understand what I should optimize and what I should look for [unclear].
Looking at table and index access, there's a lot of information already out there, and you can make a lot of use of it. The thing that I found missing when looking at optimizing indexes, or deciding which indexes to create, is knowing which statements are experiencing [unclear]. You can do all the EXPLAINing for particular cases, but similarly to a planning time counter, it would be very helpful if we had an index scan versus sequential scan counter that just tells me roughly where to look.
Now, on CPU, I/O and memory: there are a lot of details that I can get just by going into the system. If I run top or iostat, or look at the cloud provider dashboard, I can usually get good metrics. There are some statistics in Postgres itself as well. What's really missing on the memory side is per-connection memory usage: knowing how much memory each connection is using, so that I can understand settings like work_mem in particular. When I'm running into issues with temporary files, I should raise my work_mem, but how much should I raise work_mem? Should it be 100 MB? Should it be 500? The only way to truly know would be once you run into out-of-memory issues. So showing clearly how much memory each connection uses would be very valuable to help customers and users optimize this early. And then here's a nice thing that got added in Postgres 13 to help on the memory usage side, which is that shared buffers [unclear] allocations are now clearly visible in a new pg_shmem_allocations view, so it's easy to say: this is why Postgres was using this much memory. And then again, something I'm really, really excited about in Postgres
13: you can now get details not just on the read side [unclear], but you can get a lot of details on the write side as well. When I have a write statement, oftentimes the work doesn't happen during the statement execution itself. The actual work really happens afterwards, when the files get flushed to disk and the WAL gets written, and then later on gets, for example, replicated. What got added in 13 is a lot of new information about how much WAL each statement generates and how much WAL autovacuum generates, and that helps you optimize [unclear] your application by optimizing your WAL generation. If we look at the new pg_stat_statements information, there are three new columns here: information about how many WAL records each statement generated, information about how many full-page images it generated, and how many bytes. As an example, when I was looking at [unclear] database, I was actually surprised to see that I had forgotten to make the temp table unlogged. These tables do not need to be replicated, and for crash recovery they don't matter. So they should really be unlogged so that they don't cause WAL to be generated, and with that I could already significantly improve the performance. Similarly, when I look at autovacuum, I can now clearly see how much WAL the vacuum processes generate. Let's say you have an idle system, or you think you have an idle system, and suddenly you see spiky WAL, and you wonder why there is WAL activity. This will now help me understand how much WAL each autovacuum run writes.
If you look at this for a particular query, you can now also see this for each EXPLAIN: if you run it with WAL, you see how much WAL it generates. Always remember, if you EXPLAIN ANALYZE a data-modifying statement, make sure to do it in a transaction and then roll it back, because it actually executes the update. You can now say EXPLAIN ANALYZE WAL, and then see for that statement how many records, how many full-page images and how many bytes it generates. All right, the last section here is about maintenance.
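The advice above, running EXPLAIN with ANALYZE and WAL inside a transaction that is rolled back, can be sketched as a small helper that builds the statements to send. `explain_wal_script` and `parse_wal_line` are hypothetical names, the `accounts` table is made up, and the parser assumes the text-format line looks like `WAL: records=… fpi=… bytes=…` with zero-valued fields possibly omitted; the sample lines are illustrative, not real server output.

```python
import re


def explain_wal_script(statement):
    """Wrap a data-modifying statement for EXPLAIN (ANALYZE, WAL).

    ANALYZE really executes the statement, so the surrounding transaction
    is rolled back afterwards to discard its changes.
    """
    body = statement.rstrip().rstrip(";")
    return [
        "BEGIN;",
        "EXPLAIN (ANALYZE, WAL) " + body + ";",
        "ROLLBACK;",
    ]


_WAL_FIELD = re.compile(r"\b(records|fpi|bytes)=(\d+)")


def parse_wal_line(line):
    """Extract WAL counters from one 'WAL: ...' line of EXPLAIN text output.

    Fields that do not appear on the line are reported as 0.
    """
    usage = {"records": 0, "fpi": 0, "bytes": 0}
    for key, value in _WAL_FIELD.findall(line):
        usage[key] = int(value)
    return usage


script = explain_wal_script(
    "UPDATE accounts SET balance = balance + 1 WHERE id = 42;"
)
usage = parse_wal_line("  WAL: records=1 fpi=1 bytes=8263")  # made-up numbers
```

The three statements would be sent in order on a single connection, since BEGIN and ROLLBACK only bracket the EXPLAIN if they run in the same session.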
So this already got much, much better in 11 and 12, [unclear] really started. The trend continues here in 13, where progress reporting was added for ANALYZE. This will help you if you [unclear], where an ANALYZE process used to take a lot of CPU [unclear]. This will now help me understand where we're already at.
On autovacuum: there is so much information already out there. Surprisingly, a lot of folks still have challenges with autovacuum [unclear]. I would say the one thing that could be helpful, to make more folks tune autovacuum more easily, is having not just the log events and not just pg_stat_progress_vacuum, but an aggregate view that tells you what happened, how long it took on average, or whether there are certain conditions, like tuples not being removed because there is an old transaction. That is something I think we could improve, so that more folks [unclear].
And then a side note, but a useful one: you can now track backup progress. In 13, I can actually see that a base backup is running, and I can see the different phases. So I can see the base backup, for example, waiting for a checkpoint, and I can also see, as the files are being streamed, where the progress is at and roughly how long it will take.
And that's really it. I think we talked about a lot of different things that are missing from Postgres, and here's the full list. I encourage you to also look on the hackers mailing list. There are many things that I did not include; these are, in my personal assessment, the most important ones, but I'd be excited to talk more. Also, Postgres 13 beta 1 came out last week. I think there are a lot of good monitoring improvements here. We are working on adding support for those in pganalyze as well, but many of those are standalone, so you can just [unclear] and get
the benefit from them. Thank you so much for your time. I hope you have a wonderful day, and I'm curious to hear any questions that folks have.
And we're back with Lucas for the Q&A session. Go ahead, Lucas.
Thank you, perfect. Sorry folks, [unclear] for a moment. There we go. We had a question earlier about how something like pg_stat_hba_file_rules would handle the pg_hba.conf file changing. This is a question related to one of my earlier points around how there aren't any security event statistics: there is no way to know, historically, how many folks actually logged into the database. There is a view today already that shows you the current HBA rules, but it doesn't actually give you the statistics. I think the point I was trying to make is that the really helpful thing would be to have each HBA rule, and then just have a counter that goes up, to tell you that this rule has matched this many times. Then, if I have a rule which is security-sensitive, like somebody logging in locally with trust, you could quickly verify that your system didn't use that. So if I know that I had some kind of access I didn't expect, I could quickly go check that all the access since the last time I checked, or since the last reset, was actually fully authorized. So I think that's something that, maybe I'll work on it for 14, maybe one of you folks will. But I think it's an important thing from the security monitoring perspective. Something else to mention: it looks like Postgres 13 has many good changes in monitoring, so things are improving. Just to repeat: when I initially started [unclear] this talk, I thought that there were a lot of things that are missing, that are not there yet, and in reality, I think, [unclear] actually landed a lot of