Experienced Software Engineer involved with PostgreSQL for about 7 years contributing features, reviewing, bug fixing etc. Participated in developing features like multiple synchronous replication, making wal segment size configurable at initdb time, default partition, runtime partition pruning.View the profile
About the talk
A native partitioning system was added in PostgreSQL 10 and it has since evolved considerably. In PostgreSQL 13 the existing Postgres benchmarking tool pgbench was extended to allow testing for range and hash partitions.
Choosing a good partition strategy is not a child's play. Partition done wrong can lead to massive performance deterioration.
This talk will summarise all the features and key improvements across versions including the new pgbench partition options. It will also attempt to reveal when using partition a good idea and when they can cause the performance to nose-dive with the help of benchmarking results using partitions in different settings.
Go ahead. So let us continue. I'll just give a brief introduction about me for those who are not there. I'm a senior software developer at Enterprise DB. And I'm going to be talking about petitioning, I'm going to give us an idea of what are the features that was basically for a while. And then also give some performance numbers on how it behaves in various scenarios. So let's begin of partitioning, what are the key terms that we use in the petition terminologies and then be partisan benefits that we can get by using this feature? So what exactly is
partitioning partitioning is simply what the name suggests. You are going to divide data of a particular table into smaller parts of smaller tables. But in this case, what the application of usable, see still a single table, he will still be fighting on a single table, but internally you are able to Target small, different tables are individually, which gives you a lot of benefits or in various scenarios. No, the ketones when it comes to petitioning. The first one I will say is a partition table to partition table is table in which, which itself will not
contain any rules or but it's simply redirect the data to child tables with a full partitions. And the partition key is basically a column or multiple columns based on Whose values will determine which partition despotic rule, will decide in, which determines the rules that are contained in the petition. I can do in this particular example, the first partition can contain rules that have a loose from 1 to 10,000. That's a partition bound of that particular
position. So that is Patterson bones and it is different for each partition. I'm sorry. So a few of the partition benefits you could say is data segregation kind of provides of segregation. You can divide database run time and do different qualities. So if it happened that you would have some data which is more frequently accessed than some other data so you can move. Those must be trained so that the performance of the overall system is quite good. Random. Yeah. I'm sorry. So the second one is maintenance
tool. You have to go for more, regular maintenance activity data to keep the system healthy and having partitioning alamls you to perform maintenance activities on a smaller set of data and you can perform at 5. So, in that case would actually be able to perform activities much faster than when you had to perform in the whole big table. Obviously because of this you can have the database to be better, scalable and also have better performances. So, back to start of partitioning with
the supporters in postgresql, Wonder is the inheritance or trigger based petition, which was there. Since nine point one and then the new were declared to partitioning which has recently introduced when was Red Skelton. The problem with the entire inheritance, a trick of his petition was that it was manual in that you have to create trigger on the parent table. I also or chickens when do the tile table and how to manually maintain all this data. So there is no way. You can have an automatic check to see whether there is or the constraints of Vijaya petitions
are mutually exclusive. And hence it was very hard to maintain much are front because if you have to remove part of the trigger and then remove the check constraint and there's a lot of stuff that you need to do to perform a simple operation. So to get over this decline to partitioning provide some more than Automation in creating a petition data and it has a simpler Centex and hence, you how you're able to manage the partition easily because the mutual exclusion is a handle. If you are providing a Mission with Olaf any other data than it would to an error
and you no longer have to do the triggers. All the check constraints. That is kind of the begin behind the scenes thing that happens right now. The three partition strategies that was basically support a list range and hash. So when you talk about petitioning, this was introduced in postgres. Skeleton in this for each of the partition bounds is the, you have to explicitly mention the values, that each partition can hold a candle. To example, one person could hold values for LG and another can hold for Mumbai and Goa. So you can buy one, or
multiple values that a particular partition can hold. The next is range. Partitioning this again was introduced in postgresql 10 or in this. You specify a range of values that particular partition can hold and the news on Spotify of a partition range. So sitting from hundred to Five Hundred. So that particular partition can hold rules, which have the value hundred, but not of 500. Go to Lori bounds inclusive, but the APA Bangles, not to and bounded values that has. Meanwhile, you and Max value which is exactly what it says that
I'm about you and the maximum value. So it is basically, I kind of invited to get to vent. Yeah, the Third Kind of partitioning partitioning. This is introduced in postgresql 11:00. And in this you basically Randall value in their hash function or basically divided and then depending on the remainder that you get, you would determine which partition this rule will go into. So this is kind of ghetto don't have a very logical of a way in which you can create a
petition and all those cases where if you have to give a list of partitioning that are you getting multiple small, very small petitions. We would rather have less number of partitions but the data has to be evenly arranged. So in those cases that can help So this section, I'm just going to be talking about the various declarative partitioning send texts that you can use starting with how do you create partition table Auto Repair in table? So that the new clothes that has been introduced is Partition by. So you
just ride the create table, and as with all the parameters that you have to specify and then you just say Partition by and then mention whatever strategy that you would like to use for partitioning and died. But that particular table, and I feel like I have given this example for Lestrange and has to, basically I'm just stated the create table, then, whatever restrictions you have a, do you have a constraint for anything? You can just mention of a Kind, the new ended by saying, petition by list. And then you mention a partition key. In this case is just call him handsomely for range
and hash or simply write the strategy name, whatever you want. Change o. Hash Now to create a partition, you have the glass partition off between mention which table style dresses, and then you mentioned for values and then specify the partition bones of that particular Edition, So, what list you say? Okay, table partition of the parent table, then for values in, and then you specify each of the values that this particular position, you would say, I would say, from the lower Bound for that particular position
and to the upper bound of that particular position. So, in this case, this petition will accept values from 10 to 20 all four hash you would have to specify the modelers and the remainder, for this particular partition using the keyboard with a specified in the last So there's a possibility for you. You might already have a table, which contains a particular data and you would like it to be a partition of another already partition table. So you don't need to copy data gain or anything. Can simply just add the table to the
partition table, as a new Partition statement helps you just stay at the table with OpenTable. You are considering I meant to say, attached partition. And then you mentioned the table name, the partitioning, and then for values. And then you mentioned the partition bounce, like you had done previously, similarly, you do it. So, I listen, example, for all the three cases here so that you can use it. One thing in this is whenever you try to Ida partition, all the party of all the data in that particular, partition will be
checked, whether it meets the partition balls that have specified. So if you're good, How was your last table with your going to be hiding as a partisan? It can take quite some time because all the entries are going to be checked, whether it is meeting the particular pound. So for example, I have attached the new Phyllis store Phyllis that I don't see list to, with the values Mumbai. So it is going to be checking to see whether actually all the interest do have a lima bean bag suppose you accidentally had a rule, which had been
here. So the amount of time, and we'll take will depend on the amount of entries you have in that particular table. No, no way. You can create a petition to refuse the partition of command. So it is already a petition. But you already have a table that you had a standard table for some reason which had the same character, same rules that high. And then you suddenly wanted to be part of a petition 3 Okay, so are you. You don't have to create a table to make it a petition,
the petition, and that's creating a new Partition. Then you can use the partition of command, which I talked about previously. So, yeah, the partition suppose for some reason you no longer want a particulate able to get partition, but you wanted to be a standalone table, then you can use the device partition command and remove it from the partition Tree. In this case, it will no longer be restricted to the petition bound to suppose you remove the petition which was taking Mumbai previously. You have said the dead detach so you can now and make any interest in food. Allergy or
Kolkata, r18 can put all entries and that because it will no longer have that package inbound restriction. But all the restrictions like the primary key. And if he goes there to hide. Did Ivan. It was part of the petition tree. It will still be present if you no longer want to stay there functional. When it as a standalone partition, you will have to manually remove each of those restrictions supposed to say, this is detached in this case, the data is maintained by that, you are no longer part of the petition. Tree was supposed to no longer. If you want to see the new girl,
as normal, you can use a drop table command to remove the partition, so it will also detected from the partition partition. Okay, so I'm sorry. This going faster with Yes, go again. This is just want to be talking about multi-column partition. Like I said, the partition key can contain multiple columns and also multi-level after your phone, the petitioner will again petition, the table. So Looking at multi-column partitioning. This is supported only for range and hash type of partitioning and the number of column limit is 32
All in multi-column range, partitioning whenever you specify any number of rose for each of those columns. For each of that column, you have to mention a lower bound and upper bound. I have a partition by three columns, and trees. I need apartment. Three entries. So this means that. So, call him, when I'm accepting that is from 10 to 500, to call him to from hundred to five hundred and four column 3 from 50 to 150. All in this case. So, what happens is that when you're using an unbounded value, that is Maxwell. And meanwhile you if
anyone knows those rules petition to call them has a longer value than all the other columns that follow, it would also have been bonded by you, like, in the first case, the second column has many values for the third column has to be men. Value Inn in the last example, since the First Column itself says that its maximum value, then all the other columns that follow the laws to be maximum volume. Mention a specific value for it later. Okay, it's so the big, the insert has performed for this particular petitioning. Is there a is there? A
comparison operator. So all the elements of that particular rule that is being inserted, good witch from left to right. I know the first unequal element it would stop so that you can see it or 250. This one will accept 01909. So you can say that for the second column 199 is not between 0 and 50, but this satisfies, the First Column, which is from zero to a hundred, you can enter into the into this partition. And similarly for the second 100 to 4900 the First Column is
equal. The second one is checked and that one satisfies the requirements. So this is also entered into this partition but hundred fifty. This, this particular value is not satisfying because On the second column is equal which is not allowed to cuz it has to be exclusive. You cannot enter 450 again hundred and one and ten second column satisfies the condition. The First Column does not. So you cannot enter into this partition. The second type of multi-column. Hash partitioning is the frame thing
as a harsh man in this, you know, you don't need to specify a different Bound for each of the petition Coca-Cola specify it once this happens because he has function is done for each of the columns involved and then combined to create a single Hospital, you depending on which you are going to determine the partition in which the Super Bowl has to go. So this is an example. This looks quite similar to what the previous case, where a single column was used because you no longer specify multiple values for multiple columns.
So, again, talking about multi-level partitioning, this is when you create a partition partition table, so I have already bought a child. So again, I'm going to create a data partition table. So, in this case, before you had mentioned Petition of parents and then you mention the bounds and you can continue this by saying, Partition by and then what about Saturday and partition three columns that you require for this? So This Way, continue to be partitioned So you can see at various levels of petitioning. So
recently, PGA events has added few options that enable you to Benchmark partition tables. So, there are two key words, which are being added, which is partitions and petition, the petition has to be a nonzero, integer value and partition method can be arranged or hash, by default, it was as big and you cannot mention partition method without specifying the partitions option. It went through a narrow table and one of the table speed events account, which is the largest create
that is, what is partition on the Eid column, which is also the primary care of that particular table. So I knew it create or range to partition using the Pee Dee bensch options. What it does is Waterworks can you specify that that amount of rows? It split across the partition equally. So suppose I use a scale one which creates one like rose and the number of partitions, the stents, then it would create of petitions which can hold 10000 values, each which is equally distributed. And the lower bound of the first
partition is missing value and the appointment of the last petition is Maxwell you. When you actually do the couple's can move across in the partitions without bring an aerosol can hold any values. I'm for hash partitioning. What's the phone number of partitions? You specified to be created that would be used in the model or function and then the remainder of your Fleet will be from 0 to what does petitions -1 again for scale 1 and petition contest and the first petition would have more diluted mean to zero and so on it'll go the remainder will go from zero to nine.
Plus 10 reminder 9. So this I had perform somebody close up test. So like I said, the PD Bank account stabilizes largest table which is created by Fiji bench or so. To enter such a huge data, it uses the command, copy to insert into it. So I had just been smack that to see how it performs. As you can see this, the x axis across various data is how it has performed, and the y-axis is basically the GPS, the data of each of the television talks, about how many petitions
I had created. In that particular case, So if you go to see as the data size increases, the amount of time that it takes, two of us to update. The system partition table is, it's about 20 to 25 per cent more than in the and petition case, which is denoted by the doctor red line. And, but if you go to see, you will see that. No matter what the buyer protection count is, it is almost in the data of the time. It takes to insert the data is almost sing. but when you do this for the hash partition, in this case of what happens is
that there is obviously a difference when you increase the data size, but you would also see that as you increase the number of partition it again greatly influences, the time that it takes to load the data. So, this happens because of the nature of Daylight by the ocean table, when you bring range, I owe you one in setting for shovel. The data is inserted in sequence. 12 does Onelife cruise is going to be insulting 1 2 3 2 1 lakh. So isn't working in secret. And as I had explained, the range partition
in range partitioned, partitioned rated in range, so it would have one to 10,000 first. So what happens when you are entering, the first partition is filled of all the date of the first petition is full. Then it switches when it enters data in the second partition and then it goes on. Vitaly partition for the high school because of the nature of the heart function. What happens is that the first in soda does value, bond has inserted into partition one, but value to is inserted into position to I'm pollutri, is inserted into partition 3 and this goes on, and then again after it
completes all the partition, it goes back to the first one answered. So what happened? All the partitions? You know, our individual tables. So the petition number of partitions, I had a bus from 200 to 500. So for 1st 506th, 500 times the number of tables at a switch just five hundred. So that's the reason why the amount of tables switching that happens is equal to the amount of data you have inserted, but in the packaging, gives the table switching only happen.
Dance position because the rains dinner. So, very best case scenario and he has the notes of a worst-case scenario. So if you're going to do a bike inside command, on a new partition table, then please see to it. That the data is it was for the range case with the different, or the time minimum. But even in the range case, that was about 20 to 25% of the creator of all the partition constant listermann check, which partition, this particular rule goes into a
little bit more time. So this was another times that I had before, like I said, or the primary partitions column has a primary key which is created by default. So I manually drop the stable to see how it was before, when you don't have an index on the petition, key ignition case, no matter what kind of position you do range or less, the behavior has similar. Because what happens is when you select star from piggy bank account where is equal to M, it would simply select one particle of petition and then scan that particular partition.
So as the x-axis did not see number of partitions. So as you can see, as I increase the number of partition, there is very good Improvement in performance when I'm using speak text, the wife has happened since because for the same amount of data if I'm going to create a large number of partitions, the amount of data in each particular partition reduces. And so I just got on a very low data. The performance increases as you increase the number of petitions in this case, so,
This is the been almost ready but just using the index, the primary index which exists in PG. Bunch of this has about 50 GB of data in Texas. So again, the greatest similar we are just now getting one particular drop in performance in this, when you are using the petition case but the range or has doesn't make much of a difference, but there's a 40% raw. And as you increase the number of partitions, slow decline in the performance as well. And then again is because of the overhead of handling. So many politicians and then checking out which partition to go for 4. Each time you fly the plane.
This is a custom query that I had check for. So I just reply to Target about 0.2% of the rules of in a 50 GB database button sequence. Like I said, you select star from PJ Banks accounts with a ID is between 191 like something like that. So it is checking for real queried arrange credit. If you would see that. The performance of range is very good, but hash seems to be in your body and petition case and then slowly decrease, as the number of partition increases know this is because in range case
you are targeting at most one or two petitions. But in the house case, you're not only having the partition overhead but you're also targeting all the petition. So that's the reason why you see this difference between orange and the hash petition case Soviet I'll be talking about a few other features which are supported and how they perform. Soda for partitioning, this was introduced in postgresql 11. This, this was introduced so that you are able to catch those couples which
does not fit in any other part of shoes that you have Define empire. Enter a rule, which contain the value of which has not specified for any of the petition should be throwing a row. No petition form. So the oven reduce its default supported for both lists and petitioning, and the way to created table, child edition of, the following table, and simplicity fault. One downside USA of this particular case is a game similar to the, a partition, which is
better for a new bone, is going to look over the default table, which you have created and check, whether it is going to violate, you know, any of the new Partition pounded that you have. I suppose, I've ordered a new petition from 2002. 2500 already has a rule which has the value 2207 to now, it has to be in this petition. So it is going to violate. The things was going to throw an error. So to address partition Suppose, there was no problem and you
could I tell you would spend time in scanning your default able to check the noodles are violated. So I had just done a small case, I had to default position in which I had just pulled one coronavirus. Should I took me 1382 to Simply add vitamin d-45 partition walls in your empty or very few rows and just took me to milliseconds because the amount of time to scan the table, London, partition pruning. This was also introduced in postgresql, 11. And this is performed at two levels at the executor initialisation and
then add the actual execution of the new parameter that has been ordered has enable partition pruning, which is default. It is switched on. So when I say executor initialization, it usually means that where you create a statement and the order to prepare plan. But when you are trying to execute the prepared statement for a particular value to determine which partitions with can choose for that particular value size, you can see when I said execute for Valley, 1500 it determined that it does not
require two out of the four positions that were in the tub party back table. So you are special in the plants remove to satisfy those particular criteria that I have given. This is the actual execution is the second type of current temperature, including occurred. This happens when you're using a nested Loop joints. So I'm going to. And petition case, I would behave. So I have got two tables, one condemned 6000 route and another country's 4,000 troops. But there are only two thousand Euros which type of spider has joined condition which is called him one equal, two column to
in the second table. So this one is it Loops 6000 * sin II, able to check arrived to perform the joint and the time ticket is about 15 milliseconds. Similarly, but when I do it in a petition case, in case I have, you can see her first. The table table has got 6,000 roast, and rice pilaf or six thousand times, but it checks in the petitioner that it satisfies. The given values that has come in from the L table. So I will be wonderfully, scan for 1500 times tables. Never scan because of the values in
the table satisfied, the rules of the stable. So it did not bother scanning. So, and I just mention my never executed, which means that despite the opposition was never scanned in this particular query and then peed 300 times because it grows in the table satisfied. The condition of this petition, So in the smaller case, I was able to see a drop of about 30%. So this kind of goes increase as the number of rows are there trees and how it behaves, it depends on the kind of panties that you have.
Okay, so next, I'm going to be talking about partition, my joint this was again introduced in postgresql 11, and this also has its own diameter, which has enabled partition West join, but this is my default off. Because in many cases, it can now create lot of planning time can increase. So if you feel that for your particular case, it is important, you can't stress this on one restriction for to perform petition has joined his. That bought the tables in world has to be partition and a partition bounds of both of us should match. So if I say one petition is 1
to 20, 20 to 30 and 30 to 50, then would the tables has to have the same partition Bound To Live. Then it can perform this joint. and I just, Okay, so this is a case in which I have not been able to setting. So, in this case, I have tried to join the tables with 4,000 rules on a $40,000 and Twenty Thousand Rules for scans. All the positions of the first table and then it performs a has joined a by joining all the, all the rules of the second table and in this case is 37 milliseconds.
No but I enable the petition was joint which is equal to on. You can see that it performs the highest during the same type of joint that goes profound previously. But it will funds for each of the partition to Evelyn and Evelyn is drawing and then it went to Israel and this is repeated 483 MP3 and MP4 MP4. And you can see there's almost a 50% reduction and execution time. But football farming business for performing this kind of joint. Next is partition level, aggregation again, a feature of what's best for 11. This also has its own time by siedah Garrett
reaches of my default. So when you use Glue by, or any other get function, or which uses the partition key, then the other aggregate function is pushed down into the individual petition. I suppose it is not on the partition key. Then what happens if I should, I drink, if I should, I get this for fun and then you, do you combine it at the upper level? So it's supposed to foreign table in between then this aggregate functions of push down for the folding table. So in this case or
doesn't own petition case. So I have to perform a group of a function, rule back online and this will simply call it had fallen upon which was going to store for scan every partition and then perform the aggregate function Group by 7 milliseconds. And then when I said the diameter equal to on and then I perform the same, and then what has happened is that for this aggregate function has gone down into each partition so it has been performed for t, a p in one and then it has also been performed on the P2 and P3 know the
pain is like this command has been called on the partition key. You know that when you are in the position, you're going to get a eunuch. Final results. 20% migration time. So now this is the case, but I'm talking about where I'm not using the partition key which means that the roast of a particular value can exist in different positions song, my help because it provides partial partially checks of performs, this aggregate function within the partition Island. Finally finalized has a gate to combine whatever duplicate set finds. So as you can
see this to petition says, God value 8, so it performed the aggravation there for 10 and 5. And then finally, when it came out, it was able to combine those to create. One thing are equal to 15. Sold in postgresql 12. We had certain features to check out without the partition free information to get out from the petition that you have. So when you say p, d partition table display. The entire partition tree of in a table for my dead, nephew cpg, partition and sisters of a particular table.
Then it will display the ancestors of that particular position to the route and precipitation route with display. The topmost root of our petition tree. Yeah. So this is just an example of each Rose Turner table format, the petition tree and also gives you certain values. So, in this case of 11 is for the partition doors, does not currently have any partition to itself. It is a partition table and that's why it's Leaf. Column says, P1 is false, because it will have children. So I'm then the level specifies in which level that particular table is selective. 101 is Level 1 and
11 and 12 a level two. and if I don't feel like this is immediate parent of that particular table, COPD partition route. Like, if I see. Give me the PG, partition root of table Pete. Well, then it will say the most top most route which was given and suppose I want to check the ancestors of my current Table. Then if I like to just a Select Staff from PG partition and sisters and it gives me they were Saudis siding from Pete. Well then it went to be one and then if he went which was the route
So I would like to see that position. Does help a lot of scenarios but not in every every file command does not to come faster. So just know your data, know your system experiment and try to find the best Best Buy the best partition parameters for your system before you just don't blindly partition. So that is it and we are currently hiring so you can connect with action yet. That's awesome. Giving up any questions. I'm sorry. Does it come and see how Sean said, goodbye and
have it for the entire system? Far all the partisan. You said we have to create the child table explicitly, right? And upgrading all the partisans. Manually will be, something. Is there anything? Working on it. So not in the open-source code but has In the multi column party thing. How the protein I feel like if the First Column they in the Santa Clause does not contain the first volume of the second column. There are multiple columns so it is able to run by purchasing
walking range partitioning partitioning. Like if you have equal to something, then in addition case it would be able to print the petition but if you say column to equal to something then it will not be able to perform the partitioning properly. how do you partition an existing table between legs really like Table using petition by command and then you have to copy the data. Minor, you don't have direct, you cannot create because partition table cannot contain any data. So you must have a lot of
downtime of course. So you will have to create a partition table, perform the poppy and then go online because otherwise whatever you can do it. Excellent education. I would like to know how can you be able to more than one parent to parent? No, you cannot do that. If you, it has to be in her that you only one parent has, let me know if you're going to throw some it would you cannot you can I had to go since I bought this and using a key and joining with the other
normal jointly, okay. Then it would not help because it would basically be scanning the entire and everything will get joints. Actually, I'm not able to get a personal question for y'all saying, or depending it would help because the scans performed, a partition table would be determined by the rose on the on partition table. For table, will be joined with only child partition You mean the parent. What about foreign keys? I would appreciate if you would discuss with being offline because we have to move on with the next present. I'll be waiting a baby if you have any
Buy this talk
Buy this video
With ConferenceCast.tv, you get access to our library of the world's best conference talks.