Principal database engineer at Amazon Web Services, Co-Chair - PostgresConf US, Director - United States PostgreSQL, Co-organizer of Philly PUG & NYC PUG, former CTO at OpenSCG, former Chief Architect at EnterpriseDBView the profile
About the talk
So, what? I'm Jim, I do primarily work part of the RDS postgres team work closely with the Aurora post Chris team. But we're going to talk about today or just a pretty General use cases of migrating from Oracle to postgres. I've been doing Oracle the post-race migrations the most of my career first. First migration I did was almost twenty years ago things have changed a lot but these these top 10 kind of come up all the time and I've never met anybody that made all 10 mistakes and I've
never met any of anybody that made none of the mistakes. So just kind of Irma show hands who's who is done a Oracle to post Chris migration before. Who's in the middle of doing one right now? And I was thinking of doing one. Call. Okay, so don't be ashamed. Like if you made some of these mistakes everybody has right when you were in the earlier session before we're stalking about amazon.com, migrating, everything off of Oracle. Amazon.com made some of these mistakes to write everybody does. But project is to
kind of learn. Why did it usually happen? One of it is because there's a looming deadline, you have an oracle renewal coming up and your CFO say we don't want to pay that bill. Get off now. So you have a month that migrated that's not realistic. But people try another one is not getting the education that you need. Not learning enough about Post-Crescent how it works assuming that you know, relational databases you know oracle you could go ahead and just do it. You're seeing the world through an oracle Lentz of Oracle doesn't this way. So
why isn't postgres doing it this way? That's one of the major problems as people are. Moving it around. It's a different end. And another one is using their migration tools and some of the short cut, some of these mistakes are made by the migration tools themselves. Because the migration tools are there to help you not do a hundred percent of the work, it's there to make it to 80 or 90% of the work. So it has to make some assumptions. First one is system tuning. What time is there a translation guide
between the and it's a profile in the post, I know there's not and don't think you could use that as general guidelines but they're different parameters. There's a different engine about it. If you were Drive always driving a gasoline car and then all the sudden you got a diesel car it would drive about the same. But the first time you get to maintenance and went to go refuel, you go to a different pump. and then if you had other maintenance and you wanted to tune that car, when you go with a diesel engine, you're not going to go replace the
spark plugs under the covers under the hood. It's different even though from the surface, they might appear looking the same. So you tune things differently. Something like the SGA everybody said oh it's just like Jared buffer. It's not there's different things inside. For example, inside of the SGA you log buffer but postgres has a separate. You can't tune them. The same music as general guidelines of where you might look at and in order to tune the postgres left on file, but it's not the same. Hable spaces,
Oracle, you need stable spaces. It first thing you do is an oracle DBA. You start creating tablespaces, you use them so much that Oracle created features in order to automatically maintain the stable spaces. And if you want to create a table, you're going to put it all in one tablespoon and indexes are going to go to a different table space. You don't do that in postgres in postgres. Every file. Every index, every table. And every index has its own file. A table space is just simply a directory. So you get no benefits at all. Unless you're actually having
two faces, spend multiple Mount points in multiple disc. Lot of times, you might be connecting to a single large storage device and the benefits of tablespaces aren't there. There's no reason to add the extra complexity or actually make things like backups for punch breath. A lot more complicated, if you start using cables faces, and don't even think of using of it, unless you have extra discs in order to be able to do it. case folding, so Oracle default to all its meditated uppercase Postgres,
its all lowercase us pretty closely follows the standards. I told Auntie standards. This is one place that it doesn't. Standard size that matters day or she all in uppercase o streets in lowercase. So dorsa migration tool to do this, and you might have a case where you're trying to make everything all uppercase, what all looks the same. You could do that by putting double quotes around it. It becomes extremely tedious because you have to double coat, everything forever. Having like first name, middle name, last name and users all double
quote, it just becomes error from some people have a lot of application code. That depends on the case of that. Make the investment early on to make that case insensitive and make that switch the application. You will always run into bugs. If you try to force this thing to be uppercase on the database. Go. For the duel table. Oracle needs to dual table whenever you're trying to get Alexis date from duel. In order to get that, you're always have a monitoring Solution. That's
doing select one from Dual, that's always paying the database used all over the place, find it in the code everywhere. So what a lot of people try to do is mock up the Dual table, I'll create a table called duel, with a one calling called dummy, and insert one row in there with an X. And then you could have select current date from Google. Cool. Had one customer years ago that did this. And somebody came along and didn't insert into duel. You didn't realize that was happening, somebody did that? And all of a sudden, everything started breaking. Somebody start doing Celexa State current
date from Dual and they start getting back to Rose instead of one. It took us a good couple hours to figure out that that was happening. Reality is postgres. The fun Clause is optional. You don't need a dual. You could just a select. Use the Dual table as something to search for in all your codes. For some Oracle specific code and remove that. It is one of those great keywords to find out where your code is Oracle specific, and not using just standard sequel.
but if you do need to, In a thousand places and it'd be too tedious, don't do it as a table to view. So this way you don't have that possibility of inserting a row in there, is that you could still if somebody could come along and change that view and make it return to Rose, or do something else to it, but if you have to do it as if you Synonyms postgres doesn't have synonyms. Heard a bunch of times but has never seen him. So I can't migrate my application that depend on
synonyms. All my tables are in different schemas and I don't have them fully qualified. Okay. Cool. But if you're there for convenience so you can do. Select start on a set of sweat star from Scott. M. Facebook, that was really cool. Search past acts a lot, like a directory path on your operating system command prompt. So if you wanted, if you had multiple things, see how to kill moles on your system, whatever shows up in the past first, that's the one that's going to run.
set that works in a similar way inside of your database across Your search Beth is going to set it out of a list of schemas in the first object that finds a long-ass. Earthbath going to return without fully qualifying same way in your operating system, you don't have to put the fully qualified aim of java. Exe. You could just say Java just find it search pass works the same way. So you get these three functions, what are the other cool things? Postgres has multiple procedural languages. One of them being sick. So if you
have a very simple procedure or function, that's really just a simple select statement, you could use it as a sequel function post. Wrestle actually treat that more like a macro, it'll do it at part-time and injected into the overall query, a lot faster than doing it at execution time and do. User one. Get in there just as a select one user to. Get in there cuz it's like two in public. Get number returns 3.14. The public steamer is usually they're inside of everybody. Searchpath if you put something there to be visible to everyone
like raiding a public synonym. But by the fault, if we do, select get it in order to run that function. Not found the default search. Pastor postgres is dollar user. Public, public, so it doesn't find it. So we can start our search path. Search Pat is a user set up. Both parameter that you could set in your session, you could set it at the user level two data base level or globally in your configuration file. Or we could say user one user to public. So now we could do
select get in and return one. then we could just simply change her search bath now to user-to-user one public run, the same thing again and now we get to Just simply changing that changes, which function, that runs. I need to change out whenever we want, it's pretty straightforward. MN select. Get number. Since it's in the Public's key Matthews is 3.14. So a lot of the uses for synonyms. A lot easier just using searchpath you don't have to create a synonym for every one of your objects. You just set your
search path based on the scheme is that you care about We'll take questions at the end because we have a lot of slides. Sono. First person, Oracle handle knows differently. Express actually follows the standard Oracle not so much, right? But you have to account for him. This is one of those places that's more subtle that's harder to find that you have to do. A lot of testing around in order to be able to see you, mostly see it around string concatenation The Crux of it is. Oracle treat smell is equivalent to empty string in
postgres. So if we had our people table with first name, middle name, last name, and we wanted to concatenate that together for a full name. Concatenate, those strings together in Oracle will get 10. Famous people going from Elvis Presley to Bill Gates Albert, Einstein. No, we could take that exact same select statement and put it into postgres. Now, all we get is John F, Kennedy, and Martin Luther King. Everybody else has no. Is it had no middle name in
postgres whenever you can cat and eat anything with a no you get a no. That's where you going to see it most often. So what you should be doing whenever your change that is you should be coalescing. Everything. So this way equivalent in postgres of nvl. Postgres does have a new extension, their call Dora fce, that adds a bunch of Oracle built-in functions. Like in DL if you wanted to stick with that but you're much better off going using native post. I sent a story in migrations. But even better postgres
has some built-in functions. That handle all those notes properly, it make it even easier. There's a fan cat with separator. Function, that you could give it a list of things and it'll automatically do the right thing for you. If you want to build all that up, makes it a lot simpler as you're doing the migration, since you had the right to change them code there. Anyway. Go to some of the Native first person text. This is the one of the ones where you don't notice it right away and you might actually be out in production for a couple months before you bump into
this one. Nose with unique constraint. So, you can create a unique index on a table with an olive account. So we have recruiting it on table called Knoll test. Would calling one calling to see what is 52. Snorkel will insert a new one and then the letter A no problem, then we could just in to see one. Search the value one and it's oc2 is no and we try to do that again. We get a unique and straight violation. Postgres create the same index on C1 and C2. same thing, we answer one in a In wanting to
know. Then another one in another, no. Postgres know. This is not equal. No. So it doesn't violate the unique and shrinks to know that you're not equal one another. So when you, if you have a unique constraint on an olive, what column postgres will accept many know, where is Oracle only except one? So this is something that it's usually somebody didn't think of their table design it right? But it's kind of been there for years, but it's something to be able to look at as your migrating things over to postgres, if you
need to be able to keep everything the same. You have to create special constraints on here to check from though, because lo and behold you're going to get duplicates in there when you never got him before, which is why you might not see them. When you first go out to production, it might be a rare thing that might happen. Once a quarter, once a year and lo and behold you going to get duplicates when you never got it before his post-race accepted, this is actually a standard the way it should be done. Oracle handles it differently.
So fine, tuning Court. Had an index in warchol, and I just copy my query over those purses, not using that index. I had a hint there. Why not? Part of it is host. Chris doesn't have hints. It just treats Oracle hints. His comments. It does have hint available through an extension cord. PG, hint, plan. But those hints are different than Oracle. Hence the operations that postgres does under the covers as the executor does is different than work. So your hands are going
to be different The optimizer is a different Optimizer than Oracle. You have some similarities as both, a cost-based Optimizer, but things are still a little different other things is I didn't answer in index it on Oracle. I didn't need it on Oracle spine. Why would I put an index? On postgres. This post has a lot more index types in Oracle that you have a lot more options available to you should have just a b tree index. You have a hash index of Jenna, just a Brenda bloom, a rum index. And
there's many, many more that are there is extensions. There's different things that you could end exit on postgres that you could never do on Oracle. So there's a lot of situations where you can make postgres, perform much better than Oracle does by using some of the specialized index types. What about if you were in all legs, full text search Talk yesterday, mention trigrams. You might have to be looking something up with a light queries, it's always going to be slow there in Oracle. because you a wild-card before and after I'm scratching to create an index
on it and it'll use the index with a like or and I like it uses that gin index in order to be able to look that up making your query lot faster. One of the biggest things that I see is a mistake, it's not leveraging the native features of postgres when you do in the migration. People are scared or nervous. It's a big change. So they wanted to be just like were closer getting there. I had to my store procedure written this way in in Oracle, I wanted written exactly the same way in
postgres. Every single time I saw somebody be that Richard, it has to be identical. The migration project fails. Leverage The Power of postgres as you're moving, there might be some places in your code if you need to be able to run in Oracle in postgres that you have some if statements if work will do this, if postgres do that, that's okay. Because if you try to make postgres act like Oracle, the Project's going to hell, you're not going to get the performance that you need out of her. And it's going to be really difficult to maintain
because you're not doing native things inside the database. ISO leverage, what is there in order to be able to get the full power out of that has a lot of really cool features like different store procedure languages, you know, how to do everything and feel PGC Born date of rappers. You could do different connection to out to different. Different database systems or not even databases and making rest calls different data types. We're having the Oracle Oracle spatial when you go to post you, I asked some of those things just make it so much easier if you use the special. I
Or we hurt her. We heard Markham at mentioned earlier about the specialized data types around, say I night or cider. Did a a migration couple years ago where customers collecting a lot of IP address information. And they had these Oracle packages that wasn't the lines of Oracle packages in order convert things from bar chart has to be restoring. All the IP addresses is Bartos and doing arithmetic on that in order to be able to calculate whether or not things are inside of a subnet are not things like a 9 at data type. It allows us to do
simple things in the where Clause by using specialized operators to find out. Is this hosting that subnet Online. So when we were doing that migration years ago, In order to convert those thousands of lines of package code. In Oracle, I hit delete all natively. But the customer was a little nervous there, like, we spent months and months writing all this code. We need to convert it over. This made it go a lot faster. The project actually finished on time and I perform better because it
was made of inside the database. Don't use it convert everything over to post or send text, you'll never finish the project but around specialized areas were postgres has a unique Advantage. That'll save you time to migration project don't be afraid to use some of that. Specialized thing, special around things, like data types, and stored procedure language is another things like that, leverage, some of that power, that's there. so exceptions, most people making this mistake,
Oracle just about every store. Procedure has an exception handle in there. And all the migration tool to just migrate them. You have this function get first name? Write that all returned. The first name from The People table and we have an exception in there of when no data found return. No. When the all the tools convert this over to postgres, this is what we get to one of the reasons why so many people migrate from Oracle to postgres. Look, it's just really the header drink. Postgres itself,
it's stored. Procedure language is very similar to Oracle. So things like this, just work. But it'll pull over everything. Excluding the exception that's there. But reality, under the covers inside, a plpg sequel, or postgres is doing in order to have that exception their uses sub transactions. So not even going into that exception black, just a present of that exception Block in their store procedure itself causes a performance, it a measurable one because what? That translates under the covers is this pseudocode, hear it does. It's a
point of a hidden safe Point, it'll run that and then it's going to check if there's an exception and if there's an exception is going to roll back, that that's a point and then do whatever logic you have your own except for black otherwise, it's been really, really, really expensive to do. Especially if you have a lot of stored procedure code and you have nice tits or procedures, You will measure the performance of having these exceptions. Reality, most of the exception blocks inside a post restaurant necessary. You may have needed them in Oracle
postgres, behaves differently. So here we just run that without exception block and we ran this or get first name of gate with a capital G. We get back bill. Engage with a lowercase G. We get back now. No exception handling there. I just returned. No. The truth of it is it's not all the Oracle exceptions are postgres exceptions. So if we change it a little bit, we did out of it. Exception Block in there. We return back not fat. We run the first name, we still getting the no. We're not getting the not found. because in postgres,
in order to get that work alike behavior in order to throw this acceptance with no data found or too many rows, you have to do is Select into strict in order to throw those exceptions, if you don't have that key word strict in your select into It'll never go into that caption block. You're paying the performance overhead of having that exception block there and never using it. I see people making this one all the time. Then we can finally get the key not found when we have that select into strict.
Everybody makes this mistake. Choosing the wrong data types. Do in Oracle, you have a few Min data types, your business application is you're going to have our tools, you can have dates to get us some numbers. You might have a couple clubs and blogs but that's about it some other day to tell you this, but just about every application has these that's really what you use. Postgres has 64 Bay stated types when you just install it out of the box and it can be extended for a lot more. I do a tutorial about data type is a 3-hour long
tutorial on postgres data types. It's not something you can cover and a half hour. Post brexit, extensive use of data types and it allows developers to be able to choose the right thing for their application. Instead of using generic data types, and having to figure out under the covers puts the onus on the Developers, Prince and the perceived equivalent, like what we've mentioned in the beginning about the configuration, the same thing applies to data types. Those perceived equivalent aren't always the same. All the migration tool to there, couldn't move a club
to a text Field in postgres. In Oracle, you handle a club special if you wanted to get the length of a club, you have to use the dbms packages in order to be able to get that. Postgres. A bar chart and a text or the same thing. They really are built upon the same thing, you could have a 1GB bar chart. So you don't have to move everything over to text in front of in fact, text is actually faster than Borchers. You want the most performance may call your your farts are two Fields text. Probably don't want to do that because you need some, some constraints on their bar chart
100 or whatever, but it's not the same thing that equivalent where in Oracle a varchar2 is different than a club host price of arch are in a text or the same so we could have a table that it's a large jar, 4001 or of arch are in 10 million in law, whatever you actually need. And we could insert into a bar chart. Here were inserting the letter x a billion times. Into a bar chart. And we just use the same postgres barchart functions in order to be able to meet you manipulate. That one gigabyte, it's just give me the length. There's no special handling
going to work. So when your migrating things that are clubs don't necessarily say, it has to be a text field, you could be of our truck. Because Barn chores in postgres are bigger than what the oracles parties are. You might be able to just put in a regular one. Even put consumption trace on there and improve your coat. This is the, the one that gets everybody. Oracle has the number field, all your primary keys are number cuz of the number 38. Get a lot of
nights there. Everything moves that to a numeric. I need to get a numeric 38. but if you get bigger, Much bigger, much, much bigger than a number 38. Back to go up to a hundred and Thirty-One thousand digits. You could put any Oracle number into a postgres numeric. It is not the same data type. It's not equivalent putting an oracle number into a numeric is like putting an email address into a club. It is overkill for most situations. It is an extremely precise datatype. If you need to store things like money
or financial data, that's what numeric was built for. Not your primary keys. Because it has that much precision and scale. And you get a pic performance at 4. That so because all the tools migrate, everything over their primary keys, People just say Ed numeric. Everything works great. Because it'll take everything in the migration tools. Take it in because no matter what, or call number you have, it's going to go into numeric and you'll never get in there. And what happens is people do that and then they get to the end of the project
to start performance testing and they wonder why is postgres 40% slower than Oracle and using a lot more CPU. It's because all of their primary keys. I said it numeric. and then when they have to go back and change that, it's an altar table, which requires rewriting the entire table, and then have to go back and change all their Java code or python code to handle a nurtures instead of numerics. So here's a simple tasks, we have two tables, see one that has one, that's numeric and see to that has two columns that are numeric and we just
generated a bunch of code. The million roses in the other one in ten million Rose in the other ones have it today. So we could do a join between the two for every Row in and see when we get 10 rows and T2. So we just do a simple, select count star of joining those two. Get together and I my laptop over here in 2.7 seconds. Do the same thing as big ants as integers could even done it as integers instead of bigots. Generate the same thing. 21.9 seconds, 40% performance difference by having the Yorkies all as integers instead of numerics.
You're getting that 40% performance, hit on every single primary key, look up. If you have them as me marks, even some of the some of the like tools and everything might create a like number feel that doesn't Alias over. Numerics. You're still paying that performance. It it's like this is the biggest source of performance hits as your migrating over to post. It's not that post rest can't perform if that the migration was done wrong with the wrong data types. when you're doing your migration from Oracle to postgres, Spend all your time on the data
model upfront. It will save you days weeks months of time at the end. If you spend that extra couple days getting all the data types, right in the beginning, laying the foundation of the house down to this way, everything else is built strong. On top of it, you'll get an application that will perform just as well as Oracle if not better. If you make the right choices because postgres is different and it's gives the power to the developers and you have to make the right choices. You know what your data is. So make the right choices of the datatype choosing between an
integer, a big and small ant choosing the right time Stanfield and date fields. Choosing inet cider. Are those things using a bully in field instead of a bar chart, one makes a big difference in all adds up. If you make those right choices of his data types, they one of your migration you'll have one that's going to be able to perform as well as Oracle. Chantelle. Marie top 10. Again, I've always found. Nobody's ever made all 10 of those mistakes and nobody's made. None of the mistakes
tuning the system. You have to learn postgres in order to be able to do it properly. Don't fall down into the traps of thinking that everything has to be like Oracle of using all the tables faces and using all the features the same way don't don't get tracked. Wigs by mocking up Oracle like the Dual table and creating all these things. In order to be able to make it a little bit easier, going to lead the bugs down the road and when you're thinking about performance, spend the time early on on those data types and do the design stuff early on, it makes
a huge difference. Thank you. Jim, I have a credit. So great. If you have done this thing, for some of our work one specific credit. If I attempt migration with Oracle either using the UPS are you think it was like one or two pg52? Also certify that the fact that there's an embassy has four including for the Oracle database by default in the end? Yes or no. When I ask them to migrate, one of these shoes are facing that goes beyond 65535, the moment I do a mapping into postgresql, we talked about working on today.
No, cuz I didn't hear everything you said but we can talk offline. Search for the function, using the difference between regular and whatever first search path. So you can have that same function that we had get in, right, whatever your search path is, is if your search fast as user one, user to, it's going to run the user one version. If it's user to user one, it'll run the user to version of it there. There, it just ate a precedence of the search path of the As much as you can. When do you
Stacks as much as you can? If you do, you have string data, that doesn't have a limit on there. So if you don't even have to do, like a bar chart 100 or varsha 1000, it's just the varge. Are you just text it actually faster. Questions. Okay, thank you. Thank you. Jim.
Buy this talk
Buy this video
With ConferenceCast.tv, you get access to our library of the world's best conference talks.