About the talk
PostgreSQL is our go-to relational database for all new applications and also our target for migrations. As we are heavily involved in migration processes, we have chosen to share our processes, the joys (and tears) of our experiences, and what worked well and not so well. To make our lives easier we've decided to work on and contribute to tooling associated with migrations, most notably ora2pg and our own code2pg.
We will be discussing four key phases of migrations.
Finalization of migration / due diligence: analyzing the application architecture, appropriate server resources, whether database size really matters and to what extent, and specific features used.
Journey: challenges with hierarchical queries, handling explicit commit/rollback, how to handle DBMS packages, handling global variables, handling autonomous transactions, string aggregation with conditions, how LOBs are to be handled, where data types impact the applications and what the application team should know, dynamic partitions (constraints, creating partitions), oracle_fdw, non-ANSI JOIN to ANSI JOIN conversion, and so on.
Technical Architect associated with Societe Generale for almost 8 years, with 12+ years of overall experience in database management and in architecting solutions for database and application requirements. Moving towards trending open source, currently pursuing an enthusiasm for open-source solutions such as PostgreSQL, MongoDB and Cassandra, with additional experience on Hadoop and AWS.
Thank you very much for the introduction. I'm sure everybody had a good lunch. I represent Societe Generale; for those of you who have not heard of the company, it is one of Europe's largest financial groups, a French bank. We have a business presence in 67 countries. Just to quickly introduce us: as you can see, our core business rests on three pillars — French retail banking, international banking and financial services, and global banking and investor solutions.
We have around 149,000 employees working for the bank, and here at what we call the Global Solution Centre in Bangalore, many of the group's IT solutions are handled, along with some of the operations for the group. The themes are the spirit of innovation, commitment and responsibility, which underpin our core responsibility of making the group better and better. You may be wondering what a bank is doing at a PostgreSQL conference.
Societe Generale sees itself as a technology company, and the numbers speak for themselves: we have an IT budget of four billion, a staff of more than 23,000 in IT globally, more than four million downloads of our apps, and 1,500+ APIs in production, some of which were developed internally by our teams. So that sets the context of the company and why we are part of it.
We are proud of it — both of us have spent more than nine years in the company and are still going strong. So what do we do in open source? We have three pillars, which we call use, contribute and attract. For the use pillar, we ask teams to think about what open source could serve any solution being considered for the business, and we urge the business to think open source first — not necessarily just databases, but also middleware, schedulers, monitoring, and API and application development. Only if no open-source option fits the business need do we ask them to think of another solution. Support is the most important part of this pillar: for the people performing operations, having proper support in place is essential, so we try to get support arranged before we do anything.
In terms of contribution, we want to give back to the community. We don't just use the community versions of different open-source solutions; we also contribute back, and we are going to see how we have been doing that. And of course attraction: what is the point of using and contributing if we don't attract more people towards us? So we participate in events like this one and communicate to the business how they can utilize open-source solutions — either in-house ones or those available in the community — to serve their needs.
This is part of our group. I'm showcasing only the people related to PostgreSQL who have made contributions — you can find them on GitHub under Societe Generale. Anthony, who is our principal engineer based in Paris, is the creator of code2pg, a solution for estimating the effort of converting application code and the SQL embedded in it in order to migrate to PostgreSQL. My colleague, who joins me as the next speaker, will give you the technical insights about our migrations, and fixes we have developed have been contributed back to ora2pg, the tool many of you have been using.
So, what is our migration process? People from Tamil Nadu may be able to relate to this acronym: SETC. First we do the study — this morning's keynote mentioned that around 30% of applications are being migrated to PostgreSQL or open source, and we are similarly moving applications from different vendor footprints to open source. In the study we perform a feasibility analysis of moving to open source: whether the open-source solution is capable of the same kind of features, and whether the business risks we have can all be mitigated by it. Then we perform estimation — how long will it take, what are the costs — and all of this becomes part of the study. Then we move on to execution, where the migration follows the process the study laid out. Then we train, because application teams who are new to PostgreSQL may not know how to support the application post-migration, so we give them on-the-job training, along with documentation. And last but not least is the care phase: for applications that have migrated to open source, we provide support, bug fixes and performance tuning.
All of this is handled by a team called OMF, the Open Source Migration Factory. We are a team of about 10 people who do migrations for the entire group. 225+ databases have been migrated from different technologies — MS SQL, MySQL, Sybase — and we have focused on databases that are not in the critical mass of applications: things of medium criticality that are ready to move to open source. This has brought down our time to market — and by time to market I mean the time to go to production — to around 2 to 12 weeks, depending on how critical the databases taken up for migration are. As you can see, our delivery has been almost one hundred percent on time, using tooling that lets us
do the migration quicker, particularly the code conversion. Migration is not an easy task, as some of you would agree; it's a challenge in itself. While we convert the code, the challenge is that the logic needs to be understood: you may not have the same functions that were written in the original vendor's software available as a straightforward equivalent conversion on the open-source side, so we have to tweak things, and sometimes timelines slip — that's why around 70% of the effort is in the coding. In terms of applications, we have taken on everything from simple to complex, including applications with more than a hundred thousand lines of code to convert. With that context set, to go much deeper into the nuances of our experience, I invite my colleague to take it forward with the challenges in the applications we have handled. Thank you.
— Thank you. Am I audible? Thank you. In the interest of time I need to rush through the slides quickly, so if any of the concepts or scenarios I talk about are not clear, I'll be around today and tomorrow and you can catch me. Is that okay?
To start with: we have a team helping the organization adopt an open-source culture, and one such initiative is the Open Source Migration Factory. What I'm going to talk about here is the tip of the iceberg, because we picked only certain scenarios we thought might interest you, so that anybody working on the migration front can use our experience and implement it in their own environment. How many of you are currently working on a migration project? Well, that's a good number.
You know how traditional legacy installations used to happen in the infrastructure. Let me add some information to that: database-as-a-service is going to be the de facto model when you move to cloud. Everything becomes "as a service" — so why leave databases apart? That's why we planned ahead and integrated PostgreSQL into our as-a-service model, and we provide PostgreSQL as a service to internal customers on our private cloud. Our cloud model is a hybrid one, including both private and public cloud, and PostgreSQL is part of the private cloud offering.
Then, moving to our experiences: we have been working for the past two years or so on migrations, starting with Oracle and then slowly ramping up with SQL Server, MySQL and Sybase migrations. These are some of the scenarios we thought would add value to share.
To start with, one application was using the encryption functions provided by SQL Server, and we had to move the data into PostgreSQL while providing a similar architecture to the application. When we initially extracted and loaded the data into PostgreSQL, it was still encrypted with SQL Server's encryption algorithm, so it made no sense for the application team to use it. Instead, we decrypted the data to plain text on the source side, implemented the pgcrypto extension on PostgreSQL, defined a key so we could encrypt the data as we loaded it, and the application then used pgcrypto's encrypt and decrypt functions. This was quite helpful for reproducing the same architecture that was available with SQL Server. That was one of the foundation-laying migrations we did, because further migrations depended on its outcome.
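The pgcrypto pattern described above can be sketched roughly as follows — table, column and key names are illustrative, not the application's actual schema:

```sql
CREATE EXTENSION IF NOT EXISTS pgcrypto;

CREATE TABLE customer_secrets (
    id      serial PRIMARY KEY,
    card_no bytea              -- ciphertext is stored as bytea
);

-- Encrypt on load (the plain text extracted from the source database):
INSERT INTO customer_secrets (card_no)
VALUES (pgp_sym_encrypt('4111-1111-1111-1111', 'my-migration-key'));

-- Decrypt on read:
SELECT pgp_sym_decrypt(card_no, 'my-migration-key')
FROM customer_secrets;
```

In practice the key would be supplied by the application, not hard-coded in SQL as it is in this sketch.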
We also explored the different options to migrate data, and compared which approach would be helpful and faster for completing the migration. One application requirement involved two different Oracle databases: the team was designing a third application, on PostgreSQL, and wanted the data of those two applications to be available locally on the PostgreSQL side, while the two source applications stayed on Oracle. We decided that oracle_fdw with materialized views would be the best approach in these terms: we configured oracle_fdw, created foreign tables pointing at the Oracle databases, and built materialized views on top of them, which made bringing data from two different Oracle databases together on PostgreSQL fairly simple.
One significant improvement the application team found: before this approach, they had a separate Oracle database to aggregate these two data sources, and every night a batch job used to bring the data across, a process that used to take some two and a half to three hours. With the materialized view approach, the whole of the data could be brought locally in about one hour. That was the significant achievement with this approach. Here you see the steps we used: create the extension, create the foreign tables, create the materialized views, and refresh them. The same approach can help you if you have a similar scenario; the only challenge is that you need to develop a batch job with a scheduler if you have 300-400 tables to refresh every night.
Then: has anybody faced challenges with LOB objects? When you have to deal with LOBs there are a lot of challenges.
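The steps listed above can be sketched like this — server, credentials, and table names are illustrative placeholders:

```sql
CREATE EXTENSION oracle_fdw;

CREATE SERVER ora_src FOREIGN DATA WRAPPER oracle_fdw
    OPTIONS (dbserver '//orahost:1521/ORCL');

CREATE USER MAPPING FOR CURRENT_USER SERVER ora_src
    OPTIONS (user 'app_reader', password 'secret');

CREATE FOREIGN TABLE ora_orders (
    order_id integer,
    amount   numeric
) SERVER ora_src OPTIONS (schema 'APP', table 'ORDERS');

-- Local snapshot of the remote data:
CREATE MATERIALIZED VIEW orders_local AS
SELECT * FROM ora_orders;

-- What the nightly batch job runs, once per view:
REFRESH MATERIALIZED VIEW orders_local;
```

A second server and user mapping, set up the same way, covers the second Oracle database.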
One challenge we faced: the whole database was around 300 GB. That is not that large, but LOBs can put you into a lot of trouble, and we had some. When we initially went with the traditional approach of extracting the data, it ran for almost 72 hours — three days. The application team was only ready to do the migration within a downtime window; if a migration needs 72 hours, no business will agree to move. So we had to define an approach for shrinking the whole migration into a permissible limit of around 12 hours.
After brainstorming we applied a strategy: there were 300-400 tables, so we segregated them according to size — the small tables were sorted into one lot, and the large tables were put into another lot — and we ran the table extraction in parallel. The ora2pg parallelism parameters actually helped us bring the extraction time down, and together with the other activities we were able to achieve the whole migration in 11 to 12 hours.
Now, LOB to bytea: has anybody had it? Some third-party software asks the application team to change the data type of a column during migration.
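The parallel extraction mentioned above maps onto ora2pg's parallelism directives. A hedged sketch of the relevant configuration — the values are illustrative, and you should check the directives against your ora2pg version's documentation:

```ini
# ora2pg.conf -- parallelism knobs for data extraction (values illustrative)
PARALLEL_TABLES   4   # number of tables extracted concurrently (small-table lot)
ORACLE_COPIES     4   # connections used to split one large table (needs a unique
                      # numeric key, see the DEFINED_PK directive)
JOBS              4   # parallel processes writing data to PostgreSQL
```

The same settings are available on the command line as `-P`, `-J` and `-j` respectively, e.g. `ora2pg -t COPY -P 4 -J 4 -j 4`.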
In Oracle it was a LOB, and when the application moved to PostgreSQL the third-party software said: no, I need a bytea column for it. It was a very big challenge to convert the data type from CLOB to bytea, because when text has to be converted to bytea, PostgreSQL does not allow it in a direct way — we created the tables with bytea columns, and we were getting errors indicating that inserting the data was not allowed directly. So you have to implement another method: create a function that converts the incoming text to bytea, and then define a cast on top of it, so that whenever records are inserted into the table with a bytea column, the conversion is managed internally and the data ultimately lands as bytea — the application does not even notice. We had no ready answer; we had to research blogs and follow up with several people before we came across the function to write and implement, plus the cast. This was a significant requirement for many applications, because, as I said, it is peculiar to third-party software you may be using. Now, dynamic partitions.
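A minimal sketch of the function-plus-cast idea described above, assuming UTF-8 text; the function name is ours, and note that creating a cast between built-in types requires superuser rights, and that an `AS IMPLICIT` cast applies database-wide, so it should be weighed carefully:

```sql
CREATE OR REPLACE FUNCTION text_to_bytea(t text) RETURNS bytea
LANGUAGE sql IMMUTABLE STRICT AS
$$ SELECT convert_to(t, 'UTF8') $$;

-- Lets INSERTs that supply text into a bytea column succeed transparently:
CREATE CAST (text AS bytea) WITH FUNCTION text_to_bytea(text) AS IMPLICIT;
```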
Anybody who has had to create partitions knows there are several methods to implement them. But ultimately what happens is that the application team does not want to manage the partitions every year; they request that whenever they load the data, the partition should get created automatically according to the partitioning condition, be it range or list. That was one such scenario we had. This was on 9.6 — the latest versions have more advanced, declarative partitioning — so here we implemented the inheritance approach.
First, the approach for moving the partitioned data from Oracle to PostgreSQL: what do you do with the partitioning on the Oracle side? When extracting, treat the partitioned table as one: the whole of the data of all the partitions comes out as one single table, and you extract that. You load the same data into the parent table on PostgreSQL, and from there you start to manipulate it to create the partitions. That is the initial setup — not yet the dynamic creation — and it is the normal approach to get data from Oracle partitioned tables into partitioned tables on PostgreSQL.
Then, once the partitions are created, how do you make it dynamic? The dynamic partition creation function is something like the one on the slide; you can use this kind of function, modified to match the partitioning condition you had in Oracle. Once you have the function, it will not get called by itself, so you create a trigger on the table: the moment you insert, the trigger calls the function, which checks whether the partition already exists — if not, it creates one — and then it loads the data into it. Yes: on every insert it checks whether the partition is already there, based on your condition, whether your partitioning is on a range or on a list. Say the key is a date: when you insert, the function checks whether the partition for that date exists; if it exists, it loads the data; otherwise it creates the partition and then loads it.
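The trigger-based routing just described can be sketched as follows for the 9.6-era inheritance approach the talk uses; table and naming scheme are illustrative (monthly range partitions on a date column):

```sql
CREATE TABLE measurements (
    logdate date NOT NULL,
    value   numeric
);

CREATE OR REPLACE FUNCTION measurements_insert_trigger() RETURNS trigger
LANGUAGE plpgsql AS $$
DECLARE
    part text := 'measurements_' || to_char(NEW.logdate, 'YYYY_MM');
BEGIN
    -- Create the monthly child partition if it does not exist yet.
    IF to_regclass(part) IS NULL THEN
        EXECUTE format(
            'CREATE TABLE %I (CHECK (logdate >= %L AND logdate < %L))'
            ' INHERITS (measurements)',
            part,
            date_trunc('month', NEW.logdate)::date,
            (date_trunc('month', NEW.logdate) + interval '1 month')::date);
    END IF;
    -- Route the row to the child table.
    EXECUTE format('INSERT INTO %I VALUES ($1.*)', part) USING NEW;
    RETURN NULL;  -- suppress the insert into the parent
END $$;

CREATE TRIGGER measurements_partition
BEFORE INSERT ON measurements
FOR EACH ROW EXECUTE PROCEDURE measurements_insert_trigger();
```

On PostgreSQL 10 and later, declarative partitioning plus `CREATE TABLE ... PARTITION OF` removes most of this boilerplate.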
What you see on the slide is test data — I cannot put the real data here. If you look at the highlighted part, the new partition got created automatically. pg_partman will also do that for you, but it is an extension, and we had challenges installing it because of internal rules: any extension you need to install has to pass through security validation before we can implement it. With the trigger approach it was instant, and there was no performance degradation at all after we created it — though I would say the maximum we tested was around ten crore rows; beyond that you may need to apply a performance-tuning approach.
A related observation from the extraction side: if you extract with the partitions enabled, the tool reads every partition and creates a separate file per partition — with ten or twenty thousand partitions you get a huge number of files, each of which has to be handled. That is why we wanted to eliminate that and produce a single file instead. And note that this would have to be an offline migration with downtime: you move the data during a maintenance window, whereas normally the application is live and online, so plan for that.
Okay, ETL jobs. Almost every application does some file loading or other every day, and in some cases there is one job that is more prominent than the rest. More specifically, if you are using a commercially available ETL software, it can create a lot of problems when you migrate to a different environment. That was the challenge for us in this case: the migration itself was a normal requirement and went very successfully — all the code was migrated, everything else was done — and the foremost remaining requirement was an ETL job loading around 20 GB from Oracle. They wanted the same performance, because for business reasons they could not accommodate more than about 30 minutes; that was the maximum they would allow us. When they had migrated their tables and tried to run the ETL against PostgreSQL, it was taking almost six hours.
So we started to dig in, and we found there were different approaches we could implement rather than the commercial ETL software that was taking so long. We tested approaches like pgloader, and we wrote our own script, which uses inserts; we divided the inserts into parallel jobs, and ultimately we were able to finish the whole job in 18 minutes. Then they came up with one last requirement: if a record does not get inserted, provide it as a discarded record — which was not available in our approach. So we wrote something we call pg_discard. It is not yet publicly available, because certain internal security checks have to be passed before we can make it public. With this, the whole of the migration
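The "divide the inserts into parallel jobs" idea can be sketched like this. `chunk()` is generic; `run_chunk()` assumes psycopg2, a reachable database, and a table `t` — all illustrative assumptions, not the team's actual tooling:

```python
from concurrent.futures import ProcessPoolExecutor

def chunk(rows, n_jobs):
    """Split rows into at most n_jobs contiguous, roughly equal lists."""
    size, rem = divmod(len(rows), n_jobs)
    out, start = [], 0
    for i in range(n_jobs):
        end = start + size + (1 if i < rem else 0)
        if end > start:
            out.append(rows[start:end])
        start = end
    return out

def run_chunk(rows):
    """Load one chunk on its own connection; return the rows that failed
    (the 'discarded records' requirement from the talk)."""
    import psycopg2  # hypothetical driver choice
    discarded = []
    conn = psycopg2.connect("dbname=target")
    conn.autocommit = True  # each insert commits alone; a failure does not poison the batch
    with conn.cursor() as cur:
        for row in rows:
            try:
                cur.execute("INSERT INTO t VALUES (%s, %s)", row)
            except psycopg2.Error:
                discarded.append(row)
    conn.close()
    return discarded

def parallel_load(rows, n_jobs=4):
    """Run the chunks in parallel worker processes; return all discarded rows."""
    with ProcessPoolExecutor(max_workers=n_jobs) as ex:
        results = ex.map(run_chunk, chunk(rows, n_jobs))
    return [r for res in results for r in res]
```

Row-by-row inserts keep the discard logic simple; for raw speed, `COPY` per chunk would be faster at the cost of losing per-row error handling.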
was successful. The reason for bringing this in front of you is that if you have such requirements, you have to be very careful when you are dealing with scenarios of this kind.
There are other, smaller gotchas I would like to mention: when you do a migration, there are certain things you need to check that seem obvious — how a given feature works in Oracle versus how it works in PostgreSQL. More specifically, you will have business logic in packages. When you migrate those packages, you have to be very careful where global variables are defined in them. How do you get the same kind of behavior in PostgreSQL? With set-style configuration variables. As you know, at least up to version 11 there is no equivalent of Oracle packages — the package code goes into plain functions. So when you have to achieve the same kind of package-level global variables across functions, you use something like set_config to define
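The configuration-variable emulation mentioned here can be sketched as follows; the `myapp.batch_id` name is an illustrative custom setting, not a built-in:

```sql
-- Emulate an Oracle package global with a session-scoped custom setting:
SELECT set_config('myapp.batch_id', '42', false);  -- false = session scope, not transaction

-- Any function running later in the same session can read it back:
SELECT current_setting('myapp.batch_id');          -- '42'
SELECT current_setting('myapp.batch_id', true);    -- NULL instead of an error if unset
```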
your own variables and read the values back where you want them. Then autonomous transactions: Oracle provides a very easy method — a pragma on a block — so somebody can commit or roll back independently of the caller, but it is not so easy to implement in PostgreSQL. The concept you need is dblink: you install the dblink extension, open a loopback connection back into the same database, and run the commands of the autonomous transaction within that separate session; they commit independently, and you achieve the same kind of feature.
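The dblink loopback trick can be sketched like this; the `audit_log` table is illustrative:

```sql
CREATE EXTENSION IF NOT EXISTS dblink;

-- Inside a transaction: write a row that survives even if the surrounding
-- transaction later rolls back, because the dblink session commits on its own.
SELECT dblink_exec(
    'dbname=' || current_database(),
    'INSERT INTO audit_log(msg) VALUES (''step 1 reached'')'
);
-- A ROLLBACK in the caller does not undo the INSERT above.
```

The loopback connection string here is a minimal assumption; in practice host, port and credentials usually need to be spelled out.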
I think this is the last area I wanted to share about our experiences. If you look at the figures on the slide: when we migrated one query from Oracle to PostgreSQL, it performed very badly. This is not a blanket statement — it depends on the application and the way the logic is written — but in that case we changed the query to use a WITH clause (a CTE), and the end performance improved by almost 90%. Then, the 30% case: upper or lower functions in the WHERE clause, in the predicates. You have to be very careful about how the value is being manipulated and how the data is actually stored in the column — that impacts a lot; when we fixed it, there was roughly a 30% improvement. Then there was a huge improvement from a logic change in a query that Oracle had handled well, but the logic itself was poor, with a lot of loops happening inside the query. When the performance issue came up we analyzed it, decided the logic was bad and had to be rewritten, changed it, and there was a 120% performance improvement in the query. And then the 76% case — that is not a normal sort of number, because there was a very peculiar situation there involving a
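The upper/lower predicate issue above is the classic case where a function call in the WHERE clause defeats a plain index; an expression index restores index scans. Table and column names here are illustrative:

```sql
-- This predicate cannot use a plain index on email:
--   SELECT * FROM users WHERE upper(email) = 'A@B.COM';

-- A matching expression index lets the planner use an index scan again:
CREATE INDEX users_email_upper_idx ON users (upper(email));
```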
third-party product — and I don't want to name it. When they installed it, they did not do it properly. This was production, and we went into a situation where a job that used to complete in one hour fifteen minutes on Oracle was running for 13 hours, and from the outside we could not trace where the performance was being hit, however much we tried to troubleshoot it. So we replicated the scenario in a different environment, and then we came to know that some of the driver links were not properly mapped. It is both fascinating and astonishing how a small issue of this kind can put you on your toes.
Our key tools: ora2pg and code2pg. code2pg produces the application migration reports that are required; it currently targets Oracle, work for MS SQL is in progress, and we are discussing with the ora2pg author whether parts of this could be integrated into ora2pg.
On ora2pg's conversion: when you deploy what it converts, you may still hit issues — it is not a hundred percent, and if you hit issues you have to work on them to migrate properly. For example, what we observed is that sometimes, when it generates foreign key constraints in one big script, you can get errors on deployment because of the ordering of the relationships: the master table should get its keys first, and the child afterwards; if the referenced master is not present yet, you get those kinds of errors, and you have to realign and re-execute. If you are doing a migration from MySQL to PostgreSQL, or from Oracle to PostgreSQL, where both data and code need converting, you can try ora2pg — I have given some contacts here; if you have problems you can approach them, and it works very well. We are confident in it, and we are open to anybody contributing to improve it.
And this is a sample report: it performs the estimation and gives an HTML report on the source code you provide; it also gives instructions on which lines of code you need to go and change. I would say it works very well for the estimation part, and contributors are welcome. This is a sample of a code2pg report. If you want to know more about its functionality, code2pg is available on public GitHub — you can go check it and try it — and you can contact us at the given mail IDs. Do we have any questions?
[Question from the audience.] We take care of the complete data migration and the application code migration; if you have a specific case, we can discuss how, because these are migration experiences. On packages: everything gets converted to functions — that is one part. You have to specify the target version to ora2pg, and it converts the code based on that version: for 9.6 or 10 the package code becomes functions, and if you give 11 or 12, with the new procedure support, packages get converted into procedures.