Duration 41:05

RailsConf 2019 - The 30-Month Migration by Glenn Vanderburg

Glenn Vanderburg
VP of Engineering at First.io
RailsConf 2019
April 30, 2019, Minneapolis, USA

About speaker

Glenn Vanderburg
VP of Engineering at First.io

Glenn is an experienced technical leader, software architect, programmer, speaker, and writer. He has worked in consulting, product development, and management roles at small startups as well as established enterprises, and has given numerous talks on technical and management topics at conferences around the world. Glenn enjoys using technology to build excellent solutions to real problems (whether the technology is mature and reliable or cutting-edge) and cultivating great teams. Specialties: Distributed Team Management, Software Development Management, Software Architecture, Ruby, Ruby on Rails, Clojure, Enterprise Architecture, Software Engineering, Test-Driven Development (TDD), Agile practices


About the talk

RailsConf 2019 - The 30-Month Migration by Glenn Vanderburg


This talk describes how our team made deep changes to the data model of our production system over a period of 2.5 years.

Changing your data model is hard. Taking care of existing data requires caution. Exploring and testing possible solutions can be slow. Your new data model may require data completeness or correctness that hasn't been enforced for the existing data.

To manage the risk and minimize disruption to the product roadmap, we broke the effort into four stages, each with its own distinct challenges. I'll describe our rationale, process ... and the lessons we learned along the way.

Thank you, Sebastian. Good afternoon. I'm going to present a case study of a sustained, multi-stage effort to change and simplify the core data relationships in a production system. Changing code has its challenges, but changing the data model of production data is even more difficult: protecting the existing data requires caution, exploring and testing possible solutions can be slow, and your new data model may require data completeness or correctness that hasn't been enforced for your existing data. But living with a data model that is a mismatch for your business is also painful. It introduces extra complexity, pitfalls where bugs can lurk, poor performance, all those kinds of things. And we rarely hear about the details of this kind of change, so hopefully this talk will be helpful for those of you who have to tackle something like this.

We did this work in four stages over two and a half years. At each stage we tackled the biggest problem we were facing, driven by business considerations, and then returned to feature development for a while, until the next big problem became costlier to live with than to fix. You can see that it began with a short spike and then a five-month gap before the real push began. It's 30 months. At the beginning we had no idea what was ahead of us, and we might not have started if we had known what it would involve; but on the other hand, looking back, it feels amazing and it feels like it went quickly. We made a lot of progress, and we're glad we are where we are now.

There are quite a few technical details in this talk, but it's not about the technical details, because I can pretty much guarantee that yours will be different.

Across all four of these stages, the technical details were very different. The focus here is the overall strategy and how to do this responsibly. We tried to apply these principles at every stage, and you can pay attention through the talk to how we did that.

This whole thing started with a couple of big mistakes, so I'll start with those to set the rationale for all this work. At First, we help real estate agents win more business by focusing their effort on the right people at the right time. The average realtor has a database of a few thousand contacts, and we use predictive analytics and other techniques to help agents focus their marketing efforts wisely. This seems like a good time to mention that we are currently hiring Rails developers. A startup's job is to try a bunch of things and figure out what doesn't work on the way to finding out what does work, and we certainly did a lot of that.

When we first started looking at this problem, our basic data model looked like this: we had realtors as customers, and each of them had a bunch of contacts. We started thinking about all the things we could do with this data. We thought it might be cool to know who Pat is married to, or that Jane knows Robert; and it could be great to know that our customer Abby also knows Clinton and Sally, and that our customer Bill also knows Kathy and Nancy. That led us to our first big mistake, thinking of this as a social graph, and that in turn led to our second big mistake, which was to put all that data in a graph database. Now, I don't believe that's necessarily a mistake for everybody, but it certainly is when you eventually figure out, as we did, that you weren't actually building a social graph.

So I'm not going to talk much about relationships between contacts today; our focus will be on relationships between realtors and their contacts. But remember that those other relationships are there, and the notion of this as a social graph led us to think of these contact records as representing the actual people, the actual Nancy, for example, as opposed to just Abby's contact info for Nancy or Bill's contact info for Nancy. And we made a lot of other mistakes too. I don't want to get bogged down in wondering exactly how all those things happened; just remember what Jerry Weinberg said, and bear that in mind when I show you where we were two and a half years ago.

Two and a half years ago, we had what we thought of as business data about users, plus some of their activity data, stored in Postgres, and all the contacts and social-graph stuff in Neo4j. To link the users to the data in Neo4j, we had these proxy objects in Neo4j that I'll call "realtor prime." All those CRs are contact relationships: Neo4j reifies all relationships as nodes, and its query language, Cypher, encourages you to think of them that way. But my guess is that you, like me, think more in terms of tables, so for convenience let's draw it that way, and you can see that contact_relationships is a big join table.

Contacts were shared between realtors, so we had to use the source information for every attribute (names, phone numbers, emails, and so on) to determine which information each customer was allowed to see. We would track that source info anyway for other purposes, and we wouldn't ordinarily have used it to handle the data-privacy concerns for our customers, but it had to be included in every query because the contacts were shared between realtors.

Here's the structure we actually need, and that we have finally achieved today: shared contacts have been duplicated, and the join table is gone. We didn't know for sure at the beginning that this is where we needed to end up. The big problem, so big it obscured everything else, was having our data spread between two databases. So we started there, and that's where this talk starts: getting the data out of Neo4j.

What drove the change? Neo4j and Cypher, its query language, were not as familiar to the developers as Postgres and SQL. The ActiveModel gem for Neo4j, at least at that time, was less mature and feature-rich than Active Record. Neo4j's drivers were less mature and less well optimized, and so was the entire database, for that matter; it's just a newer product. And some features we eventually developed required cross-database joins, and you can imagine how slow and memory-intensive those were.

So Rob Sanheim and I sat down to start making a plan, and we decided to migrate people across realtor by realtor. We started building an importer job that would import a realtor's Neo4j data into Postgres, and that importer needed to avoid duplicating shared data that had already been imported for another user. We would use a feature flag to indicate whether a user had already had their data migrated over or not.

We started with the schema definition. We knew our data in Neo4j was messy: Neo4j's referential-integrity features are weaker than Postgres's to begin with, and our knowledge of them was weaker than our knowledge of Postgres's features, so we weren't using them very well.

So we got very serious about data integrity in this schema. We used foreign keys with ON DELETE CASCADE and RESTRICT declarations, check constraints, exclusion constraints, anything we could think of to encode the constraints on our data into the database schema itself. This was enormously helpful: the constraints caught most of our messy data during importer testing and forced us to figure out how to deal with it and clean it up right up front.
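As a rough illustration of that style (the table, column, and constraint names here are invented for the example, not our actual schema), a migration that pushes integrity rules down into Postgres might look like:

    class CreateContactNames < ActiveRecord::Migration[5.2]
      def change
        create_table :contact_names do |t|
          # The database, not the application, enforces referential integrity:
          # deleting a contact cascades to its dependent name rows.
          t.references :contact, null: false, foreign_key: { on_delete: :cascade }
          t.string :name, null: false
          t.timestamps
        end

        # Rails 5 has no migration DSL for check constraints, so drop to SQL.
        reversible do |dir|
          dir.up do
            execute <<~SQL
              ALTER TABLE contact_names
                ADD CONSTRAINT contact_names_name_present CHECK (length(trim(name)) > 0)
            SQL
          end
          dir.down do
            execute "ALTER TABLE contact_names DROP CONSTRAINT contact_names_name_present"
          end
        end
      end
    end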

The feature flag needed to be readily available throughout our entire codebase, so we used middleware to set a thread-local variable early in the request cycle. And then we had a problem: a lot of queries start off by calling class methods on a model class, like the ones shown there. We needed that model class to be an Active Record model if the current realtor's feature flag was set, meaning their data had already been migrated, and a Neo4j model if not. How do you do that? Well, Ruby's dynamism to the rescue: we were able to build models that could switch back and forth based on the feature flag.
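The middleware itself isn't shown in the talk; here's a minimal sketch of the idea, with hypothetical names (the class name, the Thread.current key, and the authentication lookup are all assumptions):

    # A small Rack middleware that stashes the current realtor's migration flag
    # in a thread-local so model and query code can consult it later.
    class V2FlagMiddleware
      def initialize(app)
        @app = app
      end

      def call(env)
        realtor = env["warden"]&.user            # or however the app authenticates
        Thread.current[:v2_mode] = realtor&.migrated_to_postgres?
        @app.call(env)
      ensure
        Thread.current[:v2_mode] = nil           # don't leak state across requests
      end
    end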

Take Contact, for example. Contact was this very small class that extended a SwitchingModel module (which we'll get to in a minute) and then declared: switch between ContactV1 and ContactV2. ContactV1 was the Neo4j model; ContactV2 was the Active Record model. ContactV1 included Neo4j::ActiveNode, declared that its actual node label name was "Contact", and had a bunch of code specific to that. ContactV2 was an ApplicationRecord subclass, declared that its table name was actually "contacts" rather than "contact_v2s", and did a bunch of Active Record stuff.

SwitchingModel was a little module of class methods. switch_between just stored those two models in class instance variables for later use. There was a v2_mode? query method that checked the thread-local variable to see which mode we should be in (V2 was Postgres, V1 was Neo4j), and you can see there's an environment-variable-based override of that feature flag, which we used in testing. A switch method simply returned either the V2 model or the V1 model, depending on the mode of the current user. And then everything else is delegation: with method_missing, const_missing, and new, we just delegate to the object returned from switch, and away we go. Most of the code did not need to know that there were two different kinds of models; controllers could just use the models and get the data they needed.
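I don't have the original source, but a minimal sketch of that switching pattern, assuming the flag lives in Thread.current[:v2_mode] as in the middleware sketch above, could look roughly like this:

    module SwitchingModel
      # Remember the two backing models in class instance variables.
      def switch_between(v1, v2)
        @v1_model = v1
        @v2_model = v2
      end

      def v2_mode?
        # An environment-variable override lets the test suite force one mode.
        return ENV["FORCE_V2"] == "true" if ENV.key?("FORCE_V2")
        !!Thread.current[:v2_mode]
      end

      def switch
        v2_mode? ? @v2_model : @v1_model
      end

      # Everything else is delegation to whichever model is active.
      def new(*args, &block)
        switch.new(*args, &block)
      end

      def method_missing(name, *args, &block)
        switch.public_send(name, *args, &block)
      end

      def respond_to_missing?(name, include_private = false)
        switch.respond_to?(name, include_private) || super
      end
    end

    class Contact
      extend SwitchingModel
      switch_between ContactV1, ContactV2
    end

With something like that in place, a call such as Contact.where(...) resolves against whichever backing model the current realtor's flag selects.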

When we started this work, a lot of the queries included Cypher fragments, and we found that it helped a lot to convert those into scopes, so that we could mirror them with Active Record scopes. Again, the idea was that our controllers wouldn't have to do things differently depending on where the data was coming from; they would just call a scope on Contact and so forth, and the scopes would hide all of the database dependencies. Building a rich vocabulary of scopes has served us well ever since, and queries in our system are generally very pretty and easy to understand.
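As an illustration of the mirroring (the scope name and query bodies are invented for the example), the same scope would exist on both backing models so callers stay database-agnostic:

    # Hypothetical example: the V1 model expresses the scope in Cypher terms,
    # the V2 model in Active Record terms, and callers never know which ran.
    class ContactV1
      include Neo4j::ActiveNode

      def self.recently_active
        # ...Cypher-flavoured query via the neo4j gem's query proxy...
      end
    end

    class ContactV2 < ApplicationRecord
      self.table_name = "contacts"

      scope :recently_active, -> { where("last_activity_at > ?", 30.days.ago) }
    end

    # Either way, controllers just write:
    Contact.recently_active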

For testing, like I said, we had an environment-variable override of the feature flag. We used that to build two different rake tasks for running the two sets of specs, we had separate sets of factories, and we set up CI to run both sets of specs. And for exploratory testing and QA work, we did a lot of manual testing by developers comparing old versions of users' data with new versions, and toward the end we had a whole-company QA swarm on our staging server.

Our CTO, and at the time my boss, Jess Martin, gave me some excellent advice. He said: since feature development is going to be so slow for a couple of months while you're doing this, it would be great if you could find a way to radiate to the rest of the company that you're actually making progress, that all this back-end work they can't see is actually getting closer to being done, so we can get back to feature development again. So I wrote a custom RSpec formatter that counted up how many total V2 specs there were and how many passed, spat that out as a CSV record, and that got dropped into a spreadsheet with a chart that was published into our project-management system and showed that we were gradually converging on being done with all this work.
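Something in the spirit of that formatter (a sketch, not the original; the class name and output format are assumptions) could be:

    require "rspec/core"
    require "date"

    # Emits one CSV row per test run: date, total V2 examples, passing V2 examples.
    class V2ProgressFormatter
      RSpec::Core::Formatters.register self, :dump_summary

      def initialize(output)
        @output = output
      end

      def dump_summary(summary)
        passing = summary.example_count - summary.failure_count - summary.pending_count
        @output.puts [Date.today.iso8601, summary.example_count, passing].join(",")
      end
    end

Each run then emits one data point that can be appended to a spreadsheet and charted.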

When it came time to execute, we did select employees first, those not participating in sales and demos, then the rest of the employees, then some friendly customers who wouldn't get mad at us if they saw any problems, then the rest of the active customers. The whole process of actually running all this took about three weeks. After the initial round of employee and select-customer migrations was done, we started the first full batch of customers, and all of a sudden, for the first time in about two months, I had nothing to do. So I thought, well, I may as well start on the PR to rip out all the V1 and transitional code, and ten hours later I submitted one of my favorite PRs of my entire career. That was a very satisfying moment.

After that, here was our data model. For the part we're talking about today, contacts were still shared and the join table was still there, but at least it was all in Postgres. So, time for stage two.

I mentioned that the importers had to make sure not to duplicate shared data that had already been imported for another realtor when pulling it into Postgres. Neo4j attaches UUIDs to all of its nodes, and I thought it would be a good idea, for some reason, to solve that duplication problem by carrying those across and making the UUID the primary key of those tables in Postgres. That turned out to be a big mistake, and we had to fix it, and that's stage two.

Don't get me wrong: Postgres UUID primary keys work just fine. They're a little harder to remember, harder to compare when you see them in output, harder to type, of course, but it didn't really become an issue until we needed to start tracking source info for a different table that had an integer primary key. We tracked sources using a polymorphic table called sourcings, and all the sourceable tables it pointed to had UUID primary keys; then we realized we needed to start tracking source info for this other one. We could have hacked around this a number of ways, but we decided to fix it. So when I first realized we would need to do this, I set aside a couple of evenings, did a spike, and came up with a strategy.

It works like this, using contact_names as an example. We start off by adding an integer_id column, giving it the serial option in the migration so that it auto-populates from a sequence, which fills it in for the existing rows. Then we drop the primary-key constraint on id and rename id to uuid. Finally, we rename integer_id to just id and re-establish the primary-key constraint. Simple, we're done. That's the simple case. If you have foreign-key references to that table, it gets harder.
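A sketch of that simple case as a raw-SQL Rails migration (the constraint name assumes Postgres's default <table>_pkey naming; treat the whole thing as illustrative):

    class SwapContactNamesPrimaryKey < ActiveRecord::Migration[5.2]
      def up
        # SERIAL adds a NOT NULL column with a sequence default, so existing
        # rows are back-filled with sequential values as the column is added.
        execute "ALTER TABLE contact_names ADD COLUMN integer_id SERIAL"

        execute "ALTER TABLE contact_names DROP CONSTRAINT contact_names_pkey"
        execute "ALTER TABLE contact_names RENAME COLUMN id TO uuid"
        execute "ALTER TABLE contact_names RENAME COLUMN integer_id TO id"
        execute "ALTER TABLE contact_names ADD PRIMARY KEY (id)"
      end
    end

Because of Postgres's transactional DDL, which comes up again below, the whole swap either completes or leaves the schema untouched.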

So here's properties and property_notes. First, we add an integer_id column to properties and populate it. Next, we add an integer_property_id column to property_notes and populate that by joining across to properties. Now it's time to drop both the primary-key constraint and the foreign-key constraint, and then rename id to uuid. About halfway there. Now drop the old property_id column, rename properties.integer_id to id, rename integer_property_id to property_id, and finally re-establish the primary-key and foreign-key constraints.
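Spelled out the same way (again with assumed default constraint names, and only sketching the shape of the real work), the harder case looks roughly like:

    class SwapPropertiesPrimaryKey < ActiveRecord::Migration[5.2]
      def up
        # New integer key on the parent, back-filled from a sequence.
        execute "ALTER TABLE properties ADD COLUMN integer_id SERIAL"

        # New integer reference on the child, populated by joining on the old uuid key.
        execute "ALTER TABLE property_notes ADD COLUMN integer_property_id integer"
        execute <<~SQL
          UPDATE property_notes pn
          SET integer_property_id = p.integer_id
          FROM properties p
          WHERE pn.property_id = p.id
        SQL

        # Drop both constraints, rename, drop the old reference column, rename again.
        execute "ALTER TABLE property_notes DROP CONSTRAINT property_notes_property_id_fkey"
        execute "ALTER TABLE properties DROP CONSTRAINT properties_pkey"
        execute "ALTER TABLE properties RENAME COLUMN id TO uuid"
        execute "ALTER TABLE property_notes DROP COLUMN property_id"
        execute "ALTER TABLE properties RENAME COLUMN integer_id TO id"
        execute "ALTER TABLE property_notes RENAME COLUMN integer_property_id TO property_id"

        # Re-establish the primary key and the foreign key.
        execute "ALTER TABLE properties ADD PRIMARY KEY (id)"
        execute <<~SQL
          ALTER TABLE property_notes
            ADD CONSTRAINT property_notes_property_id_fkey
            FOREIGN KEY (property_id) REFERENCES properties (id)
        SQL
      end
    end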

And the more tables you have with foreign-key references to your table, the more complicated this gets, which led me to develop some migration helpers to do the various bits of this work. Another problem is polymorphic tables. It's a theme in this talk that polymorphic tables complicate everything, and they did here (and this whole stage started because of one of them). When you have a polymorphic table that points to several of these tables, that means you have to convert all of them as a group, in one batch. We ended up with five separate clusters of tables that had to be converted as one, and the migration helpers managed all those details. I actually went to the trouble to make the migration helpers reversible, and tested them round-trip, with both the schema and the data, to make sure you could go round-trip and end up with what you put in.

Just as a side note, I'm sure many of you know this, but in a relational database you can discover anything about the schema from a set of internal tables and views. Here's an example of finding all the foreign keys that point to the contacts table; you can do similar things for indexes and other kinds of constraints.
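The slide isn't reproduced here, but a query in that spirit against the standard information_schema views (one reasonable way to do it; the pg_catalog tables work too) looks like:

    # Lists every table and column with a foreign key that references contacts.
    referencing = ActiveRecord::Base.connection.select_all(<<~SQL)
      SELECT tc.table_name     AS referencing_table,
             kcu.column_name   AS referencing_column,
             tc.constraint_name
      FROM information_schema.table_constraints tc
      JOIN information_schema.key_column_usage kcu
        ON kcu.constraint_name = tc.constraint_name
      JOIN information_schema.constraint_column_usage ccu
        ON ccu.constraint_name = tc.constraint_name
      WHERE tc.constraint_type = 'FOREIGN KEY'
        AND ccu.table_name = 'contacts'
    SQL

    referencing.each do |row|
      puts row.values_at("referencing_table", "referencing_column").join(".")
    end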

So: three complex migration helpers, for primary keys, foreign keys, and polymorphic references, and five migrations. We ended up waiting five months before the pain of living with this outweighed the risk of going ahead with the change. The first migration was very simple and very easy, the harder cases were a little more complicated, and the worst case was truly horrifying: changing all of those tables in production at once, overnight.

But when it came time to actually execute this, we did a careful review of all these migrations and helpers and got somebody else to look at it. We ran the migrations many, many times on clones of the production database: run it, fix an error, and repeat. I learned to be very, very thankful for Postgres's transactional DDL. If, every time one of these migrations failed in the middle, I had not known that the schema was still unchanged rather than half-finished or corrupted in some way, it would have been a disaster. That safety net was fantastic. Fixing the error usually meant figuring out how to reflect on some new kind of dependency or constraint in Postgres that I hadn't seen before and updating the helper to deal with it, and sometimes just coding a workaround.

We tried to be very careful with this. We ran the migrations in staging to get the timings; at our company we have the luxury of being able to schedule downtime on the weekends, but we still wanted to know how long the maintenance windows would be. Like I said, we made the migrations reversible, with plans never to have to reverse them, but we try to make everything reversible when we're dealing with production data. And we built random spot checks into the migration helpers. Remember, I said we kept the uuid column for later record-keeping: at the start of the migration helper we would grab a random sample of records from the table we were about to change and store them, and then after doing the migration, before committing the transaction, we would verify that they all pointed to the same things they pointed to before. We wanted to make sure that even as we were running it in production, we had spot checks that would bail out if something went wrong.
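A simplified sketch of that spot-check idea (the helper name and column conventions are assumptions; it presumes the parent and child each keep their old key in a uuid column after the swap, as described above):

    # Sample some child rows before the key swap, remember which parent uuid each
    # points at, then re-verify the same pairs after the swap but before commit.
    def spot_check_references(child:, parent:, fk:, sample_size: 25)
      pairs = connection.select_all(<<~SQL).to_a
        SELECT c.id AS child_uuid, c.#{fk} AS parent_uuid
        FROM #{child} c
        ORDER BY random()
        LIMIT #{sample_size}
      SQL

      yield  # the actual primary-key conversion runs here

      pairs.each do |pair|
        matched = connection.select_value(<<~SQL).to_i
          SELECT COUNT(*)
          FROM #{child} c
          JOIN #{parent} p ON p.id = c.#{fk}
          WHERE c.uuid = #{connection.quote(pair["child_uuid"])}
            AND p.uuid = #{connection.quote(pair["parent_uuid"])}
        SQL
        raise "Spot check failed for #{child} row #{pair['child_uuid']}" unless matched == 1
      end
    end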

After stage two, nothing dramatic had changed about the layout of the data model, but everything was uniformly using integer primary keys, and we were able to start tracking source info for more things.

Stage three: it was finally time to separate the contacts so that there was no data shared between realtors. Swanand Pagnis, who has been my partner in crime for all of this work, ran point on stage three. He had a really good grasp of the problem and what was required, a much better grasp than I did, so he took charge and I played the support-and-validation role.

What drove this change? Well, I've already mentioned that nearly every query had to be filtered based on where the individual pieces of data came from, so that we wouldn't show, say, Jose's unlisted phone number, which he only shared with realtor Abby, to another realtor who wasn't supposed to know it. That was extra complexity; going through that polymorphic table was slow; and finally there was a business risk there. We knew that sooner or later we'd miss something and violate a customer's data privacy, and that was not something we wanted to live with.

So Swanand did a spike to evaluate different ways of accomplishing it. Doing it in Ruby was very straightforward, but it would have taken about a day per realtor, and at that rate we would still be doing it. Doing it in SQL required fairly advanced skills and techniques, but it took about ten minutes. So we went that way, and as with stage one, we decided on a realtor-by-realtor approach, migrating each realtor's data one at a time.

Swanand worked this out pretty carefully. This is a page from his notebook where he mapped out a fairly complex scenario, looked at all the implications, and proved to himself that he understood what needed to be done and how to do it. He posted it as an attachment on one of the PRs to document it for the rest of the team and let them see what he was up to.

The strategy works like this; I'm going to walk through a very simple example showing one contact that's shared between three realtors. First there was some initial preparatory work. We added an old_contact_id column to contact_relationships and pre-populated it with the current value of contact_id, so that after we changed the contact_id we could still know what the original one was. Then we added a contact_relationship_id column to contacts and populated it with NULL (which I'm representing with an emoji), and we added a uniqueness constraint on that column so that no two contacts could have the same contact_relationship_id.

So let's start with Alice. First, we update the contact_relationship_id for every contact owned by Alice, if it's NULL. You can think of this as claiming that contact record: this one now belongs to Alice. Then, for each of Alice's contact relationships, we insert a new record into contacts, but if there's a conflict, that is, if it would have the same contact_relationship_id as an existing record, we don't insert and instead just touch updated_at. Once that insert is done, we update the contact relationships to point at the new contacts we've created, where they exist.

This is all done in one big fancy query. It's simplified from what it really looked like, but it's still pretty complicated, so we can zero in on parts of it and see what it does. First, we select all of Alice's contact relationships and the contacts associated with them. Then we use that to insert new contact records based on all of those contacts. Some of them will work, but some will run afoul of that uniqueness constraint, and in the case where there is a conflict and you can't insert a new record because it would be a duplicate, we just set updated_at to reflect that something changed and don't really do anything else; in other words, that record was already claimed. Finally, this is all wrapped up in a common table expression called new_contacts that returns the newly inserted contact records, not the ones that got cancelled because of the uniqueness constraint, and that common table expression is used to update the contact_id field in contact_relationships to point at all of those new records.
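The real query isn't shown in full in the talk, so here's a hedged sketch of the shape being described, run one realtor at a time. The column lists are trimmed, realtor_id = 42 is a stand-in for the realtor being migrated, and it assumes the unique index on contacts.contact_relationship_id described above:

    realtor_id = 42  # hypothetical: the realtor currently being migrated
    conn = ActiveRecord::Base.connection

    # Step 1: claim any still-unclaimed contacts for this realtor's relationships.
    conn.execute(<<~SQL)
      UPDATE contacts c
      SET contact_relationship_id = cr.id
      FROM contact_relationships cr
      WHERE cr.realtor_id = #{realtor_id}
        AND cr.contact_id = c.id
        AND c.contact_relationship_id IS NULL
    SQL

    # Step 2: insert duplicates for anything already claimed by someone else,
    # then repoint this realtor's relationships at whichever contact row is theirs.
    conn.execute(<<~SQL)
      WITH new_contacts AS (
        INSERT INTO contacts (contact_relationship_id, created_at, updated_at)
        SELECT cr.id, now(), now()
        FROM contact_relationships cr
        WHERE cr.realtor_id = #{realtor_id}
        ON CONFLICT (contact_relationship_id) DO UPDATE SET updated_at = now()
        RETURNING id, contact_relationship_id
      )
      UPDATE contact_relationships cr
      SET contact_id = nc.id
      FROM new_contacts nc
      WHERE nc.contact_relationship_id = cr.id
    SQL

Copying the attributes from the old shared contact onto the new rows happens afterwards, as described next.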

So back to Alice: we try to insert, but there's a conflict on that uniqueness constraint, so the insert does not happen. Nothing really changes, because Alice had already claimed contact one. But let's look at Bill. We try to claim the contact for Bill by updating contact_relationship_id, but it is not NULL, so that fails: we don't update it and we don't claim that contact for Bill. Now we insert, and the insert works because it doesn't create a uniqueness violation, and then the update fixes Bill's contact-relationship record to point at that new contact.

But what about the attached attributes? Well, this is simpler than it looks. For each of Bill's contacts where the contact relationship's old_contact_id is not equal to the contact it now points to, we just go find that old contact and copy all of its attributes down to the new contact. It's a lot of queries, but it's basically straightforward. And then we would move on to Carl, but I'm not going to show you that part.

Once again, we ran all of this many, many times against a clone of production. We would run it for a realtor and compare that realtor's data against their production data. We did a complete run-through of converting all the realtors in staging before running it on production, and during that run-through I plotted the changes to table counts as a sanity check. And I saw something disturbing: one of those tables was growing much, much faster than the others, and faster than you'd expect. It turned out to be our old friend sourcings; remember, polymorphic tables complicate everything. Swanand went and looked at it, and you can probably guess what happened: there was an outer join that should have been an inner join. So we fixed that, reran the full run-through, and I was able to watch that graph level out into a much more sane shape, and we went ahead and ran everything.

At this point, this is what our data model looked like: no more shared contacts, but the join table was still there, even though it wasn't really necessary any more. And that leads to the final stage: getting rid of that join table.

What drove this change? Well, I've already mentioned that everything is just a little more complicated with that join table there. The database is representing it as a many-to-many association, while we now have rules that say it has to be a one-to-many association, so that required constraints and integrity checks that wouldn't have been necessary if the join table weren't there. It probably wasn't a big enough problem for us to proceed with it on the schedule we did, except that one of my team members, Monty Johnston, challenged me: what can you figure out to get rid of contact_relationships? I thought about it for a while, and then I realized there was a way to set this up so we could pursue the rest of the work opportunistically, instead of having to stop everything and focus on it.

The idea was that we'd go ahead and add the direct foreign-key relationship from contacts to realtors and populate it to match the existing contact relationships, and then it's a simple matter of making sure the two ways of representing the association stay consistent, using triggers. Rails developers are wary of stored procedures and triggers, for good reason, but sometimes they're exactly what you need, and this is one of those times.

I had something to overcome here: I had never written a trigger before. I knew what they were, how they worked, and the circumstances when you'd use them, but I'd never written one, so I curled up with the Postgres manual and tried to figure it out. And there are some complications. At the start of this work, all of our code was working directly with the contact_relationships table, so when you did an insert into contact_relationships to set up a new association between a realtor and a contact, we wanted a trigger to be invoked that would go and set the realtor_id on the contact, to keep the two in agreement. At some point our code would switch over to working directly with the direct association, and in that case, when we set realtor_id directly on a contact, or created a contact with realtor_id set, we would want a trigger to be invoked that would do an insert into contact_relationships, to keep the two consistent.

Now, if we could have guaranteed that there would be an instant where, on one side of that instant, everything works the old way, and on the other side, everything works the new way, that would have been fine. But I didn't have confidence that we could guarantee that, and we didn't want to test it. And that leads to the possibility of an infinite regress: you do an insert here, which invokes a trigger that sets realtor_id over on the contact, and that in turn invokes the other trigger, and it needs to know to stop because everything is already consistent. Likewise going the other direction: you set realtor_id, which causes a trigger to insert a contact relationship, which causes the trigger to go try to set realtor_id, and we need that one to stop because it's already consistent.

Triggers are hard, for me at least. It's hard to think about; I'm just not used to thinking that way. I had to learn a lot of arcane Postgres SQL syntax I wasn't familiar or proficient with. I had to control the conditions under which the triggers were invoked: for correctness, you don't even want to invoke the trigger if none of the data it cares about is changing. In some cases the trigger needed to happen before the initial update, and in some cases after. And then I had to carefully write those updates and inserts so they would only make a change if something was out of sync, so they wouldn't trigger an infinite regress. And let me tell you, if you've never gotten a stack-overflow exception from within the bowels of your database, you haven't lived.
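To make that shape concrete, here's a minimal sketch of one direction of the sync, wrapped in a Rails migration. The names and the exact guard conditions are assumptions, and the real pair of triggers had more cases to handle:

    class AddContactRealtorSyncTrigger < ActiveRecord::Migration[5.2]
      def up
        execute <<~SQL
          CREATE OR REPLACE FUNCTION sync_realtor_id_from_contact_relationship()
          RETURNS trigger AS $$
          BEGIN
            -- Only touch the contact if it is actually out of sync; this is the
            -- guard that stops the two triggers from invoking each other forever.
            UPDATE contacts
            SET realtor_id = NEW.realtor_id
            WHERE id = NEW.contact_id
              AND realtor_id IS DISTINCT FROM NEW.realtor_id;
            RETURN NEW;
          END;
          $$ LANGUAGE plpgsql;

          CREATE TRIGGER contact_relationships_sync_realtor_id
          AFTER INSERT OR UPDATE OF realtor_id, contact_id ON contact_relationships
          FOR EACH ROW
          EXECUTE PROCEDURE sync_realtor_id_from_contact_relationship();
        SQL
      end

      def down
        execute "DROP TRIGGER contact_relationships_sync_realtor_id ON contact_relationships"
        execute "DROP FUNCTION sync_realtor_id_from_contact_relationship()"
      end
    end

The mirror-image trigger on contacts uses the same only-if-out-of-sync guard, just pointed the other way.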

So we came up with a plan. Step one, again in the spirit of transparency, was to build a way to track our progress, to keep us motivated and to show the rest of the company we were making progress. Step two was to build a way to audit the activity of the triggers; this goes to validation, making sure that once they were in production they were doing what we expected. Step three was to add the foreign-key reference and the triggers. Then there were three steps where I moved individual fields that were stored on contact_relationships over to contacts directly, two steps where I retargeted polymorphic associations (sourcings and taggings) that initially pointed at contact_relationships and now needed to point at contacts, and three more where I just cleaned up all the rest of it, retargeting associations, scopes, and query fragments. And finally, step twelve was dropping the table.

For tracking progress, I just set up a little script to count the references to contact_relationship in our code base, and plotted it the same way as before. You can see we went from a little over 200 down to nothing, in about three months.
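That tracking script could be as small as something like this (illustrative only; the real one just needed to produce one number per day):

    # Counts occurrences of "contact_relationship" across the app and specs
    # and prints a CSV row to append to the progress log.
    require "date"

    pattern = /contact_relationship/i
    count = Dir.glob("{app,lib,spec}/**/*.rb").sum do |path|
      File.read(path).scan(pattern).size
    end

    puts "#{Date.today.iso8601},#{count}"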

For auditing the activity of the triggers, I updated the triggers to log each invocation into a contact_relationship_trigger_actions table, including which trigger it was, the IDs of the relevant entities if it had them available, a timestamp, and whether it actually performed an update as part of being invoked. I wrote a little utility script to audit that table for consistency, and we ran it periodically; if we had ever encountered an inconsistency, where the counts weren't pairing up the way I expected, we would have stopped everything until we figured out what was going on.

I did most of this work, and I'd imagine I spent eight to ten hours a week gradually taking it through one of those steps at a time. We did most of the deployments on weekends, just to minimize risk, even when no downtime was required. And finally, this is what the data model looks like, and it feels good. It is a structure that really suits our business, and we are poised for the next stage of our development.

So what can we learn from all of this? I recommend this incremental, slow-and-steady-wins-the-race strategy. It helped us contain risk, it allowed feature development to continue for most of the time we were doing this, and in spite of the fact that at times it seemed slow, it produced enormous improvement over time.

We always keep looking ahead to how we can improve our system and architecture. We have an informal pain inventory of what's causing us grief, what's too complicated, what we'd like to change in the future, and it allowed us to easily pick the worst one and get working on it when we had some free time.

You'll notice each of these stages was different: entirely different solutions were required at each stage. Stage one was Ruby magic; stage two was migrations and database reflection; stage three was fancy Postgres upserts with INSERT ... ON CONFLICT and common table expressions; and stage four was triggers. And there were entirely different testing strategies to go with them. There's no recipe. You have to figure out what works, and figure out ways to make yourself confident that everything is right before you pull the trigger on it.

One thing they all have in common, though, is leveraging the database. We Rails developers love Active Record and Arel for queries, and that's fine, but for all its flaws SQL is enormously powerful, and the referential-integrity protections that good databases offer can save you; learn to rely on them and get disciplined about it. As I mentioned, without Postgres's transactional DDL the risk of this effort would have been an order of magnitude larger. And finally, even stored procedures and triggers have their place, although I was particularly glad that in this case we had a clear time frame: we put them in, and at some point we could take them out.

As I mentioned, we have the luxury of being able to schedule maintenance windows on the weekend. If you don't have that luxury because of your business model, you have to explore other techniques, and I would recommend pulling in an experienced database consultant to figure out how to do some of these things in that kind of environment.

These kinds of tasks really benefit from a lot of focus, and frankly it helps to have somebody on your team who's maybe a little neuro-atypical and able to get super obsessed with these things and not let them go. But at the same time, that can blind you to "oh, I'm getting too clever here" or "this is too complicated, maybe it was a bad idea and we should rethink it." Make sure you come up for air and have somebody looking over your shoulder, second-guessing some of this, to keep you on the right track.

Would we do anything differently if we had it all to do over again? If we had clearly understood our end goal, it's probably pretty obvious that we could have done all four of those stages as part of stage one: just never copy the contacts over from Neo4j as shared records, never put the join table there, and we would have gotten all this way in the first go. And I really wish we had done that, except we didn't know that this was our goal. We knew that the benefit we were getting from Neo4j was not worth the cost of having our data spread across two databases, but we still kind of thought we were building a social graph. So we would probably do the same thing again, given what we knew then.

It's dangerous to try to guess too much about the future of your business and how it's going to change. On the other hand, there's one thing we did that I think we would clearly do differently in retrospect, simply because we didn't need to know the future of our business, we just needed to know some technical principles: we should never have used UUID primary keys for those tables. I didn't need to make the UUID the primary key to keep it around and use it to avoid duplication; I could have just stuck it in another column, which is where it ended up eventually anyway. UUID primary keys are only useful if you really need to distribute the work of generating new keys outside the database, and I can think of two cases where you need that. One is if contention on the sequence in the database is becoming a bottleneck; I've seen that happen. The other is if you have really tight low-latency requirements at the edges of your system, where somebody hands you a piece of data and you need to turn around and say "this is the ID you can refer to it by in the future" before you make the round trip to the database. Unless you have one of those situations, stick with integer primary keys, please.

So: these kinds of changes are costly and risky. I don't recommend that you all run back to work and say, "come on, team, we've got to do this." But if something's really causing you pain, you can make these changes safely by applying these three principles and being creative about the technical solutions you apply at every stage. Thank you very much. And there's that job link again.
