Experienced technical lead with 7+ years of engineering, project management, and design experience with a focus on delivering products in markets underserved by technology. I love finding solutions to ambiguous problems in a cross-functional role that brings measurable impact to real-world problems.
Purveyor of Software Development and Product Design
About the talk
RailsConf 2019 - How Checkr uses gRPC by Paul Zaich & Ben Jacobson
This is a sponsored talk by Checkr.
Checkr’s mission is to build a fairer future by improving understanding of the past. We are based in San Francisco and Denver. We have found some limitations in only using JSON-based RESTful APIs to communicate between services. Moving to gRPC has allowed us to better document interfaces and enforce stronger boundaries between services. In this session we will share the lessons we have learned while incorporating gRPC into our application, walk through setting up a new Rails project with gRPC support, and explain how we plan to expand our usage of gRPC in the future.
Alright, thank you for joining us today. We'll talk about how Checkr uses gRPC. First, an audience survey. Have you ever worked with an API without documentation? Raise your hands. It should be everyone. It happens all the time. Have you ever worked with an API with inaccurate documentation? Maybe a little less, but certainly for internal services; it may be a little more rare for public-facing services. What about an API that returns inconsistent types, or maybe you don't know what the
typing exactly is all the time? At Checkr, the number of internal endpoints and services has grown over the years. Our monolith today has over 500 endpoints, and there are 20-plus additional production services in the critical path of our application. And so we started to butt up against this problem of undocumented APIs, et cetera. What is Checkr? Checkr is an API-first background check company. We provide modern, compliant background checks for thousands of customers, companies like Uber
and Lyft and Grubhub and Instacart, and our API needs to be available. Otherwise, they can't hire, so our systems have to be resilient and available. I'm Ben Jacobson. I'm an engineer, and this is Paul. We're both engineers on the Checkr team. This talk will cover what gRPC is and the trade-offs of using it versus something like REST, and we'll walk through a detailed example of how to get started. So let's first start with what RPC is. RPC stands for remote procedure call, and it is simply any time
you have one service that wants to call another service and ask for information or ask it to do something. That process is an RPC call. In this example, service A says to service B, "Hey, I want you to do a thing," and service B responds with, "Okay, here's the thing." Historically there have been lots of ways to solve this problem. SOAP is a common way. Has anyone used a SOAP API here? We use SOAP a lot; a lot of our integrations are legacy integrations that use SOAP. This is an example SOAP call. You're basically precisely describing the call that you want to
make, how you want to make it, and with what input. There are more modern approaches for things like REST using Swagger. Here's a YAML definition of an endpoint, a stock endpoint for example, that will return information in a precisely defined way. But what is gRPC? gRPC is really just an open-source framework from Google that makes this whole process easier and makes it available in several languages: maybe not every language, but a lot of them. And this is what the IDL looks like for gRPC. It's very
similar to the SOAP version that we just looked at and the REST version, the Swagger version, that we just looked at. Basically, this is the IDL, the interface definition language: the thing that precisely describes the API that you want to implement. You can see here we have a stocks API, or stock service, that has a get-stock-price method, and you can see the exact request that it takes and the response we'll get back. As an overview, when I think of RPC, I think of it really as a bundling of these concepts. We have an IDL: a precise definition of what our service does.
We have an implementation, which is whatever the process is, human or otherwise, to actually implement that IDL. You have a protocol, which is basically how the two services, using the implementations, talk to each other. And then, importantly for us, there's documentation: how do you help others within your organization understand what's available? If we compare these concepts with REST or SOAP, which we are all familiar with, we can see that an IDL for REST could be something like OpenAPI or Swagger. For the implementation, maybe you could use Swagger Codegen or you
could hand-write an HTTParty class to talk to an endpoint. The protocol that you use is JSON or XML over HTTP, and documentation options might be Swagger UI, an open source product, or some kind of Slate documentation. For SOAP, it's very similar. There's a WSDL. Has anyone actually looked at a WSDL? It's a giant XML document that precisely describes the whole SOAP API. That's really an IDL; it's really just a description. Then you can pass that WSDL to an application, for example Savon, which is a Ruby implementation of SOAP, or
Zeep, which is Python. The protocol it uses under the hood is XML, et cetera. For gRPC it's really just the same: it's a group of technologies that help you implement this stack of concepts. For gRPC, the IDL is a proto definition. The implementation is actually generated for you; the code is generated by gRPC directly. The protocol it uses behind the scenes is something called Protocol Buffers, over HTTP/2 in this case. And the documentation tool we happen to use is an open source project
called protoc-gen-doc that helps generate documentation for the rest of our organization. So it's really just a bundling of these concepts. gRPC is an opinionated framework for achieving RPC, and you can think of it as convention over configuration, which at a Rails conference maybe we're all fans of. It's just a way to achieve this. So should you be using gRPC? The answer is probably not. If your organization is trying to scale and you have several services, it becomes more valuable, but you can get a long way with
REST. REST is great: render json: stuff. When I first saw this line of code in a Rails project, I thought, this is an incredibly powerful thing. You can move very quickly with this kind of code. It just doesn't scale to a team. If you do need more flexibility, you can start to introduce things like a StuffSerializer that can help you precisely describe what you want stuff to look like to the client, and this will get you even further. And JSON is readable everywhere. If we look at a JSON example, the document on the left, you can intuit what
it's trying to describe. Even a non-technical person could maybe intuit what it's trying to describe. A protocol buffer, by contrast, which is the way gRPC communicates, is binary, and it's much harder, maybe even impossible, for a human to directly see what's going on. So you should probably use REST for as long as possible within your organization. But JSON APIs start to feel a little like searching in the dark as you scale. As you have more services, it becomes harder and harder to understand what's available to you. Here we have service A, and it wants
to know what is available in service B. REST and JSON don't give you the tools out of the box to achieve this; you're going to have to go out of your way as engineers to write documentation or build it into your pipeline in some way. And things get even harder as one service turns into three services and three services turn into several services. gRPC is an approach to solving this problem. Here's the IDL that we looked at earlier, the proto file, and you can think of this as a schema.rb for
all of the services within your organization. A schema.rb is a very powerful thing. You can open up any Rails project and see exactly what data is persisted, how things generally work, and the connections between things. This is just a way to achieve that across services. The way it works is we have our IDL. It goes into an IDL repo. We run gRPC tooling on the whole IDL. That tooling outputs code in several languages; in this example, we have Python and Ruby. Then independent services can import that Python and Ruby
code and use it to implement either a call to the service or a response that gives data back to a caller. In this example, the Python client is service A, and it says, "Hey, I want to get a stock price, and I'm going to send you an explicit stock price request." Service B, our Ruby server, looks up the price and sends back an explicit stock price response. We can even map these concepts to things we should already be pretty familiar with. The IDL repo, again, you can think of almost like a create table users statement or your schema.rb file: it describes your application. The gRPC
tooling you can think of as rails db:migrate: it actually changes the state of something under the hood. db:migrate gives you a table; the gRPC tooling gives you code. Then there's the Ruby code you actually have to implement. For Active Record, you have to create a class that actually implements the thing, but you're given all the tooling necessary through ApplicationRecord. And when you actually want to use it, you just call save. In reality, Active Record is abstractly RPC under the hood: you're making a call to a different service to do
something, and you're getting back a response, in this case true or maybe a validation exception. So, our journey to gRPC. You can think of our journey over time like this: as more services became available and more and more engineers started working on the product, it became hard to juggle all of this together. So we started to slowly experiment with gRPC, introducing it to some services, and then over time we have introduced it to more services, especially new services that are spun up today. I wish I could stand up here and tell you that there's some
mythical point in time at which I can definitively say you should switch to gRPC: you know, you have 25 engineers, or you have 50 services. In reality, it doesn't exist. There is no exact right moment for you to consider this technology. It's a giant gray area. There are pros and cons to both, and in fact you can build a whole business forever using just REST and JSON. More specifically, this is where Checkr is in our migration. We still leverage REST and JSON and Swagger; our public API documentation is generated through Swagger. But a lot of our
internal services are now adopting this technology. And so now we're going to do a very detailed walkthrough. Paul is going to walk you through how to accomplish this. Thanks, Ben. I'm going to take you through a detailed walkthrough and show you the workflow we actually use at Checkr every day to implement new gRPC endpoints and services. Let's start out with an example in the Checkr context. So
one thing that Checkr has to do every day is understand when two records match each other based on an identity. One important component of that is being able to understand when two names refer to the same name. This is such an important piece of our stack that we want to be able to expose it to many different parts of our product and make it very seamless for these different services to interact with. So let's talk about what the first step is. The first step in this case, going
back to Ben's examples earlier, is to write the IDL for this particular name matcher service. First we need to define the service and what that service does. We have a NameMatcher service, and we're going to define a method that it can respond to, an RPC called Match, which takes in a MatchRequest and returns a MatchResponse. MatchRequest actually uses a custom-defined message as its arguments. You have a name A and a name B, and those two fields use PersonName, which is a new message type that we've defined.
That takes string fields: first name, middle name, last name. Finally, the service is going to respond with a MatchResponse with a Boolean for match and a float for confidence. When the developer has finished writing that definition, the first thing we do is push it up to GitHub, to our monorepo for the Checkr IDL, where we host all of our definition files. Then we go through the typical workflow, where a pull request is submitted, someone on the team reviews that pull request for the
new definition, we make any edits we need to make, and finally we merge it into our master branch on GitHub. At that point we use webhooks from GitHub to run a CircleCI build. This runs the code generation from the gRPC tooling and then pushes that bundled code as a new gem version to our private gem server via Gemfury. You could do that for any number of different languages. Additionally, we also trigger a webhook to Codeamp, which is our internal hosting service, and that will
run another build that generates HTML documentation. At that point anyone on the team can go to idl.checkrhq.net, which is on our internal intranet, and start to look at the documentation in HTML form. This is just a preview of what our name matcher docs are going to look like. Again, you can see it maps exactly to what we just created in our definition file, but now we have a nice readable version that anyone at the company can look at very easily. So how do you
actually install or start to use the auto-generated code? Like I said, we now have a private gem in our repository that you can start to access. You simply need to define a new source for the internal repository, add the Checkr IDL gem, and then specify the version that you want to start using. Generally, we always advocate for forward compatibility, or rather backwards compatibility, so you're always safe as a client using an older version of our IDLs. Then you'd run bundle install and you're
ready to go. Again, you can do this in any language that supports gRPC. In this case, maybe you include the Checkr IDL in your requirements.txt and run pip install -r requirements.txt. So let's talk about how you start to serve gRPC requests and what that looks like in Ruby. A very simple Ruby example looks like this. From the auto-generated code, you now have a service stub that you can start to inherit from; in this case I'm going to call it the IDL name matcher service. The service expects you to define a method that maps to the definition
file you just created earlier. So we need to write a def match that takes a MatchRequest. Once you have that, you can start to interact with the MatchRequest object that you defined in your definition file, and then you're simply expected to return a MatchResponse object as the final return value of your match method. As you can see here, there's nothing exposed around the network implementation, and you really just have to focus on what the code is supposed to be doing. So how do you boot the server up?
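As a minimal sketch of the handler shape just described, the snippet below uses plain-Ruby Structs as stand-ins for the classes the gRPC tooling would generate, so it runs without the grpc gem. The names PersonName, MatchRequest, MatchResponse, and NameMatcherHandler, the field names, and the matching rule are all assumptions for illustration; the talk does not show Checkr's actual matching logic.

```ruby
# Plain-Ruby stand-ins for the message classes that the gRPC tooling would
# generate from the IDL. These Structs are assumptions for illustration only.
PersonName    = Struct.new(:first_name, :middle_name, :last_name, keyword_init: true)
MatchRequest  = Struct.new(:name_a, :name_b, keyword_init: true)
MatchResponse = Struct.new(:match, :confidence, keyword_init: true)

# Hypothetical handler. In a real service this class would inherit from the
# generated service class instead of being a bare Ruby class.
class NameMatcherHandler
  # One method per rpc in the IDL: typed request in, typed response out.
  # No networking code appears here.
  def match(request, _call = nil)
    same = normalize(request.name_a) == normalize(request.name_b)
    # Placeholder rule: exact, case-insensitive equality of all three parts.
    MatchResponse.new(match: same, confidence: same ? 1.0 : 0.0)
  end

  private

  # Lowercase each name part; a missing part becomes an empty string.
  def normalize(name)
    name.to_a.map { |part| part.to_s.downcase }
  end
end

request = MatchRequest.new(
  name_a: PersonName.new(first_name: "Ben", middle_name: "", last_name: "Kenobi"),
  name_b: PersonName.new(first_name: "BEN", middle_name: "", last_name: "kenobi")
)
response = NameMatcherHandler.new.match(request)
puts "match=#{response.match} confidence=#{response.confidence}"
```

The point of the sketch is the shape rather than the rule: the method body only touches typed request and response objects, which is what lets the transport stay out of sight.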
The very simple explanation is that you create a new RPC server instance, attach it to a port locally, and then specify to the server that you want to handle that particular type of service definition, in this case the name matcher service. Then you just run that forever. So how do we use that in Rails? This is pretty great, but maybe we want to add some additional features and really make this ready for production. Well, we found that
there's a great open source gem called Gruf that is maintained by BigCommerce. It allows you to use a lot of the paradigms of Rails with gRPC. It's just a small framework that you include in your Gemfile, and you can start to use some of the paradigms there. The big paradigm with Gruf is that you create a controller class that maps to your service, and you can put those in your RPC directory. Then, again, you're just going to define a method that
maps to the service definition. On top of giving you a framework very similar to what you'd expect with a Rails controller, Gruf gives you some nice additional features too. You can start to look at middleware and logging your requests, you can look at authentication, and you can do a lot of other nice things before the request actually hits your controller. Finally, you can just run bundle exec gruf to boot up all of your service definitions. So what does
it look like to make a request from your Ruby client? Well, the first thing you need to do is create a new client stub that's pointed to the location of your running name matcher service. Now we have a client instance, but first we need to build the request. What does that look like? I just want to highlight here that you can start to interact with any of your definition objects and look at them to understand what format these requests are
supposed to take. In this case, if you are new to using the name matcher service, you can check out PersonName and see that it takes a first name, a middle name, and a last name, and those should all be strings. There is some really good error handling included here for free. If you try to assign to a field that's not available, you'll actually get a method-missing error here, and you should know right away that you're not using that object correctly. The same applies if you instantiate a new object directly with
arguments as well. One thing we've had some trouble with at Checkr in the past is that we structure names in different ways. We may use a string format in one case, an array in another, and a hash as the final representation. You can see here that we're very clear about what structure we expect, and we'll get an error immediately if we use the wrong schema. So finally, let's create our names here. Like I said earlier, we have to understand context around names all the time. So raise your hand if
you think that these are the same person: Obi-Wan Kenobi and Ben Kenobi. I think anyone who's seen Star Wars can probably make that connection, but as a business we need to understand all the time whether two names might be associated. Now we have our two names, so let's make a request. It's a little anticlimactic, but it's as simple as calling client.match with two keyword arguments, name A and name B, and we're going to get a response back. That response is going to be, again, a Ruby
object, and we can call response.match, and yes, Ben Kenobi and Obi-Wan Kenobi are the same person, and response.confidence is very confident in this case. One bonus we've had from using protobufs in general, and from using the IDL, is that it applies to other parts of our service communication beyond the simple client-server model. We use a lot of producer-consumer queues at Checkr, and being able to enforce message types there has been very useful for stabilizing some of our critical path. So very quickly, just to show
you how this might work in a simple example: if you wanted to send a name IDL object onto a queue so that other consumers could come pick it up, you would simply need to encode that IDL object in Ruby to binary, so that's the encoded name, and then publish that to your queue. Later on, a consumer, again in any language, which could be Ruby or some other language, can decode that message, get the expected output from it, and do work on it. So this
really helps you enforce boundaries across your services no matter what type of architecture you're using. Finally, let's talk a little bit about throwing and handling errors. If you've ever used networking before, you know that there are errors that are going to happen. First off, gRPC status codes: gRPC does have status codes, and they roughly map to different HTTP status codes. In this case you can see that the number zero maps to 200 OK. I'm sorry to say that
418 I'm a teapot is not mapped in the gRPC definition, so we'll have to just do without that one. So let's dive into what it looks like to throw an error on the server. In this case, we want to validate that each name that comes in with each request does have a middle name. Just for this very simple example, we want to raise back to the client that they should include a middle name. How do you do that? You simply raise a new error, GRPC::BadStatus, with
that particular status code, in this case invalid argument, which basically indicates that the client has done something wrong with its inputs. Then you can send back a string that says the middle name must be present, and your client can start to work with that. You may be saying this seems a little limiting in terms of what you can send back as far as error messages go. So let's say you want something a little more structured. You could easily define a hash of errors, similar to Active Record resource
errors, and then serialize those to JSON and send that back as the message that's included as part of the gRPC status error. And finally, if you really wanted to take this further and have structure across your entire IDL, you could define a common definition of what an error looks like in your own IDL and then just encode that as the message that's included with the error. On the client side, this is going to look like you're just handling Ruby exceptions. You can rescue GRPC::BadStatus,
and you can see details, which is actually going to be the error message that you've passed back. You'll get a message, which is a string representation of the code, and you can see the integer code as well, as defined in gRPC. In the auto-generated code you also get some additional nice-to-haves, like subclass versions of the different errors, so you can rescue GRPC::InvalidArgument specifically and handle that differently than another type of status error.
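The raise-and-rescue flow described above can be sketched roughly as follows. To keep the snippet runnable without the grpc gem, BadStatus and InvalidArgument here are small stand-in classes that mirror the shape of the gem's GRPC::BadStatus and GRPC::InvalidArgument (a code plus a details string); validate_name! and call_and_report are hypothetical helpers, and the JSON error hash is just one possible structure.

```ruby
require "json"

# Stand-in for GRPC::BadStatus: an exception carrying an integer status code
# and a details string. Defined locally so the sketch runs without the gem.
class BadStatus < StandardError
  attr_reader :code, :details

  def initialize(code, details = "unknown cause")
    super("#{code}:#{details}")
    @code = code
    @details = details
  end
end

# INVALID_ARGUMENT is status code 3 in the gRPC status code list.
INVALID_ARGUMENT = 3

# Stand-in for the GRPC::InvalidArgument subclass.
class InvalidArgument < BadStatus
  def initialize(details = "unknown cause")
    super(INVALID_ARGUMENT, details)
  end
end

# Hypothetical server-side check: collect errors in a hash and send them
# back as a JSON string in the error details.
def validate_name!(name)
  errors = {}
  errors[:middle_name] = ["must be present"] if name[:middle_name].to_s.empty?
  raise InvalidArgument, JSON.generate(errors) unless errors.empty?
  true
end

# Hypothetical client-side handling: rescue the specific subclass first,
# then fall back to the general status error.
def call_and_report(name)
  validate_name!(name)
  "ok"
rescue InvalidArgument => e
  "invalid argument (code #{e.code}): #{JSON.parse(e.details)}"
rescue BadStatus => e
  "other failure (code #{e.code})"
end

puts call_and_report({ first_name: "Ben", last_name: "Kenobi" })
puts call_and_report({ first_name: "Ben", middle_name: "Q", last_name: "Kenobi" })
```

Ordering the rescue clauses from most to least specific is what lets a client treat bad input differently from, say, an unavailable service.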
So just to take a step back, we want to share a few takeaways we've had at Checkr over the last year of using gRPC. First off, we just want to re-emphasize that REST can take you a long way. We're still using REST all over the place at Checkr, and it's alive and well. Secondly, we see gRPC as an opinionated framework for service communication. And finally, where we've seen gRPC start to influence our development cycle is that it's starting to force us to think about
contract-driven development. When we start to build new services, we really think about what the service boundaries should look like and what those services should do in their roles in our system. To give a quick shout-out again to open source: we're using gRPC; Gruf is the framework that we're using inside of Rails to make gRPC easier to use; and finally we use protoc-gen-doc for our documentation internally. Thank you. So we still have about 10 minutes. If you have any questions, we could take them offline, or you can ask questions now. Yes?
It can be one, or a group of services. You can run multiple services on one server. It's up to you how you think that makes sense and what makes sense for you. Just to elaborate a little bit more: using the simple Ruby example, you can handle multiple services that you've defined and just have a Ruby server routing to those different service calls. Yeah, so there are two questions. The first is that all of these generated libraries now become dependencies, and that is true.
These are now dependencies of every application that wants to make calls to your service, and there's a build step to build the dependency so that you as a developer can just import it directly. Your second question was whether you have to reimplement a lot of your data model again in these IDLs, and the answer is yes. As you're slowly converting to this, if you choose to do so, you would have to reimplement, or at least re-describe, your data model so that each service that wants to communicate with whatever
you're implementing knows exactly what it's going to get in response. Under the hood, you can continue to use Active Record to save the data or whatever, but you're going to need a gRPC object to represent that data to the other services. Yeah, so the question is whether you can imagine a world where Active Record and the IDL are maybe the same object or share some kind of code. I think that's a really good idea. I don't think I've seen it done in my research, but it's certainly something that's possible to do,
and that would at least save you some amount of code duplication on your server when you're implementing the match call or whatever you're implementing. That's a good callout. Yes? The question is whether we utilize streaming at all. We don't today. We didn't really talk about it, but gRPC uses HTTP/2 and has bidirectional streaming out of the box. We don't have a use case, at least today, that I've seen where that would be useful. You might have a business that does real-time messaging or something to a client where it would be
useful, but in our domain it hasn't been particularly useful for us. Yes? Okay, so the question is whether we use it only for trusted back ends and whether we handle authentication today. Today we do use it only for trusted back ends within our VPC, so authentication hasn't become a huge concern yet. But as we add it to things that are maybe closer to the public internet, it's something we might have to start considering. I can't speak to how we would implement it quite yet because we
haven't had that problem. We do have some Node services in that case, which are still trusted back ends, but that's where it may be starting to touch the surface of public APIs. So the documentation says, I think, that you can swap out different implementations: you could use JSON instead of Protocol Buffers if you wanted. I don't think anyone at Checkr has experimented with that; there hasn't really been a need. But if you did want to swap out how the
protocol works, I think you have the ability to do that. There are translation layers that you can run on top of your infrastructure. Say a call comes in that is JSON: you can convert that call into something that then calls into gRPC. There are a lot of different frameworks that do that. I don't think we have any of those running in production today, but again, it's something that we might start thinking about in the future. gRPC-Web? Yeah, so that's one of the translation layers. I think there are a few, but gRPC-Web will convert a JSON request
into a gRPC call to your gRPC service and then convert the response back to JSON so you can consume it from, like, a web service. Today we don't do that. All of our public calls are through JSON APIs that you would normally just build with a Rails app. Again, it's something that we would consider in the future. Today our focus has been on internal communication between services, so the protocol and things like that haven't mattered as much, but if we ever do want to do it on the web, we would have to start to investigate those solutions. Thank you.