Sandeep started coding and creating websites when he was 12 and hasn't stopped. He is passionate about building easy-to-use products people love. Before Google, he founded an IoT startup in agriculture and developed educational HTML5 games. At Google, Sandeep's goal is to make cloud easy and help developers create the next big thing. He works on cloud native solutions such as Docker, Kubernetes, gRPC, and Istio. Sandeep loves video games, making music, and martial arts, and has Bachelors in Marketing and Co
About the talk
Are you building or interested in building microservices? They are a powerful method to build a scalable and agile backend, but managing these services can feel daunting: building, deploying, service discovery, load balancing, routing, tracing, auth, graceful failures, rate limits, and more. This session will show you how the Kubernetes container management system and Istio service mesh can simplify many of the operational challenges of microservices, including an in-depth live demo.
00:30 Microservices are awesome
03:10 Microservices are terrible?
06:31 Managing services
11:21 Kubernetes clusters
15:40 Cluster deployment
20:50 Running multiple copies
25:41 Unblocking process
30:04 Istio dealing with faults
34:15 In-production testing
All right, good morning. My name is Sandeep, and today I want to talk to you about microservices with Kubernetes and Istio. I've been working with Kubernetes for the past three years, and it's grown a lot. And Istio is the next new hot thing coming out, so you're all here to learn about it. So let's go.
Because we all know microservices are awesome, right? When you have a traditional application, a monolith, you have all of your logic in one app: Java, Node, whatever. You have all the code in your one app. It would do all the parts, maybe using something like object orientation, so you have different classes that do different things. But at the end of the day, it's one big program that does all the stuff. There are real advantages to this monolithic type of development: it's easier to debug and it's easier to deploy. You just have one thing, you push it out, and you're done. When you have a really small team and a very small app, it's really easy to do. The problem with the monolith comes when your app starts to grow a little bigger. Now you have all of these things and all these different pieces, but you still have one app that's doing all of it, right? So anytime you want to make a change in invoicing, you have to redeploy the whole thing, even though the ETL pipeline might not have been affected at all. And maybe the invoicing part is really good to do in Java, the ETL pipeline you want in Python, and for the front end you want to use Node.js. You can't really do that. If you want to make an update to one little part, you have to touch the whole thing, and things start to get really hairy really quickly.
So with microservices, each one of these becomes its own individual component that can be written in its own language and updated and deployed independently of the others. This gives you a lot more agility, a lot more flexibility, a lot more speed, and a lot more cool things like that. And so you're like, yeah, that sounds really good. Why is anyone using a monolith? Because once you start growing even bigger, it gets even more complicated. Honestly, a monolith would fall apart at this point too, but even with all the advantages of microservices, they start to get really hard to manage, because now you don't have just a few microservices, you have 2,000 microservices, right? How do you know which one talks to which one, and how does the spider web of connections even work? You might have a single request that comes into your load balancer and gets sent to 20,000 different microservices and then all the way back up, and if that chain breaks anywhere, your whole application looks like it's broken. So debugging this becomes super hard, and so does deploying things and making sure everything is in lockstep with each other. It becomes really difficult to try to manage this huge stack of services.
So are microservices terrible? Maybe, but we can use a lot of tooling to try to make them more manageable. So let's use tools to automate the infrastructure components, automate the networking components, and automate the management side of things. When you look at a monolith versus microservices, they both have a lot of complexity. The difference is how much of it you can automate. There are no tools out there that automatically write code for you and can maintain a 2-million-line Java app, right? We'd all be out of jobs, maybe in a few years. But now we do have tools like Kubernetes, like Docker, like Istio, like Google Cloud, that let us automate our infrastructure a lot more easily. So a lot of the issues that we have with microservices can be automated away.
Let's take a look at some of those tools. Who's familiar with Docker? A lot of hands go up. Yep. For those who are not familiar: it doesn't matter whether it's Node code, Python, Java, whatever. It doesn't matter if it uses ImageMagick or some random library that some 13-year-old kid in Ukraine compiled. You put it in your Docker container and you're good to go. All you care about is running the container; you don't have to care about running what's inside of it. Basically, you take your code and your dependencies and you put them into a generic container format. It doesn't matter what's inside: as long as you can run a container, you can run all the containers.
Once you have a container, you have to actually run it on your cluster, right? Logging into a single machine, running docker run, and then doing that a thousand times seems like a waste of time to me. Instead we can use something like Kubernetes, a container orchestration system. What we can do with that is say, hey Kubernetes, run this container five times somewhere in my cluster, and what it will do is run your containers for you automatically. Let's say you wanted to run this container, that container, that container, and that container, and you want two copies of each. You just tell Kubernetes that, and it figures out how to run them. The really nice thing about this is that if a server crashes, it'll figure out that the server crashed and run them somewhere else. If an application crashes, it will figure that out too. If you want to scale up from two copies to four copies, you can say, hey, make it four, and it will spin up two more. If you want to go down to one because you want to save some money, you can say, hey, make it one, and it will remove three. So this syntax makes it really easy to manage containers running on a cluster. With a lot of these running, trying to do it manually is impossible. Keep your hands up if you've used Kubernetes before. Okay, cool.
But this is just the starting point, right? Now that you have these containers running, you have to manage the services, because you have container A talking to container B talking to container C talking to container D. How do you manage that set of connections going between each other? How do you set rules on who can talk to whom, how many retries you should have, how much network traffic you should send? This becomes really complicated, and that's where Istio comes into play. Istio is a service mesh, which at the end of the day means that it manages your services and the connections that they make between themselves. So we go from orchestration to management and communication. You want to manage how these services interact with each other, because the communication between them is what really makes microservices so useful. You can reuse one service, it can talk to three different ones, and you can mix and match and do all these really powerful things, but it becomes hard to really understand what's happening. Istio makes it a lot easier to understand what's happening and to control what's happening. All right, that's enough of me talking about random things. Let's actually move to a demo.
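The "run this many copies" behavior described above (replace what crashed, add copies when the number goes up, remove copies when it goes down) can be pictured as a loop that compares a desired count with what is actually running. The following is a toy Python sketch of that idea only. The `ToyCluster` class and its method names are hypothetical, and this is not how Kubernetes itself is implemented.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class ToyCluster:
    """Toy stand-in for an orchestrator (hypothetical, for illustration only)."""

    desired: int = 0
    running: list = field(default_factory=list)
    _ids: count = field(default_factory=count, repr=False)

    def reconcile(self) -> None:
        """Start or stop copies until the running count matches the desired count."""
        while len(self.running) < self.desired:
            self.running.append(f"copy-{next(self._ids)}")
        while len(self.running) > self.desired:
            self.running.pop()

    def crash(self, name: str) -> None:
        """Simulate a container (or the server under it) going away."""
        self.running.remove(name)


cluster = ToyCluster(desired=2)
cluster.reconcile()                 # two copies start
cluster.crash(cluster.running[0])   # one copy is lost
cluster.reconcile()                 # the lost copy is replaced
cluster.desired = 4
cluster.reconcile()                 # scale up: two more start
cluster.desired = 1
cluster.reconcile()                 # scale down: three are removed
print(cluster.running)
```

The point of the sketch is that the operator only ever changes `desired`; the loop works out what to start or stop.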
And if you can go to the screen, thank you very much. Okay, and I can't see anything. So what we're going to do first is go back to the slides real quick, so I can tell you what we're going to do. What we're going to do is walk through the end-to-end story of taking an app and making it into a microservice. I'm going to take an app, put it into a container, run the container locally, and store the container in the cloud. Then I'm going to create a Kubernetes cluster, run our app on that cluster, and then scale it out. And then we're going to do something a little bit more.
So, our simple app. Basically, it's a web server that listens on slash, and it will ping a downstream service. It'll call time.jsontest.com, concatenate the response, and send it back to us. So let's run it locally right now. Nope. Nope. Nope, I don't want no commercials. Okay, so if I run npm start... It's listening on port 3000. So let's go to port 3000; here we can do a web preview. So it looks like we're basically running on a local machine.
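The app in the talk is a Node.js server, and its source is not shown in this transcript. As a rough illustration of the behavior described (listen on `/` on port 3000, call an upstream, concatenate the response), here is a minimal Python sketch. The service name, the response format, and the helper names are assumptions made for the example.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Callable
from urllib.request import urlopen

SERVICE_NAME = "demo-service"              # hypothetical name for illustration
UPSTREAM_URI = "http://time.jsontest.com"  # upstream mentioned in the talk


def fetch_text(url: str) -> str:
    """Fetch the upstream body as text (performs a real network call)."""
    with urlopen(url, timeout=5) as response:
        return response.read().decode("utf-8")


def handle_root(service_name: str, upstream_uri: str,
                fetch: Callable[[str], str] = fetch_text) -> str:
    """Call the upstream and prepend this service's own line to its response."""
    upstream_body = fetch(upstream_uri)
    return f"{service_name} -> {upstream_uri}\n{upstream_body}"


class Handler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        if self.path != "/":
            self.send_error(404)
            return
        body = handle_root(SERVICE_NAME, UPSTREAM_URI).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 3000) -> None:
    """Start the server on port 3000, as in the demo (not called here)."""
    HTTPServer(("", port), Handler).serve_forever()


# Offline check with a fake upstream, so no network access is needed.
print(handle_root(SERVICE_NAME, UPSTREAM_URI, fetch=lambda url: '{"time": "placeholder"}'))
```

Passing `fetch` as a parameter keeps the concatenation logic testable without touching the network.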
So you can see here that we go to time.jsontest.com, and we print out our current service name and version, and we see the current time. If I refresh this, the time will change, right? So now let's take this and put it into a Docker container. To make a Docker container, basically what we do is make something called a Dockerfile, which is a set of instructions that creates the container. Here we start with node alpine. That's kind of a base image; it just has a bunch of Node stuff out of the box, so we don't have to worry about installing Node.js. Then we copy in our package.json, which has our dependencies, run npm install to install the dependencies, copy in our index.js, which is our code, expose some ports, and then run npm start. Once you run docker build, it'll create a Docker container. For the sake of time, I've already built this container, so let's run it locally. Here's where I run that container once I've built it. Okay, nope, I need to build it. It's just not found.
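The Dockerfile itself is not reproduced in the transcript. To make the sequence of steps concrete, the Python sketch below renders a Dockerfile that follows the order described above. The `WORKDIR /app` line, the exact file paths, and the single exposed port 3000 are assumptions added for the example, not details confirmed by the talk.

```python
# Each step is (instruction, argument, comment). The order follows the talk;
# WORKDIR and the concrete paths are assumptions for this sketch.
DOCKERFILE_STEPS = [
    ("FROM", "node:alpine", "base image with Node.js preinstalled"),
    ("WORKDIR", "/app", "assumed working directory"),
    ("COPY", "package.json ./", "copy the dependency list first"),
    ("RUN", "npm install", "install the dependencies"),
    ("COPY", "index.js ./", "copy the application code"),
    ("EXPOSE", "3000", "port the app listens on"),
    ("CMD", '["npm", "start"]', "start the app"),
]


def render_dockerfile(steps) -> str:
    """Render the steps as Dockerfile text, one comment line per instruction."""
    lines = []
    for instruction, argument, comment in steps:
        lines.append(f"# {comment}")
        lines.append(f"{instruction} {argument}")
    return "\n".join(lines) + "\n"


print(render_dockerfile(DOCKERFILE_STEPS))
```

Copying `package.json` and installing dependencies before copying the code lets Docker reuse the cached dependency layer when only the code changes.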
So this might take a little bit of time. While it's building, let's switch to building our Kubernetes cluster. Here in our console, I go to Google Kubernetes Engine. Google Kubernetes Engine is Google's managed Kubernetes offering, and it's probably the easiest way to create a production-ready Kubernetes cluster. What you can do is just click Create Cluster, and you get a bunch of options for creating this cluster. You can make it a zonal or a regional cluster, and this is a really cool feature where you can have multiple masters, so it's highly available: even if a zone goes down, your cluster will still be up and running. You can choose the version of Kubernetes you want and the size of the cluster. Let's not do that. There are a lot of other cool features that Google Kubernetes Engine gives you out of the box, things like automatically upgrading your nodes, automatically repairing them if they get broken, logging, monitoring, and a lot more. For example, you can turn autoscaling on, which means that if you have more containers than can fit in your cluster, Google Kubernetes Engine will automatically scale it up to create the space. Even better, if you have fewer containers, it will scale it down to save you money, which is really cool, I think. And then all you have to do is hit Create and it'll create that cluster. I'm going to hit Cancel, because I already made one. It's like a cooking show: we have one ready in the oven. This is still going. That's okay.
What we'll do instead: I already built this and pushed it, so we're going to do the opposite of what I said. I'm going to pull it down so I can run it. That's okay. So let's run that container locally. What we're going to do with Docker is a docker run, and I'm going to open up that port 3000. If you go back here, you can see that it's basically the exact same thing, except now it's running in Docker. And the really nice thing is we didn't have to change any of our code.
You just put it into a Docker container and we're good to go, right? Okay, cool. So the next step is actually pushing it to a container registry. The reason we have to do this is that we can't just keep our containers on our local machine; we have to put them into a secure location so we can run them on a cluster. Google Container Registry is probably the best place to put them if you're running Google Kubernetes Engine. So let's actually push it up. That's not how you spell make. Make. I love using make; it makes demos super easy. So what we're going to do is run docker push. Can people read that, or should I make it a little bigger? All right, people can read it. And then we give it the name of the container. What will happen is it'll go to Google Container Registry. Here in my istio-test container, you can see that we have the newest one pushed up right now with the tag version 1.0. Something new here is vulnerability scanning, so we can actually scan your containers for known vulnerabilities automatically for Ubuntu and Alpine base images. That's a really cool thing, especially for older containers that you haven't updated in a while: you go back and look and you'll have tons of vulnerabilities. So it's a really good idea to check this out. It's in beta right now, but it'll find a bunch of stuff wrong with your things and tell you how to fix them. Then you usually just update your containers and you're good to go.
So now that we have this pushed up, we can start to deploy to a Kubernetes cluster. Like I said before, we already have a cluster created. Let's go back there. Now, to connect to it, we just run this command. All right, let's run it. So now we can say kubectl, the Kubernetes command-line tool, get nodes, just to make sure everything is working. Is everything working? Good question. Yep, there it is. So we have a four-node Kubernetes cluster. They're all good to go, running version 1.9.7. Now that we have our Kubernetes cluster and are authenticated to it, we can start running the same container that we ran locally. So.
What we're going to do is kubectl run demo, and then give it the name of that image and the name of the deployment. Now I can say kubectl get deployment, and then give it that namespace. The namespace is just for this demo; don't worry about it, you don't need it in real life. You can see that we have our demo deployment created. And if I say get pods (pods are basically the containers in Kubernetes), you can see that we have our container created. It's ready and it's running. But now that it's running, we have to be able to actually access it, right? This is where it gets a little bit more complicated than in the old world. It's running on one of those four nodes, and we really don't care which one. So we want a static endpoint that can route traffic to that pod no matter where it's running. What we can do is create a service. And I forget what my make command is. That's okay, I'll just look at my cheat sheet. Let's see, expose. Cool. So what we'll do here is run kubectl expose, and we'll give it the target port of 3000, which is where our Node.js app is listening, and then run it on port 80, which is the normal HTTP port. And then type LoadBalancer, which will give us a public IP address using the Google Cloud load balancer. So now we can say get service. I see most people here are familiar with Kubernetes. If you're not familiar, come talk to me. I know I'm going kind of quickly, but there are more exciting things to happen soon. So we're going to run a watch on that... and it's done. The moment I try anything complicated, it just finishes. It knows. All right, cool. Let's go to that URL. Let's copy it. All right.
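The expose step above creates a Kubernetes Service that listens on port 80, forwards to port 3000 on the pods, and asks for a load balancer. A rough equivalent can be written as a manifest; the Python sketch below builds one as a plain dictionary and prints it as JSON. The selector label `run: demo` is an assumption about how the pods are labeled, and the helper name is hypothetical.

```python
import json


def service_manifest(name: str, selector: dict, port: int = 80,
                     target_port: int = 3000,
                     service_type: str = "LoadBalancer") -> dict:
    """Build a Service manifest: `port` is what clients hit, `targetPort` is the pod port."""
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {
            "type": service_type,
            "selector": selector,
            "ports": [{"port": port, "targetPort": target_port}],
        },
    }


# The label below is an assumption about how the demo pods are labeled.
manifest = service_manifest("demo", {"run": "demo"})
print(json.dumps(manifest, indent=2))
```

The Service, not any individual pod, is the stable endpoint: pods can move between nodes while the Service keeps matching them through the selector.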
All right, we'll just do this. And if you go here, you can see the exact same app running on our cluster. So now we have it running on a public IP address; if you go there on your phone or laptop, it will work. But if all of you go to this IP address on your phones, it will probably overwhelm our Node.js app, so we can scale it out with the scale command. Let's do that. So it's kubectl scale deployment demo, replicas equals five. Deployment demo scaled. Now if we go back to our get pods, you'll notice that we actually have five of them running in our cluster, which is really cool. And if we say get deployments, you can see it also says that we have five desired, five current, five up-to-date, and five available. And if I go back to the IP address, it still works. The only difference is that it's now round-robining traffic between all five instances of our application, right? So it's a lot more scalable, and we're able to run multiple copies on a single VM, so we get more utilization out of our machines. Okay. Let's say clean. All right, so let's clean that up; we no longer need that.
Let's start looking at Istio. We had a single service running, which is kind of useless in the grand scheme of things, right? If you just have one thing, you don't need to do all this. What you really want to do is run multiple applications together and make sure they all work together well. So what we're going to do is run this same application three times and then chain them together. You might have noticed this in our code; let me go back to it for a second. Basically, I can set whatever I want as an upstream URI using an environment variable. So what I can do is tie multiple of these together, with one talking to the next, which talks to the next, and finally to time.jsontest.com. They will all concatenate their responses and finally show the result to the end user. So let's take a look at that. Normally in Kubernetes, what we do is create a YAML file. Let me make this a little bigger.
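The chaining idea (the same app deployed several times, each copy pointed at the next one through an environment variable) can be simulated in a few lines of Python. In this sketch the environment variable name `UPSTREAM_URI`, the service names, and the in-memory `registry` that stands in for name-based service discovery are all assumptions for illustration.

```python
from typing import Callable, Dict, Mapping


def upstream_from_env(env: Mapping[str, str],
                      default: str = "time.jsontest.com") -> str:
    """Read the upstream from an environment mapping (variable name is assumed)."""
    return env.get("UPSTREAM_URI", default)


def make_service(name: str, env: Mapping[str, str],
                 registry: Dict[str, Callable[[], str]]) -> Callable[[], str]:
    """Create a handler that calls its upstream and prepends its own line."""
    upstream = upstream_from_env(env)

    def handle() -> str:
        next_hop = registry.get(upstream)
        # Anything not in the registry is treated as an external endpoint.
        upstream_body = next_hop() if next_hop else f"<response from {upstream}>"
        return f"{name} -> {upstream}\n{upstream_body}"

    return handle


registry: Dict[str, Callable[[], str]] = {}
registry["backend"] = make_service("backend", {}, registry)
registry["middleware"] = make_service("middleware", {"UPSTREAM_URI": "backend"}, registry)
registry["frontend"] = make_service("frontend", {"UPSTREAM_URI": "middleware"}, registry)

print(registry["frontend"]())
```

The same code runs in every position in the chain; only the configured upstream differs.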
In a YAML file, we define our deployments. I'm going to make four deployments. The first one is called frontend-prod, so frontend production. The service name is frontend-prod, and the upstream URI I'm going to call middleware. Kubernetes will automatically do DNS-based service discovery, so I can find my middleware application by just pointing it at middleware. This is really cool: I don't need IP addresses or anything like that. You can see here we're giving it some labels, app frontend and version prod, and then the same container name that we used before. In middleware, everything looks exactly the same, except that it's app middleware, version prod, the service name is different, and the upstream URI is now backend. Then we also have a canary version of our middleware. Think of it as a new version that we're working on, to see how it works in the real world. Again, it also points to backend. And then finally our backend, which points to time.jsontest.com.
Okay, so now we'll create these deployments. But of course, we also need to create the corresponding services so they can actually find each other. So let's take a look at that. Here in our services YAML, it's pretty straightforward. We have a frontend, a middleware, and a backend service. They all map port 3000 to port 80, which makes sense. The only difference is that the frontend service is of type LoadBalancer, so that we get that public IP address. So let's go and see what happens. Okay, still pending. That's fine. Okay, so, interesting. It looks a little bit different than before. You can see that our frontend-prod went to our middleware, then the middleware went to our backend, and our backend went to time.jsontest.com. But we're getting a 404, which is really weird, because if I go to this website, clearly it works. It was working all the time before, right? So why are we getting a 404?
And so now we're getting into the world of Istio. I actually have Istio already running in this cluster, and what Istio is going to do is really lock down and help you manage your microservices. It knows that time.jsontest.com is an external service, and by default it's going to block all traffic going out of your cluster. This is really good for security reasons. You don't want your apps talking to random endpoints on the internet, right? You, as a cluster administrator, want to be able to lock things down and only talk to trusted endpoints. So how do you unblock it? In Istio, we have something called egress rules. It's a pretty simple thing; you can see it's only a few lines. We're going to say: allow traffic to time.jsontest.com on both port 80 for HTTP and port 443 for HTTPS. So let's deploy that rule. For Istio we use istioctl, Istio's command-line tool. It's just like kubectl, or kube control, or kube-cuddle; I don't know how to pronounce it, your guess is as good as mine. It's very similar to the kubectl tool. Now that we've created that, we can go back to our website and hit refresh, and you can see it's all working perfectly fine: the traffic goes through to time.jsontest.com.
Now, you might notice another thing going on when I refresh. Look at this line right here. Sometimes it says canary and sometimes it says prod. The reason this is happening is that by default, Kubernetes uses round-robin load balancing for its services. Our middleware service points to any deployment, or any pod, that has the label app middleware, and both the canary and the prod versions have that label. So Kubernetes will blindly send traffic to both. In fact, if you had three replicas of canary and only two replicas of prod, a disproportionate amount of traffic would go to canary, three to two, because it's just round robin. With Istio, we can make traffic go exactly where we want. Let's do that. We can create something called a route rule. Basically, we're saying: whenever the destination is middleware, always send it to prod. And whenever the destination is frontend, send it to prod, and the same for backend. Super simple things, but what this means is that now when I hit refresh, it'll always go to our production service. So as an end user, I never have to worry about it. I'll always hit prod, and I won't hit some random test build that's running in my cluster, which is really nice.
But this is great when our app is bulletproof, right? It's so simple it never breaks. But in the real world our code is not perfect, and things break all the time. To simulate that, I have this really cool function called create issues. It'll look at a header called fail, then generate a random number, and if the number is less than the header value, it'll just return a 500. Yeah. I didn't know how else to make things break on purpose.
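The fault-injection helper described above is part of the demo's Node.js code, which is not shown in the transcript. A minimal Python sketch of the same idea follows: read a `fail` header, draw a random number, and return a 500 when the draw is below the header value. The function name, the plain-dict headers, and the injectable `rng` parameter are assumptions for the example.

```python
import random
from typing import Callable, Mapping


def create_issues(headers: Mapping[str, str],
                  rng: Callable[[], float] = random.random) -> int:
    """Return 500 with probability equal to the `fail` header, otherwise 200."""
    raw = headers.get("fail")
    if raw is None:
        return 200  # no header: behave normally
    try:
        threshold = float(raw)
    except ValueError:
        return 200  # unparsable header: ignore it
    # rng() is uniform in [0, 1), so this fails about `threshold` of the time.
    return 500 if rng() < threshold else 200


# Deterministic examples with a fixed "random" draw.
print(create_issues({"fail": "0.3"}, rng=lambda: 0.2))  # draw below 0.3
print(create_issues({"fail": "0.3"}, rng=lambda: 0.5))  # draw above 0.3
print(create_issues({}))                                # no header
```

Taking `rng` as a parameter makes the behavior reproducible in tests while leaving the default random in normal use.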
I'm using a tool here that handles JSON and things like that automatically, so let's just do a normal request right now. It works just the same way as sending a request from the web browser, but I can actually send headers from this tool. Let's set fail to a value of 0.3, so it's a 30% chance of failure. Let's see what happens. Backend failed. Boom, middleware failed. Everything worked. Backend failed again. Everything worked again. Middleware failed. There, everything failed, right? And you might notice that the problem is actually worse than we think, because in our code what we're actually doing is propagating these headers on each request, so each service has a 30% chance of failing, right? That's where all these cascading failures come into play with microservices, because one failure can trigger tons of failures downstream.
So let's take a look at how Istio can help us with this. Let's add our second routing rule, something called a simple retry policy. We'll have Istio automatically retry the request three times before giving up. And again, we don't have to change our code at all, right? Istio is transparently proxying all of these network calls. Well, proxying is a strong word; it's transparently managing all these network calls. So what happens is our code doesn't have to know at all that Istio is trying it three times. It just tries once, and Istio manages all the backoff and the retries and all that kind of stuff for you automatically, so you don't have to. Now if we go back to Postman, let's see what happens. Working, working, working, working, working, working, working. Much better. Obviously, it didn't fix the issue. In fact, if I increase this to something like 0.5, you'll see it fail a lot more. Or not; it might not fail, it might mask that. Okay, you know what, 0.9. All right, big failure. And you might notice it's failing a lot at the frontend too, right?
So let's take a look at two things. One, it's able to mask your failures, but you don't want it to always do that, right? Just because things are working doesn't mean things are good. You want to be able to detect that you have errors and fix them. So Istio actually gives you a lot of tooling out of the box to monitor your systems. The first thing I want to show you is something called Service Graph. Because Istio is sitting there and intercepting all those network calls, it's actually able to create a full picture of your services for you. You can see here we have our frontend talking to our prod and canary middleware, talking to our backend, talking to time.jsontest.com. We can also automatically start getting metrics from our cluster as well. Wow, that looks way too zoomed in. All right, let me zoom out a little bit. You can see here that after adding those errors, our global success rate started crashing to the bottom, right? Istio will automatically find your 500s, your 400s, your volume, your QPS, your latency, all this stuff, automatically for you, and start throwing it into Prometheus and other dashboards, so you can start putting them into your systems, running metrics, and understanding what is going on in your services. And I didn't have to write any code for this. Istio gives me all these metrics out of the box for free.
And then finally we can use something like Zipkin or Jaeger to do tracing. A lot of these tools all work. Let's find some traces. You can see here our frontend talks to our middleware, which talks to our backend. Obviously the backend takes the longest amount of time, because it's talking to our external service, right? But even more interesting than this, you can see here it actually has backend 2x, frontend 2x, and middleware 2x... and I can't find one. That's okay. Let's see, look back one hour. That's okay. So you can see all the distributed tracing, and to do that, all I had to do was forward those trace headers along. As long as I'm forwarding those trace headers, I get distributed tracing for free out of the box. And I have another talk on OpenCensus, which can automatically forward these for you as well, so there's even less work for you as the developer.
Okay, now let's do one final thing. You can see that when I hit refresh, I'm always going to my prod service. But as a tester, I kind of want to actually use the canary service and see what happens on this new path, right? The really nice thing is we can do in-production testing using Istio, because when we have thousands of services, it's impossible to run them all on our local machine and test that way, right? We want to push our service, even if we think it's working, but we don't want it to get any production traffic. So what we're going to do is deploy the last and final route rule. It's going to make a new route called the middleware canary route. What it's going to look for is a header called x-dev-user, and whenever it sees a value of super-secret, it's going to route to our canary. So all normal traffic will go to our production service, and the super-secret traffic will go to our canary service. Obviously, you'd use something like a real authentication scheme, not just the words super secret, because, you know, security is important. Let's deploy this rule. So now if I refresh, it will still go to our production service. But let's go to Postman and remove that fail header.
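The two route rules for middleware described above amount to a small decision: send a request to canary only when it carries the special header value, and send everything else to prod. The Python sketch below models just that decision. It is not Istio's rule syntax, and the function and parameter names are hypothetical; the header name and value are the ones used in the demo.

```python
from typing import Mapping


def choose_middleware_version(headers: Mapping[str, str],
                              dev_header: str = "x-dev-user",
                              dev_value: str = "super-secret") -> str:
    """Pick the middleware version for a request based on its headers."""
    # HTTP header names are case-insensitive, so normalize before matching.
    normalized = {name.lower(): value for name, value in headers.items()}
    if normalized.get(dev_header) == dev_value:
        return "canary"  # the more specific rule wins for tagged traffic
    return "prod"        # default rule: everything else goes to production


print(choose_middleware_version({}))                              # ordinary user
print(choose_middleware_version({"X-Dev-User": "super-secret"}))  # tester
print(choose_middleware_version({"x-dev-user": "guess"}))         # wrong value
```

Because each service forwards the header to its upstream, the decision can be made at the middleware hop even though the tester only sends the header to the frontend.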
And let's add x-dev-user, super... Boom. Nope. Did I spell it wrong? I don't know. Let's see what happened. Oh, this is disabled? There we go. Thank you. All right: middleware canary. So we're able to route to a specific service in the middle of our stack using headers, and that's because we use header propagation. Even if you had 20,000 services and you wanted to make one in the middle, say service 900, a special thing, by propagating these headers we can use Istio to route to that one, even if it's in the middle of our stack. So not just the frontend: we can test anything in the whole stack using header propagation.
Let's switch back to the slides, please. Okay, so we did all this stuff. Thank y'all so much. If you want to get a deeper dive into Istio, it's basically this talk with a little bit more, focused specifically on Istio. You can check out Istio 101; it will have a YouTube video. All this code is open source on my GitHub. Again, if you go to that website, Istio 101, I have a link to my GitHub repository there, and you can follow me on Twitter. If you have any questions, please follow me out the door. I'm going to go staff the office hours right now, so come talk to me there, or talk to me right outside, or find me in the sandbox. I'm happy to answer any questions. I know it went kind of quickly today. I'll see you out there.