Duration 20:43

Self-service Your Cloud Through Automated Remediation; Without Losing Control (Cloud Next '19)

Thomas Martin
Founder at BigCo. SmallCo.
Google Cloud Next 2019
April 9, 2019, San Francisco, USA

About speakers

Thomas Martin
Founder at BigCo. SmallCo.
Brian Johnson
CEO/Co-Founder at Divvy Corporation

A former CIO at the General Electric Company, Thomas is founder of BigCo. SmallCo. He has led the migration of 9,000 legacy workloads to public and private cloud infrastructure.

Brian is co-founder and CEO of DivvyCloud where he leads corporate strategy and product innovation with the goal of making security, compliance, and governance accessible for those running hybrid, multi-cloud environments (and thus enabling the future of cloud computing). Brian is a passionate technologist who enjoys sharing his insights from the front lines (you could call him a geek and he’d agree). His career has involved a number of research interests, and the opportunity to develop practical applications. Areas of focus include cyber threats, information security, DevOps, hybrid cloud, and software-defined infrastructure, such as building serverless architectures with microservices and container deployments.

About the talk

How can you have automated remediation of your GCP & Kubernetes environment without losing control or the freedom to innovate? Join us to learn how large enterprises achieved the competitive edge of rapid self-serviced public cloud deployment while staying secure and compliant with controlled automated remediation. Find out how they changed their security mindset, defined and implemented guardrails, and deployed the tools necessary for controlled automated remediation.

My name is Thomas Martin. I'm the founder and CEO of BigCo. SmallCo. We are a technology integrator that brings startup and innovative technologies to the big-company enterprise. Previously, I was a CIO and CTO across multiple GE industrial businesses and had the pleasure of leading the effort to migrate over 9,000 workloads to public cloud. And today I'm here with Brian Johnson. My name is Brian Johnson. I'm the CEO and co-founder of DivvyCloud. Prior to taking on this journey, I worked at Electronic Arts for about seven and a half years, working on massively multiplayer online games

deploying a lot of games in data centers around the world, and led the charge on the migration to the cloud in about 2013. Through that migration, one of the things we thought was really interesting is that as we adopted cloud, we started treating it like a technology problem. But at the end of the day that actually wasn't the case; it was a culture problem. What we realized is that the shift to self-service was incredibly important for our ability to compete. All of our competitors were building games, doing it much faster, and getting those

products to market, learning from their customers, and building really cool games. How could we continue to do that? How could we continue to innovate? We couldn't do that if IT continued to get in the way. So we needed to find a way to allow our engineering organizations to access cloud, to build and deploy applications, and to innovate through their cloud infrastructure without security and IT really getting in the way. Through that process we discovered it wasn't just a technology problem, right? It was actually three things that came together in this sort of perfect storm that

caused an issue for security. First, we were moving to more of everything [unclear], so that was a really interesting process. We also had this mechanism where the number of people touching the infrastructure dramatically increased. It wasn't just the IT staff anymore; it was thousands of engineers all over the globe touching infrastructure and deploying applications. That led to the third thing. We were no longer server huggers; we were basically having change every day through a CI/CD

process where a code change or a bug fix would tear everything down and rebuild it again. That led to a dramatic increase in the number of changes occurring in the infrastructure at any given moment. So these three things combined, the number of resources you had to manage, the number of people touching those resources, and how often those resources were changing, led to this incredibly difficult problem. How do we deal with this scale? How could we possibly, as an organization, understand everything that was changing and react to it in any reasonable period of time based on traditional IT and security processes, which

were slow. So that is the problem we see out there. This is not a technology issue; it's a company transformation issue. How do you get ahead of this? And how do you deal with scale? So, with what Brian just talked about, what does this really begin to mean from a practical aspect? It looks fairly simple, right? This is just a simple three-tier architecture. The opportunities for misconfiguration actually seem quite small, right? You've just got a load balancer and a couple of compute instances spanning a couple of availability zones. We've got Cloud Storage.

And Cloud SQL. The point is, if you're just managing this at a small scale with a single team, it's not all that hard, right? But when you really start looking at it, there are at least 20-plus opportunities for misconfiguration just in this simple three-tier architecture. Think about that when you begin to migrate, say, five, seven, ten thousand applications across the enterprise. Trying to manage this at any kind of scale just becomes unwieldy. What I found in my past experience is that it's really somewhere between about a hundred and two hundred applications that the

whole structure starts to fall down. You really have to begin to think about it: not only is that CI/CD process so important, but it's really about all the configurations, not only in real time upon deployment but ongoing, going forward. So what does this lead to? Well, in our case it led to a couple of different things. It led to a loss of control, right? We were letting engineers go in and self-deploy. That's a great thing; you want that innovation. In order to survive as a company, you've got to find a way to compete through innovation

first. Before, we would notice a problem, we would see a mistake, and we'd be able to stop it. That's not necessarily the case anymore. And what's further interesting is [unclear]. It means that you went from having an IT organization that had in place the ability to catch these issues through their sort of control of the gateways, to engineers doing all sorts of things all over the place. And the problem is nobody ever sat down with the engineering organization and explained to them the 20 years of history of security issues that we hit. It's not

like IT learned that stuff the easy way. We got compromised, we had problems, we had issues, and so we learned and built processes, and those processes really slowed us down. We recognized, as we moved toward a more cloud-native approach, that, well, we'll just do alerting; we'll just basically get alerts every time there's something we need to pay attention to. That really quickly got out of control. It just became whack-a-mole. There was just no way to keep up with that. I have alert fatigue on here; this is what we're talking about. You're getting those Slack messages and emails, and how do you know when to

pay attention, and what are the important areas? Because the reality is, of the 20,000 changes you're now dealing with, any one of those might be really important, and you may have a hard time identifying which of those you need to pay attention to. So this is really a signal-and-noise problem. With all these things going on, how do you reduce all of the noise so you can focus on the signal? At the end of the day, you have to leverage automation to do that. There's just no way that you can ever use a traditional process, use a runbook to correct problems, contact the

person, and talk to them about making the change. By the time that's occurred, the application has been torn down and redeployed three times, right? So you need to be able to get rid of the noise by leveraging automation, so that your IT staff, your security staff, your SecOps, your CloudOps have the ability to focus on that 10% they need to be dealing with [unclear]. So with that, you really start to look at it and say that the traditional IT perimeter and processes that we've always used and relied upon are ineffective. You just can't

handle those kinds of changes at scale. They're still important. I'm not minimizing perimeter control, and we can talk a little bit more about how that begins to fold in. But it's really so important, particularly, as Brian talked about, to be able to filter out the noise. And if you step back and think about it, for those of you who are working in those large enterprise firms, think about the IT procurement process. At least for me, our development teams literally knew in their heads to put probably an extra 60 to 90 days in the

schedule they committed to, because they figured that by the time things get through the procurement backlog, the servers make it to the data center, and they get them racked, put on the operating system, and get them networked, we're looking at somewhere between 90 and 120 days. I've seen it take as long as 180 days to get procured servers into the data center. So with those kinds of processes, when you stand up something to try to detect an issue and resolve it, that resource may have already been built and gone. And I know you certainly experienced that at EA. Yeah,

I think the other thing we saw was not just the time it took to provision. I mean, that certainly took a lot of time [unclear]. But when you are going through the process of working with engineering and trying to talk about the problems they're going to face when they start to adopt cloud, and you're going through that transformation, sometimes people don't understand the scale of the attack surface. And what I mean is the

offensive nature of what's going on out there. So there's an exercise we do with our engineers when they come on board. We have them deploy a server into a secure environment where the ports are wide open to the world and completely accessible, and we just set something like "root" and "password" as the login. How long does it take for that box to get popped? Sometimes when you go through the exercise, it opens your eyes to the amount of things that are just out there scanning and looking and trying to find a way in. Ten or twelve years ago, there was an increase in the number of sophisticated exploits being

developed. That's actually started to tail off a little bit, probably because there are so many more people out there opening up S3 buckets or leaving databases open to the world. You don't have to spend time on exploits when you can just scan and find a way in. So part of this equation is not only understanding, as security professionals, what scale looks like internally, but also training the engineering organization about what's important in security, how they need to deploy, and how they need to think about people trying to get in. Because if

you can help teach them as they go through this process, everyone is going to get better at it and make it faster. Think about your own SLAs, right? I mean, what would be your SLA from a typical event in a data center to when you're going to respond to it? How much data could be lost if you had, you know, that Cloud Storage open to the world? Those are the things to think about. How do you begin to really think about it from a remediation standpoint? So, first off, it needs to be near real-time, right? So

as you go around, it's really starting at a harvesting point: utilizing all the access points, the APIs, across all those resources and harvesting the data back in real time, not only upon creation but also upon change. That's the drift. Also think about this: things might have been great, as you just pointed out, at the CI/CD change, but what happened after that point with that engineer? I really don't believe people intentionally make a lot of configuration mistakes. Some do, but it's that middle-of-the-night, 2 a.m. moment when something's wrong: I'll change that back as

soon as it's resolved. And then it gets resolved, and it doesn't get flipped back. So first you've got to harvest that data back in. Then you want to unify it so that it is consistent across all of your individual accounts and all of your VPCs; all those resources are then normalized into a single data plane. Then you want to drive analysis against it. As you think about establishing those compliance and security policies, what does it mean for our organization to be compliant? That's the analysis that gets done in real time against those resources. And then there's being able to take action: what do I want

to have happen when this occurs? It's that if-this-then-that scenario: if port 22 is open to the world, what do I want to do? Who do I want to wake up? What immediate action do I want to take, not only to protect the company but also from a forensic perspective, as well as to learn, right? Was it the team that inadvertently did it to resolve an issue, or were we actually breached? So all that data is captured and dumped off for analytics. [unclear] Google, this is absolutely the right way to go. I'm super excited about doing the same thing on top of Kubernetes. Kubernetes is going to be the element

to break down the barriers and commoditize infrastructure. It's going to be really important, from an enterprise organization perspective, to be looking at an abstraction layer so that you can have unified policy, because you're going to have engineers using Azure, you're going to have engineers using GCP, and you're going to have engineers using Amazon. You can't build policies that are going to live just in those worlds, because you're going to forget about them, or they're going to sort of die on the vine or drift in different ways as you go on. And as we know in security, it doesn't matter if you have 95%

coverage; that 5% is the one that's going to get you. So you need to make sure you have a great holistic strategy and holistic policy as you move forward. So how do you do that? One way to think about dealing with this using remediation: in a development environment, your remediation might be slightly different than in your production environment. For example, in a development environment you might want to do some latency testing. Say I work for a big bank, and we are not allowed to have servers outside the United States, but you want to do some latency testing

in that development environment. You can spin something up in APAC for the next two hours; two hours later, the system will automatically come back, clean it up, and make sure everything's okay. But when you move that same application to staging, you may not actually have the ability to do that. You might leverage faster remediation: as soon as something comes online in APAC, it's killed instantly, right? But you still want to let them have the ability to try new services and do more things, right? You don't want to lock them down using preventive controls, because you need them to go in there and try new things. You need them to

innovate, and if you block them at the top layer, they're just going to go around you. They'll create an account by themselves, and that's the worst position you can be in: being compromised is bad, but being compromised and not knowing it is worse, right? So embrace this. Help them innovate, help them learn, go through that process with them, and leverage remediation in real time. It will provide flexibility in how they do that. But then when you get to production, this is where you may want to leverage some preventive controls. The providers today offer different ways to do preventive controls and lock

down certain services from being used. So as you're taking your engineers through this journey, right, you want them at each stage to understand that it's a little bit more stringent, a little bit tighter, a little harder to do what you want to do if it's outside the parameters of what we've approved. By the time you get to production, it just doesn't work, right? But they're not surprised by the time they get there, because the whole way through the journey you've been teaching them. And what's more important is that it's not just about enforcing a policy and then running away, right? It's about engaging them, bringing them

into the conversation to say: What is it you're trying to accomplish? What are you trying to do? Let me help you find a secure way of doing that. Help them innovate and help them along that journey. Yeah, exactly. And I think about it too, as he's mentioned, as a [unclear] of restriction. You're providing guardrails that are much wider in that early stage to generate innovation. By the time you're at stage three, it's least privilege, right? In fact, in many cases it's just going to be machine-only privileges that are enabled in production to be able to run

those services. So I'll talk in the next slide about how you get there, about these different layers, going from fine-grained up to more coarse-grained, right? At the fine-grained layer, this is where you're leveraging real-time remediation to go in and clean up after things, or shut something down, or clean up security groups, whatever it might be, right? Identify a database that has not been connected to in a long time. All those different elements are going in, cleaning up, and fixing things so that you're protecting your environment on a regular basis. Next is inline checks. This is your ability to take things

like Terraform or CloudFormation templates, or anything you use to deploy into your environment and provision, maybe a Helm chart you're deploying into that infrastructure, and have the engineers integrate with a tool that will allow them to check those things as they're doing it. So when they go through the CI/CD process, it says: I'm about to build these 10 resources, this is what they look like, is this okay? Then have the CI/CD process either pass it and say, yeah, you're allowed to do that, or, if it's a development environment, tell you this is the problem you're having, or just straight

fail the build. You want to integrate and bring security into their world, not the other way around. Again, if you try to do it the other way, they're going to find it annoying and go around you. So how do you bring it in? Inline checks are really important. And then finally, as you provision accounts, do it in an automated fashion. When you do that for projects or teams or whatever it might be, you go in and start putting controls around them at provision time. It might be a mixture of remediation and preventive measures. You might enforce some sort of requirement to use a CI/CD pipeline.
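
To make the pipeline check described above a little more concrete, here is a minimal Python sketch of a CI/CD gate. It is an illustration only, not DivvyCloud's or any vendor's implementation: the `PlannedResource` shape, the two example rules, and the `gate` function are hypothetical names invented for this example, and a real pipeline would extract the planned resources from a Terraform, CloudFormation, or Helm artifact rather than build them by hand.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class PlannedResource:
    """One resource the pipeline is about to create (hypothetical shape)."""
    kind: str
    name: str
    settings: Dict = field(default_factory=dict)


# A check returns a problem message, or None when the resource is acceptable.
Check = Callable[[PlannedResource], Optional[str]]


def no_public_bucket(resource: PlannedResource) -> Optional[str]:
    if resource.kind == "storage_bucket" and resource.settings.get("public", False):
        return f"{resource.name}: bucket must not be public"
    return None


def no_open_ssh(resource: PlannedResource) -> Optional[str]:
    if resource.kind != "firewall_rule":
        return None
    ports = resource.settings.get("ports", [])
    sources = resource.settings.get("source_ranges", [])
    if 22 in ports and "0.0.0.0/0" in sources:
        return f"{resource.name}: port 22 must not be open to 0.0.0.0/0"
    return None


CHECKS: List[Check] = [no_public_bucket, no_open_ssh]


def gate(plan: List[PlannedResource]) -> Tuple[bool, List[str]]:
    """Run every check against every planned resource.

    Returns (passed, problems); the caller decides whether problems are
    shown as warnings or used to fail the build.
    """
    problems = []
    for resource in plan:
        for check in CHECKS:
            message = check(resource)
            if message is not None:
                problems.append(message)
    return (len(problems) == 0, problems)
```

In a development pipeline the returned messages could simply be shown to the engineer as warnings, while a stricter stage could use the same result to fail the build.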

Then you tighten things further as you go into production accounts [unclear], with more preventive controls in the combination, right? So those coarse-grained controls are those big mindsets that say these are never, ever to be violated, if you will. Then down to that mid-grain, where, as Brian talked about, you may put a warning there in that dev cycle, but you're not going to shut it down immediately, because you're also facing into that cultural shift, right? So you also want to educate engineering as to why we are going in that

direction. Then down to those fine-grained controls that take care of things not only upon launch but also that drift that can occur day to day. Those, combined with some of the things we talked about in the previous slide around the cycle aspect, give us, for those of you who are leaders in the room, that ability to become more the department of yes, to drive innovation for your company, versus the department of no. I think the high-level key takeaway here is that as everyone goes down this journey, you need to define a strategy for the organization. As

we started out talking about, this is not just a technology problem. This is about how a business transforms. When I was at EA and we were building games, cloud didn't just change how we deployed applications. It literally changed what applications we built. It changed what games we took to market. It changed which games we decided to stop developing, because we could do it faster, right? So it's a huge business transformation. As you go through this process, you decide how security, how IT, how CloudOps is going to address this. It's important to think about it as a holistic strategy, right? So we talked

about those layers; everything from development into production needs to be taken into consideration, along with how you engage your engineering staff and teach them as they go. Absolutely. And I think, as Brian mentioned earlier around filtering out the noise, we can't rely on just the traditional perimeter security control and just providing notification. Okay, it's great to know that there's a theft happening in aisle five, but have we filtered out the noise to know exactly where, to pinpoint what is happening, where it's happening, and how to resolve it, and then be able to take that action to

remediate it at cloud speed? The other thing: we're out there on a regular basis working with large enterprise customers, solving for this strategic problem. And one of the things we've found is that the companies that tend to have the most success and really get moving the quickest establish a CloudOps team [unclear]. That matters because when we talk about security, there's this desire to think about traditional infrastructure security. That is about analyzing network traffic, identifying external threats, and coming up with preventive measures to react to those threats. But the security

challenge we face right now is different from what we've seen before. I used to do exploit development professionally for a while. From the offensive side, you think about things slightly differently; you think about how to get into a locked box. From the security side, when you're defending against that, you're looking at traffic to try to figure out what people are throwing at you, what they know about you that you don't know, and so on and so forth. When you're dealing with the CloudOps side of things and helping engineers to grow, it's much more about understanding what they're doing and what

their needs are, and making sure they don't make mistakes. It's an internal threat. It's very different, because you and they are on the same side; you shouldn't be fighting one another. So you have to find a way to embrace that. What we've found is that by establishing a cloud center of excellence, a CloudOps team that's going to be focused on security from a cloud perspective and what that means for the internal organization, you get a lot more innovation a lot quicker. My experience has also been that having that focused team, that CloudOps team, also helps as an accelerant, not only from

an adoption perspective but also from an educational and cultural perspective across the entire organization, as folks begin to transition out of that data center mindset. In many cases, I mean, most organizations of that size are always going to be hybrid. They're going to have their data center with their large ERP systems and others that will remain on-prem. But to be able to manage that mindset across the board, it helps to have that CloudOps team. Absolutely.
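
As a closing illustration of the harvest, unify, analyze, and act loop described in the talk, the Python sketch below normalizes two provider-shaped records into one model, applies the "port 22 open to the world" rule, and picks an action by environment. Everything here is assumed for the example: the raw record layouts, the `Resource` fields, the environment names, and the action mapping (a grace period in dev, immediate remediation everywhere else) are hypothetical, and the harvesting step is stubbed with in-memory dictionaries instead of real cloud API calls.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List, Set, Tuple

WORLD = "0.0.0.0/0"


class Action(Enum):
    REMEDIATE_AFTER_GRACE = "remediate after grace period"
    REMEDIATE_NOW = "remediate immediately"


@dataclass
class Resource:
    """Provider-neutral view of a firewall rule or security group."""
    provider: str
    account: str
    environment: str
    name: str
    world_open_ports: Set[int]


def unify(provider: str, raw: Dict) -> Resource:
    """Normalize a harvested record (hypothetical layouts) into one model."""
    if provider == "gcp":
        ports = set(raw["ports"]) if WORLD in raw["source_ranges"] else set()
        return Resource("gcp", raw["project"], raw["env"], raw["name"], ports)
    if provider == "aws":
        ports = {p["port"] for p in raw["permissions"] if WORLD in p["cidrs"]}
        return Resource("aws", raw["account_id"], raw["env"], raw["group_name"], ports)
    raise ValueError(f"unknown provider: {provider}")


def analyze(resource: Resource) -> List[str]:
    """Evaluate policy against the unified model; returns findings."""
    findings = []
    if 22 in resource.world_open_ports:
        findings.append("port 22 open to the world")
    return findings


# Assumed mapping: looser in dev, strictest default for anything else.
ACTION_BY_ENV = {"dev": Action.REMEDIATE_AFTER_GRACE}


def act(resource: Resource) -> List[Tuple[str, Action]]:
    """Pair each finding with the action chosen for the environment."""
    action = ACTION_BY_ENV.get(resource.environment, Action.REMEDIATE_NOW)
    return [(finding, action) for finding in analyze(resource)]
```

Defaulting unknown environments to the strictest action is a deliberate choice in this sketch: a resource that was never labeled is treated like production rather than silently given a grace period.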
