Duration 40:01

Your Metrics Suck! 5 SecOps Metrics That Are Better Than MTTR

John Caimano
Global Practice Director at Palo Alto Networks
RSAC 2021
May 20, 2021, Online, USA

About speakers

John Caimano
Global Practice Director at Palo Alto Networks
Kerry Matre
Sr. Director of Product and Services Marketing at FireEye

John Caimano is dedicated to making security operations organizations as efficient and effective as possible. As the Professional Services Global Practice Lead for Security Operations at Palo Alto Networks, he works to increase automation within SOCs and up-level SOC efforts to address more sophisticated threats. With almost two decades in security, Caimano’s in-depth knowledge of security operations is extensive. He used this experience to co-author the Elements of Security Operations book which breaks down the elements of a successful SOC.

In almost two decades working in and around SOCs, Kerry Matre has seen some things. Some good. Some bad. Some very, very ugly. Having started her career in application development, she jumped into ethical hacking and learned the fine art of SQL injection. The result was pure paranoia and an instant desire to burn every piece of code she had previously written. She has seen what works and what doesn't in over 150 SOCs and used those insights to co-author Elements of Security Operations, a product-agnostic look at all of the components that go into making a SOC successful. Matre is currently on a mission to determine the best metrics to use in security operations. Hint: It's not MTTR.

About the talk

John Caimano, Global Practice Lead, Security Operations, Palo Alto Networks
Kerry Matre, Sr. Director, Mandiant Services, FireEye

Good metrics are elusive in the world of Security Operations. Organizations often fall back on reporting fit for network operations that can incentivize bad behavior. In this session, we will explore the purpose of metrics to give the business confidence in the services the SOC provides. Metrics that matter go beyond red/yellow/green charts and can drive change. We will share them with you.

My name is Kerry Matre and I'm here today with John Caimano. We're going to talk about why your metrics suck, and I know that's a bit abrupt. Everybody knows that security is a serious business. John and I have spent about 20 years each in security, and people are always asking about metrics and complaining about metrics. What we've noticed over the last 20 years is that there are good metrics and very, very bad ones. We don't know your environment. We don't know what tools are available to you. But what we'd like to do today is talk about what makes a good metric, give you some examples of good metrics, and keep it kind of informal and share some stories with you.

What I want to start off with is just level setting on what is reporting and what is a metric. You can think of reporting as just activity. Reporting is things like the number of incidents handled, maybe the number of high-severity incidents handled, the number of analysts working in the SOC, or the hours that they worked. Reporting reports on activity, but it doesn't drive change. Metrics, on the other hand, can guide the business. Metrics are things that can tell you what next action needs to be taken: is your business doing well, are your operations set up to protect your business, or is there a change that needs to be made? So reporting is activity, and a metric is something that can actually drive activity in your organization.

The key point that we want to get across is that metrics that matter provide confidence and drive change. This comes from a book that John and I wrote together called Elements of Security Operations. What this means is that metrics need to give the business confidence that you are providing the service that you say you're providing. Or, if you're not, you need to be truthful about that too, so the business knows your starting point. The other thing is that they can't just be numbers flashed on a screen that don't mean anything to anyone. The metrics you provide need to actually drive change in the business.

Now, one of the things I want to talk about is that sometimes people are afraid of metrics. They're afraid of the stories they might uncover. They're afraid that metrics might make somebody look bad. A lot of times metrics are kind of hard to gather. Sometimes they just lead to more questions. What I want to help you understand is: don't be afraid of metrics. They might get in the way of a story being told, but that tells you that the story is not one that needs to be told. The business needs to know the truth. They need to have that confidence. So don't be afraid of metrics; just provide that for the business.

I'd like to add on to that. Absolutely. I have a personal story on this from my personal life, not even inside of security operations. My wife and I have four children, and at one point in our lives, when they were very little, we hired a house cleaning service to help keep up with the cleaning and keep our house neat and clean. Over the time that we had them, they were keeping the house clean. It was superb, it was great. And because we weren't measuring any kind of metrics or looking for any reports on how well they were doing at keeping our house clean, my wife and I started getting a false sense of security that we were doing a really good job, that we were actually keeping the house clean on our own, and that everything was fine. So we wound up letting go of the housekeeping staff, so they no longer came in weekly to do a cleaning. We learned quickly that the service was actually doing the job and was keeping our house clean for us. As soon as we let go of that, our house became a mess again, and we quickly had to rehire.

This is why, in security operations, it's supremely important that we collect the right metrics and reports, because if you don't, you're going to wind up with a mess inside of your network and inside of your operations. For us it was just a couple of extra bucks to get the staff back out here to start cleaning, and maybe an extra cleaning or two before we were back on track. If you mess up in security operations, you're talking about a breach of some kind: ransoming, destruction, or exploitation of data. At that point you're losing confidence in the market, you're losing confidence in your product, and that reputational damage can be irreversible. So it's really important, when we're talking about security operations, that we collect the right reports and the right metrics so that we know exactly what's going on, and we don't wind up cutting back in areas because we had this false sense that we're doing great and don't need to spend as much in this area as we had before.

Yeah. You know, there are two reasons why security organizations are quiet: things are going really, really well or really, really badly. So show those metrics to the business. Again, security organizations are providing a service to the business, and the business needs to understand. So again, metrics that matter provide confidence and drive change. We'll say that about 20 more times before the end of the session here. We even picked on MTTR in the title, so John is going to take us through mean time to resolution.

I could talk about this for days, and I've bored too many people out there with it, but it's just the wrong metric. When you're talking about your analysts or your engineers, it drives them to make quick decisions and rapidly resolve issues. That's the wrong behavior in security operations. When you're trying to keep servers up or the network passing traffic, or anything that means keeping the blinking lights blinking, it's a great metric because it drives uptime and it drives that response speed. What MTTR doesn't tell you is anything about security operations. Resolving tickets takes time, especially if you want high-fidelity true positives and true false positives. When you're going through that, it takes even more time. You want to have confidence in your decision and confidence in how you are handling those events. You need to allow your analysts to build the story and build it properly: define the who, where, why, what, when, and how, and really put together that story. If anything is missing from that, you wind up with lower confidence in your true positive and false positive results, and it can lead to a compromise clinging onto a foothold and a foothold becoming a breach in short order.

Analysts need to feel supported and have the proper time to investigate a ticket without a ticking time clock. When MTTR is allowed in security operations, it really just drives the wrong behavior. It's one of those things that we really need to understand and remove from SOCs, because we don't want people cherry-picking tickets: "I've done this before, I've seen this before, let me grab this one because I can resolve it quickly and make sure I stay on top of that leaderboard with a very short MTTR." It doesn't lead to a robust amount of knowledge in your SOC, because analysts are going to hover around the things that they know, that they're confident they know, and that they can answer really quickly.
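To make the cherry-picking concern concrete, here is a minimal sketch using made-up ticket data. The analyst names, alert types, and minutes are all hypothetical and are not from the talk. It computes mean time to resolution per analyst alongside the number of distinct alert types each analyst handled. An MTTR leaderboard on its own would rank the analyst who only grabs familiar tickets first.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical resolved tickets: (analyst, alert_type, minutes_to_resolve).
TICKETS = [
    ("analyst_a", "phishing", 10),
    ("analyst_a", "phishing", 12),
    ("analyst_a", "phishing", 8),
    ("analyst_b", "phishing", 15),
    ("analyst_b", "lateral_movement", 95),
    ("analyst_b", "data_exfiltration", 130),
]


def mttr_and_breadth(tickets):
    """Return {analyst: (mean minutes to resolve, distinct alert types handled)}."""
    minutes = defaultdict(list)
    alert_types = defaultdict(set)
    for analyst, alert_type, mins in tickets:
        minutes[analyst].append(mins)
        alert_types[analyst].add(alert_type)
    return {a: (mean(minutes[a]), len(alert_types[a])) for a in minutes}


summary = mttr_and_breadth(TICKETS)

# Sort by MTTR, shortest first, the way a leaderboard would.
for analyst, (mttr, breadth) in sorted(summary.items(), key=lambda kv: kv[1][0]):
    print(f"{analyst}: MTTR {mttr:.1f} min across {breadth} alert type(s)")
```

In this invented data set the first analyst tops the leaderboard while only ever touching one alert type, which is the behavior the speakers warn about.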

Yeah, and we're going to get pushback, because every security product has MTTR in its dashboards and in its reports. We're not saying MTTR is useless. It can be used for reporting. It can be used to assess whether your automation is working. It's just not a good metric to measure your analysts by, because it drives the wrong behavior. It pushes them to skip through events. When they do that, they're not doing full analysis, and they're not bringing what they've learned back into the controls. So MTTR is not useless. It's just not a good metric when related to analysts.

So, what is good? We like to break our metrics up into two things: a business wants to have configuration confidence, and it wants to have operational confidence. We think that most if not all metrics fit into these two buckets. Configuration confidence is knowing that your tools are in place, that they're configured to best practice, and that they're going to be able to prevent an attack, or prevent a breach if an attack is ongoing, and either mitigate it or provide enough intelligence for an analyst to be able to mitigate any incident. So that is about having the right tools and the right automation in place. Operational confidence is: do I have the right people with the right processes to be able to use those tools and capabilities in the event of a breach, or when they need to do analysis and investigations? Are they trained properly? If not, then we need to bring in a third party for help, or we need to hire more. These are the types of metrics that can actually drive change. They can tell you what to do next within your business.

Configuration confidence is what we focus on a lot. We focus on tools: how are the tools set up, how many tools do we have, how many rules do we have written? That sort of mindset is baked into security. What we are trying to do is open it up and say no, it's also about how you use those tools. Don't forget about people, analysts, and process, because that's how you're going to complete the incident response process. So for number one, John wants to talk to us about analyst activity.

Absolutely. One of the first things we talk about when we talk about analyst activity is events per analyst per hour, EPAH. It's a great metric for understanding how overwhelmed your resources are. 10 incidents per hour means your analysts are well staffed, while 100 per hour means they're overwhelmed and they're going to miss or ignore alerts. Monitoring the peaks and valleys of EPAH can really assist in proper staffing models. For example, knowing that EPAH doubles between the hours of 8 and 10 a.m., when people are getting online and starting to open their email, can allow your staffing model to overlap shifts during that time to ensure proper coverage.

The second thing I like to look at when we're looking at analyst activity is handling time per alert, per stage, per analyst. Monitoring this will reveal your team's strengths and weaknesses: where process and visibility are well established versus where they need improvement. When comparing individual analysts, this stat will help indicate where an analyst needs additional attention, specifically by the stage of the incident and the alert type. SOC managers and directors should look for where analysts have opposing struggles and have them work together in a peer team, or create a pod with a senior member who is well versed in the other types and a junior or two juniors to form an analyst team. This symbiotic relationship, when you start generating these types of teams where there's a weakness and a strength, helps them learn from each other. It really helps the operational confidence that you were talking about earlier, because it gives them a good sense of how they can ensure that the processes they're completing are leading to those true positive and true false positive results.

Incidents by severity: this is where the time is being spent. Is the SOC receiving a significant amount of critical and high alerts?
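A minimal sketch of the analyst-activity measures discussed here: events per analyst per hour (EPAH), handling time per stage per analyst, and the share of alerts at each severity. All function names, sample data, and the busy-hour threshold are hypothetical. Real inputs would come from whatever ticketing or SIEM export is available.

```python
from collections import Counter, defaultdict
from statistics import mean


def epah_by_hour(event_hours, analysts_on_shift):
    """Events per analyst per hour, keyed by hour of day.

    event_hours: iterable with the arrival hour (0-23) of each event.
    analysts_on_shift: {hour: number of analysts working that hour}.
    """
    counts = Counter(event_hours)
    return {h: counts[h] / analysts_on_shift[h] for h in sorted(counts)}


def handling_time(records):
    """Mean handling minutes per (analyst, stage) from (analyst, stage, minutes)."""
    buckets = defaultdict(list)
    for analyst, stage, minutes in records:
        buckets[(analyst, stage)].append(minutes)
    return {key: mean(vals) for key, vals in buckets.items()}


def severity_share(severities):
    """Fraction of alerts at each severity label."""
    counts = Counter(severities)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {sev: counts[sev] / total for sev in counts}


# Hypothetical morning: load doubles once people start opening email.
events = [7] * 40 + [8] * 80 + [9] * 80
staff = {7: 4, 8: 4, 9: 4}
load = epah_by_hour(events, staff)

# The threshold is an assumption; each SOC would pick its own.
BUSY_THRESHOLD = 15
busy_hours = [h for h, v in load.items() if v > BUSY_THRESHOLD]
print("EPAH by hour:", load)
print("Consider overlapping shifts at hours:", busy_hours)

records = [
    ("analyst_a", "identification", 5), ("analyst_a", "investigation", 40),
    ("analyst_b", "identification", 20), ("analyst_b", "investigation", 15),
]
stage_times = handling_time(records)
print("Handling time per analyst and stage:", stage_times)

shares = severity_share(["critical"] * 5 + ["high"] * 15 + ["low"] * 80)
print("Severity share:", shares)
```

In this made-up data the two analysts have opposing weaknesses by stage, which is the pairing opportunity described above.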

This can indicate configuration issues; it goes back to configuration confidence. Is the SOC overwhelmed by the sheer amount of low and informational alerts? That indicates an operational issue. Do we need to reduce the number of feeds to get the alerts down to a manageable amount, or can these low and informational alerts be moved into a threat hunting program? Hunting should be directly involved with the SOC, and the results should come in very short sprints, feeding that information back to the security operations team so that we're putting in the right things, putting in the right controls, and making sure we're getting the right alerts to the SOC with the right severity.

And then severity updates. How often are we misclassifying our alerts based on that initial enrichment? Are we going too high? Are we going too low? This really indicates a failure in the identification process: we're just not getting enough enrichment, or we're enriching the wrong artifacts. All of these things will really help improve your analyst activity and give more configuration and operational confidence back to the organization.

Yeah, just a quick thing on the events per analyst per hour that John talked about: if it's too high, your analysts are going to ignore alerts. There was just a report put out showing that analysts ignore alerts when they're overwhelmed. So it's a good metric to understand, and it can drive the business. It's one of those things that I've seen in way too many SOCs, and it drives me crazy when I find out that they're ignoring alerts. If there's an alert, it's something worth looking at.

Number two, our second bucket of things to think about, is hygiene. Hygiene is efficacy. Some things to look at: do you have unused rules, and why are they unused? Is it because a different control catches the traffic before it hits those rules? Are the rules written poorly? Do they need to be updated? Are they 10 years old, passed along from device to device as you upgraded your controls and never revisited? Look into your number of unused rules, because that can drive a decision: hey, we need to do a refresh, we need to look at this, because it's going to lower our administrative overhead.

The other thing is duplicate rules. If you see duplicate rules, it probably means that there were many hands in setting up the controls. Maybe they used different naming conventions and ended up with basically the same rule. Again, this increases your administration overhead, so know the number. When's the best time to fix those things? When you're doing a refresh. Keep in mind that when you have these hygiene issues, you should be taking care of them all of the time. Most organizations don't, so take the time when you're refreshing technology or upgrading to a new one.

Next, the top and bottom 10 alerts. What are the top 10, and do they make sense? What are the bottom 10, and do they make sense? If you have alerts that are not firing, why is that? Look at those, but also look at the bottom 10. If you have something that's only firing at a certain time, look into it. Those might be someone trying to sneak in. The top 10 can actually indicate false positives, areas to really look into.

And finally, look at the number of tunes per technology. If you're having to tune your endpoint solution constantly, what's that going to do? Well, that's an increase in risk, and beyond that it could degrade performance, two things that you don't want. So look at the number of tunes per technology, and then you can say: hey, this technology is taking way too much tuning activity, it's a higher cost to us administratively, we need to do something about that. And that's what I have for the hygiene world.

Kerry, I agree with you 100% on that. It's one of those things that I've seen way too often throughout my career, especially when I was at an MSSP, where you wind up doing a plug and play and just replace. You're not taking advantage of that new technology. You're going from an access-list-based firewall to a UTM firewall, and you're just migrating those access lists onto the UTM and not really turning on the UTM features. Or you're upgrading from UTM to next-generation and, again, you're still using the same access-list-based rules, now on a next-generation firewall. This is one of those areas that really needs looking at, and you need to start understanding: hey, we got this technology for a reason, let's get the most out of it. This also happens if you do your incident response process right and you have a continuous improvement phase, because when you get to continuous improvement you're going to look and see what capabilities could have prevented this. That should lead to more hygiene of your rules, making sure that you're fine-tuning them and that the policies are blocking the traffic you expect them to block.

Number three is realized value. The business is making investments in security technologies and security capabilities, and the business needs to have confidence that it is getting value out of them. One thing to look at, to kind of prove that to the business, is the number of features used. Like John just spoke about, a lot of times you upgrade from a five-year-old technology to the latest and greatest, and all you do is convert your existing rules and you don't turn on URL filtering. So you are not getting the value that the business thought it was going to get when it invested in this technology. We have seen this when you do audits of different features. I have seen enterprises that are using about 5% of the features on certain devices that we looked at. 5% for an enterprise company: the amount of unrealized value there is immense. That may not be fun to tell the business, but the business needs to know it so that it can provide more staffing, or more of whatever you need to be able to get that value.

The second thing is percentage of visible traffic. I wrote this down: there were recent numbers I found from FortiGuard Labs where 85% of traffic is now encrypted. The business needs to know this. If you are not doing any sort of SSL inspection, you're only looking at 15% of the traffic. Really, the business should not have confidence that you're providing a good security service to them. So that's something you can provide to the business.

The third piece in there is the use of threat intelligence feeds. As an industry, we use free threat intelligence, and we've spent a lot of money on it. What the business wants to know is: is all of that money we're spending turning into use cases? Are you learning about the actors and figuring out what you need to go look for in your environment to make sure you're protected against them? It's not just that you have it; it's what did you do with it? And then, the backlog of undeployed technology. I come from a bunch of hardware companies.
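As a rough sketch of how the realized-value numbers mentioned here could be computed: percentage of features used per device, percentage of visible traffic, and the share of purchased technology actually deployed. The device names, counts, and traffic volumes are hypothetical, chosen only to mirror the 5% and 15% figures quoted above.

```python
def percent(part, whole):
    """Return part / whole as a percentage, or 0.0 when whole is zero."""
    return 100.0 * part / whole if whole else 0.0


# Hypothetical audit: features available on each device vs. features enabled.
devices = {
    "firewall-01": {"available": 40, "enabled": 2},
    "firewall-02": {"available": 40, "enabled": 10},
}
features_used = {
    name: percent(d["enabled"], d["available"]) for name, d in devices.items()
}

# Hypothetical traffic volumes (GB). With no SSL inspection, none of the
# encrypted traffic is visible to the SOC.
total_gb = 1000
encrypted_gb = 850
decrypted_and_inspected_gb = 0
visible_traffic = percent(
    total_gb - encrypted_gb + decrypted_and_inspected_gb, total_gb
)

# Hypothetical backlog: purchased units that have not been deployed yet.
purchased_units = 500
deployed_units = 350
deployed_share = percent(deployed_units, purchased_units)
backlog_units = purchased_units - deployed_units

print("Features used (%):", features_used)
print("Visible traffic (%):", visible_traffic)
print("Deployed (%):", deployed_share, "backlog units:", backlog_units)
```

Each of these is a single number that maps to a decision the business can make, such as funding SSL inspection or staffing a deployment project.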

There are a lot of enterprises that will make large investments in endpoint technology. The business has spent money and expects value out of it, but because of lack of staff, lack of time, or lack of expertise, it's not getting deployed. If you can report on this, it can drive a decision by the business. It can drive the business to say: okay, here's some more funding so that you can go get what you need to get this done. I'll leave it at that for realized value.

One of the important things that you have listed here, Kerry, is the percent of visible traffic, and the two of us can't stress that enough. One of the cardinal rules inside of a SOC is: if you can't see it, it didn't happen. If there's no log, there's no way to prove it. There's nothing you can do about it. There's no way that you can prevent it. There's no way that you can work on it. So it's about really knowing what that number is and sharing it up the pipeline, so you can show why you need to adopt more technology to get into that invisible area, or more staffing to handle it, or more automation to sift through what you are getting because you're drowning. Those are things that need to be reported on. They need to have a metric, so that you really can get value out of your security operations team, because if you're only looking at 10% or 20% of your traffic, you're not getting a lot of value out of that team.

All right, I'm going to jump to the next one. Actually, I'll let you do this one: process deviation.

Process deviation. This is another one of my pet peeves, and there are a couple of ways that we can measure it. The first one is SOC process and procedure deviation: monitoring for consistency in alert handling. Consistency in alert handling increases the quality of, and the confidence in, your outcomes. A complete SOC process should include four phases. I spoke about one of them before, but you should have the identification phase, a full investigation phase, then a mitigation phase, and then continuous improvement. When steps in any one of these phases are missed, it can lead to low-fidelity results. That's when you get your false false positives or false true positives and a false sense of confidence, and what should be the bane of all SOC teams: repeat alerts due to missed prevention or missed automation opportunities.

The second thing you should be looking at, speaking of automation opportunities, is automation failure. You need to be reporting when the automation is not working. This could be missed alerts. This could be alerts that are missing enrichment or context, or where the contextual data is insufficient for the remediation process. All of these failures cost the SOC valuable time when analysts manually complete a process that artificial intelligence can and should be taking care of. We should be looking at where we're missing, because you have automation to save time, and I've run into many cases where automation is not doing that.

Number three: training for consistency. Sometimes processes fail because of a lack of analyst training. On-the-job training for new analysts doesn't work. You need a training methodology and well-documented processes; this allows for consistency in the SOC that you can reproduce. You can track this by capturing metrics on how many analysts have completed each of their training modules. From there, you can also track how long it takes to bring an analyst up to a level of self-sufficiency. If the length of time is six months, you can create a plan to shorten that, whether it be additional training, documentation, or simplifying the procedures overall.

Then certifications and continuous learning. Do certifications matter? Yes, they matter, especially to an MSP when you're showcasing how many certified resources you have. But beyond that, they're important for establishing a common language throughout the security operations team. When everybody is speaking the same language, it helps them communicate: when I say alert, when I say event, when I say compromise, when I say breach, we all know exactly what I'm talking about. When you have different levels of understanding and different levels of knowledge, it winds up being a more difficult process, because you're talking two different languages even when you're using the same words.

I have a story here too on training and consistency. At one organization I worked with, the security analysts all went in on a lottery ticket pool: put in your money, buy a bunch of lotto tickets, and if you win, you all win. Well, guess what? They all won. So they all left. Now the organization was left in the lurch, because it did not have any formal training in place. It did not have any documented processes and procedures. It was a mess. So is that going to happen to anybody else? Maybe. Would your organization know?

So yeah. Another key area is your analysts, and the first one is the distribution of alerts per analyst. Well-processed SOCs should not have analysts cherry-picking the alerts that they know can be completed in short order. We spoke about this a little bit earlier. When SOCs don't monitor this behavior, it leads to resources that aren't well rounded, and in order to function properly the SOC now requires heroes. We all love our superheroes in TV, movies, and comics, but when we have heroes in our SOCs, it's not a good thing. It puts too much reliance on a single resource or a few resources, and it creates a gaping hole in the team when they win the lottery, as Kerry just mentioned. So we want to make sure that we distribute, that we have a good round-robin of the alerts, so everybody is getting these.

Then think about the source of alerts. We want to look at where all of our alerts are coming from: understand what tools are generating the majority of our alerts, how often the alerts are malicious versus non-malicious, and which tools the analysts are spending the majority of their time using. All of this information helps you identify the tools that need to be in use, and when you have an underused utility or tool, it will help you understand why it's not being used and whether you need additional features or a different tool altogether.

Time spent according to the 30-30-30 model: this is key. What I love seeing is when analysts are doing 30% IR, 30% hunting, and 30% project work. When analysts are part of all three types of activity, it keeps them engaged, reduces the likelihood of console burnout, and encourages growth and learning opportunities. When they have the option to pivot and shift their mindset away from active incident response, it really helps open them up to finding new use cases and creating better information and enrichment for the SOC, and it makes sure that the alerts that do come in can be handled at a much quicker pace.

And finally, team structure. Sometimes the SOC is only one person or a couple of part-time resources. This is an important metric to understand, because part-time resources may also handle network operations responsibilities, and their priorities may not be what the business expects. This massively affects the 30-30-30 model I just spoke about. Also consider how the team is structured. Traditionally, we see tier 1 analysts who handle triage and hand true incidents to tier 2 for additional research. But the more we see the newbies or the junior members teaming up with senior analysts in teams, the more we see a collaborative environment and a lot more growth, because you have better knowledge sharing and better collaboration and cohesion on that security operations team. The other thing we want to look at metric-wise is what kind of backup staff there is for when people take vacations, are sick, or are in COVID quarantine. This can help the business decide whether the team size is acceptable or the structure needs to be rethought, so it can feel confident that the team is ready to go or know that it needs to make a change.

So let's wrap it all up. We gave you some ideas; there are a lot more than that. What I want you to do is go back to your organization and take the idea of what is reporting, what is just activity, and what is a metric, something that can actually drive change in the business. Separate those out and figure out what you have. Then explain the "so what". If you're going to have a metric that matters, it's going to drive change, so you should be able to define what change that metric is going to drive. Will it be more analysts? Will it be saying our automation is working and it's great?

Third, audit the capability utilization of your tools. I should have written that in a way that I could say more easily. Figure out how much you're using. You might be shocked to find that you're only using 5% of your capability. You might be shocked to know that a certain part of the business is not doing SSL inspection, so they're not seeing the traffic. Figure that out: what are you utilizing, and what's the realized value to the business? There are vendors that can do this for you within their own tools, and third-party services you can get to come in and look at your utilization. That's what the business wants to know: are you using the investments that I gave you?

When it comes to the business, make sure the business understands what service you provide them. Identify your mission statement and your service deliverables, and define the metrics around that. I wrote here "ask executives what measures are most valuable to them", but if you've gone through steps one, two, three, and four, it's not an ask, it's an educated tell. You should be able to take those measures, explain why the business needs to know them, and explain what they provide to the organization, so the executive can take that in: this actually gives me confidence about our protection, or gives me direction on where changes need to be made. With that, thank you, and have an excellent conference.
