DevOps at Microsoft - "Enterprise transformation (and you can too)" ‐ Donovan Brown

About speaker

Donovan Brown
Principal Cloud Advocate Manager of the Methods and Practices Organization at Microsoft

Donovan Brown is a Principal Cloud Advocate Manager of the Methods and Practices Organization in DevRel at Microsoft. Before joining Microsoft, Donovan spent seven years as a Process Consultant and a Certified Scrum Master. Cloud Methods and Practices are his thing. Donovan has traveled the globe helping companies in the U.S., Canada, India, Germany, and the UK develop solutions using agile practices, Visual Studio, and Team Foundation Server in industries as broad as Communications, Health Care, Energy, and Financial Services. What else keeps the wheels spinning on The Man in The Black Shirt? Donovan's also an avid programmer, often finding ways to integrate software into his other hobbies and activities, one of which is Professional Air Hockey where he was ranked as high as 11 in the world.


About the talk

“That would never work here.” You’ve likely heard this sentiment (or maybe you’ve even said it yourself). Good news: change is possible. Donovan Brown explains how Microsoft’s Azure DevOps team (formerly VSTS) went from a three-year waterfall delivery cycle to three-week iterations, and open-sourced the Azure DevOps task library and the Git Virtual File System.


My name is Donovan Brown, and I have DevOps in my title, which is kind of weird, right? Because 10 years ago DevOps wasn't even a word, and now we're putting it in our titles. So what does it actually mean? When I first got DevOps put into my title, this is something that I labored over, because my manager asked me, "So, Donovan, what is DevOps?" And I thought, why are you asking me? I'm sitting here at Microsoft; there has to be someone else you could ask what DevOps is. It took me 30 days to actually go back in and decide what I was going to say. Now, I'm very, very loud. I project very well with or without

a microphone, but I labored on that for 30 days. And I said, okay, what does DevOps really mean? How many of you remember a little company called Compaq Computers? Does anyone remember Compaq? Man, that's an old company. I've been in code for over 20 years, and I started back at Compaq Computers. I remember back then, whenever I wanted to deploy software, we didn't do any of this stuff. So why are we doing it now? That was the question I answered first: why is it important? And then I set off to actually define it. What I came up with when my manager asked is what we have now

published in books, on our websites, everywhere. So obviously I think I did a decent job. But what I thought was interesting is that in my head I had this vision of what DevOps really is and what it can do for your company. Recently, on Twitter, I tweeted a video that I think really encapsulates it and shows you what your company can look like if you implement DevOps. I want to share this video with you right now, and what it's going to do is help set the stage for what we're going to talk about: what your company can look like

before and after you implement DevOps. When I saw that video, I was like, that's it. That's what everyone should be striving to do. So I tweeted it, because Twitter is the number one way to get ahold of me. If you want to talk to Donovan Brown or anyone on my team, you just tweet at us and we will respond to you. So I tweeted this video and I said, "before and after." The majority of people got it, but two people said there was no value in the second pit stop at all. And I just

could not understand this. I'm sitting up in my office reading these comments, and clearly I'm having some type of physical reaction, because my wife, watching this, asks, "What's wrong with you?" And I thought, they just don't understand. How do they not see that this is an amazing analogy? This is a perfect analogy for what DevOps is today. So we can talk about these two gentlemen here. I've hidden their faces because I didn't want to embarrass them, but I didn't delete the tweet, so you can still go find out who

these two people are if you really want to. Funny story: I was in England a year ago telling this story, and when I got to the second person, one of the people in the audience kind of jumped in his seat. Weird. I just ignored him and kept talking. That's that guy right there. I actually met him in person. It was pretty cool. But his tweet says there's a huge increase in the number of people in the second video compared with the first. Did it really solve any problems? They just threw people at it. Well, I go back to my time at Compaq, when I was writing software there. I, Donovan Brown, as the engineer, when I was done, I would

walk into a server room with ProLiant servers everywhere. These were production servers, and I, Donovan Brown, would pull out a keyboard, type in my credentials, and it would log me in and say, "Welcome, Donovan Brown." I could do whatever I wanted to that server: delete registry settings, delete files, copy files, whatever I wanted. I was the poor guy swinging a hammer at a ProLiant server until I got it to do what I wanted it to do. Back then you had to be quick, though. What I would do is run out of the room as fast as I could, because if I got out before the IT pro saw me, it was his

responsibility. They had to keep the lights on; I just had to get my software to work. It was a beautiful job if you were a developer: get in, beat it with a hammer, and get out. Today that's impossible. Chances are you do not have a password to any of the production servers. You probably don't even know where the production servers are. And you have to deal with the auditors, you have to deal with security, with dev and ops and the program managers. There are so many people involved in deploying software today. So the analogy literally works on that level: there are more people involved today than there were a decade

ago, or 20 years ago when I was deploying software. So yes, there are more people involved, but I didn't see these people as people. To me, they were just interchangeable microservices that we string together to make sure we can deploy software faster than we ever have before: continuous integration, continuous delivery, infrastructure as code. That's what I was actually seeing. Think about it: these two gentlemen here, that's security. They literally just hold the car to make sure it doesn't rock off the jack. You need security in your DevOps pipeline. Another thing that's crucially

important is monitoring everything that you do. If you do not monitor it, how do you know if you've improved it or not? You need to measure it now, make a change, and measure it again. These two gentlemen never move. They are watching a three-second pit stop to see how it can be a two-and-a-half-second pit stop next time. You need to monitor everything that you do in your company and make sure that your investments make sense. How many of you are on a DevOps transformation right now? Anyone out there trying to do this stuff? A couple of you, right? The first time you did it, it

didn't go so well. It never does, so don't feel bad. And the first thing you think is, man, I need to figure out a way to roll this back if this doesn't work, right? That's like step number two: figure out how to roll back. I think that happened here at one point. See what I'm saying? This is exactly what happens when you're trying to build a pipeline for the very first time. It doesn't work, and your knee-jerk reaction is, we've got to figure out a way to go back in there and protect ourselves from that. Microservices: I only take the tire off; I only put the

tire on. That's exactly what you're trying to do every day. And with all these extra people they drastically reduced the amount of time, which is what we want to do as well. Let's talk about not refilling the car. I was always taught you fix what hurts most first. If I look at this pit stop, refilling the car isn't even what hurts the most. If we refilled the car in the second video at the same rate we did in this video, we'd still be done in half the time, right? Because the place to focus, in my head, is this: they realized that they had to figure out how to change these

tires faster, and they got the tire change down to 10 seconds. Then refilling takes another 20 seconds, and that's the bottleneck now. You don't go and redo everything; you focus on one thing at a time. And then they realized, man, do we want to trick physics and figure out ways to get more gas in the car faster, or do we want to shift left and use technology and innovation that allow the car to go further on less? Make it more aerodynamic, make the engine more efficient, so that we can completely eliminate the need to refill the car. So anyone who said you didn't add value because you didn't refill the car is

completely missing the point. The point is we did not have to refill the car because of so many things that we did earlier. It's the same thing when you start doing unit testing: you go from swinging a hammer to being able to do this, to take the value from the fingertips of your developers and put it into the hands of your users as quickly as you possibly can. Delivering value is why we're doing all of this. So this video I love to share, because it kind of puts you into my mind space. This is what I'm thinking about every single day: how do I go from the first video to the second

video with every customer we have at Microsoft. So when I wrote this down, it looked like this: DevOps is the union of people, process, and products to enable continuous delivery of value to our end users. The most important word in that definition is value. We're not talking about shipping software; we are talking about shipping value. Why do I focus on value? Because it aligns your entire company on a single goal. Remember the story I told you about when I was at Compaq? The IT pro's job was to keep the lights on the servers on. My job as a developer was to change that server as

much as I possibly could. What's the easiest way to keep a stable server stable? Don't change it, right? Literally do not change it, and the lights will stay green. You let Donovan in there and, trust me, they won't be when I'm done, right? So we are literally incentivized to work against each other. They get the big bonus, the big check, the rewards when the lights on the server stay on. We're making it very difficult for each other, for both of us. If we were both rewarded if and only if we delivered value to our end users, we would naturally work together, which is why I made sure the word software

did not appear in the definition. Because when I say software, we immediately shift over to the developers and not operations, not quality, not program management, not security. We're only worried about changing the zeros and ones, and that's a flaw, because that's not always going to deliver value. Black Friday and Cyber Monday are two of the largest shopping days in the world. Does an e-commerce site have to actually change its software to deliver value? No. They can scale up or scale out their infrastructure and support more simultaneous users, keep the response time down, and sell

more retail. So you do not have to change the software to deliver value. Focus on value, get your teams aligned, and they will naturally work together instead of working against each other, which we seem to do a lot today. The hardest part of this is the people. People love doing things that they are already comfortable doing. They don't want to change. I honestly do not want to learn another JavaScript framework. I just don't. I learned Angular, then I had to learn Vue, and now there's React, and tomorrow, matter of fact, there's probably another one popping up right now as I'm saying this, being written,

that we're going to have to go off and learn, and I'm tired of it, right? I just want to keep doing what I've always been doing. The hard part is changing the people. If you're already number one doing it the way you've been doing it for 20 years, how are you going to convince me that I need to change the way I'm doing it? It's working, right? My favorite example to use is Walmart. (Hold the questions; I'll get to you, I promise.) Walmart is the largest retailer that has ever existed, an extremely successful company, but they've been doing it

the same way since 1964, almost exactly the same way in many cases. Then came a little company. Who's heard of jet.com? Has anyone ever heard of jet.com? Just a few of you, right? Jet.com is not nearly as old as Walmart; they were about a year old. The reason I knew who they were is that a year earlier they were nobody, and all of a sudden, 12 months later, they were worth 3.3 billion dollars. How's that possible? Because they were born in the cloud. They thought differently. They were doing DevOps from the beginning, rubbing DevOps on everything and making it better. That's just the way that they thought, and they

scared the crap out of people like Walmart in less than 12 months. So what did Walmart do? They did the only thing Walmart could do: they opened up their checkbook, wrote a check for 3.3 billion dollars, and bought that company. Jet.com is now a part of Walmart. If you can't afford to buy all your competition, you have to out-innovate them. I was invited to go talk to Walmart shortly after the acquisition. I told them: congratulations, that was a smart move, but do yourself a favor, and by the way, I'm a shareholder, so do me a favor. Stop buying the competition and just beat them. Make yourself more

like jet.com, and not jet.com more like you, because the next jet.com is just waiting out there to beat our butts, right? You've got to start thinking this way. Remember the first video? You heard those cars just lapping that poor car; every couple of seconds another car goes roaring past. That's your competition, as you're swinging that hammer instead of doing the things you need to do to out-innovate your competition. This is a strategic advantage for you, even if you're not a software company, because there's not a company on the planet that does not rely on software in some shape or form, right? So make sure

you use it to your advantage. The process: that's the easy part. Scrum, Kanban, test-driven development, extreme programming: we know how to do it. We know how to produce increments of shippable software. The problem is we didn't know how to ship it. So what I'm going to do now is talk about how we inside of Microsoft use a product that we actually produce to build and ship everything that we do inside of Microsoft, right? So I'm not going to sell you any more stuff. I'm really going to talk to you about how we do it internally at Microsoft. When we think about DevOps, this is what we

believe at Microsoft. We try to do this internally, and we try to enable all of our customers to do the same. So how did we do it, in a world where we were traditionally doing things waterfall? We had three years between every single release of the software we were building in the past, but we try not to do that anymore. The reason why is this man. He has come in and completely changed everything: our culture, our business model, our profit. Everything has changed. We are open-sourcing more things than we ever have in the past, and I think people are starting to see that we are a

different company. I remember the first time I got to hear him speak. We have an internal conference called Ready, which used to be called TechReady. It happens twice a year, and half the company basically goes to learn about what's coming out new, and he'll keynote it every once in a while. At TR19, I remember he was going to be keynoting, and I sat front row, dead center, so excited to see him for the first time. I had never actually seen or met him. At that time I was a seller. I was selling a product called Team Foundation Server, which is the predecessor of Azure DevOps. So I would

fly out and help our customers in the central United States, and convince them that this is the greatest product you've ever seen: you should write this big giant check, and I'm going to come in and give you the software. I was always shaking, because I was always afraid they'd say, "So how do you use this product?" Because back then we weren't. Here we are trying to sell this software to you and convince you that it's great, but we don't even use the software ourselves, and we're one of the largest software companies in the world. How horrible is that? So when I heard him say we have to stop

living this fake life where we write software for others that we would not use ourselves, it was like he spoke directly to me. Because as a seller of this product, being able to go in and say we use this product would be the easiest way in the world for me to sell it, not fearing that they would ask me and I'd have to tell them that we actually don't. What he was saying was that Team Foundation Server now had to be used by the Windows team, by the Office team, by Xbox. If you write software inside of Microsoft, you are now going to use one engineering system, and that

engineering system at the time was what we today call Azure DevOps. So he is why the company has changed a great deal, because we started dogfooding. If I can't do my job unless the software I write works, that software is going to work. If I can just throw it over the fence to a customer while I use something else internally that's better, that's not okay, right? So we started dogfooding a lot inside of Microsoft. Now, this is a very important slide, because every time I say "we," I am not talking about Microsoft. I'm

talking specifically about the team that builds this product inside Microsoft. Okay? Because the Windows team is on this transformation, but at a completely different stage than this team. The Bing team is on this transformation at a completely different stage than this team. So when I say "we," I don't want you to think that everyone at Microsoft is doing that. We are all making this transition. This team builds our suite of products, private or public: Azure Test Plans, which is for your manual testers,

and Artifacts, which would be like Artifactory or something like that. So it's everything you need to turn an idea into a working piece of software: the work item tracking, the source control, the CI, the CD, the testing, everything you need to turn an idea into a working piece of software. That's what the product has to provide. So when I said it's one engineering system, this is what we're doing today internally at Microsoft. We have 96,000 of our internal engineers already using it. Twenty thousand of them are in Windows alone, right? Windows is

bigger than most organizations and customers that I visit. Just that one team has 20,000 people on it. It's unbelievable. And we're now able to produce more than 85,000 deployments per day. We have millions and millions of work items in there, millions and millions of builds going off every single day. Again, the majority of these work items are from Windows alone, right? So the fact that Windows had to start using this meant that every customer we had would be able to use this product as well, because our biggest customer happened to be internal, right? We

actually don't know of any customer we'll ever have that'll be bigger than the Windows team. It's just incredible. So dogfooding was really important for us. This just goes back to show that we're really doing what we say: we're forcing everyone inside of Microsoft to use the same toolchain. Now, this is my journey. I started back over here. I was recruited by a consulting firm called Notion Solutions, back when this was in beta, to go off and help implement it. It was so difficult to install back then that you literally had to hire our company to come in and install it

for you, right? But once you got it installed it was pretty good; getting it installed was very difficult. Now, what we did was say, okay, this is exactly what the product looks like in 2008, and we're going to go ahead and work on and ship another product three years later. In the tech industry, we're going to go dark for three years and assume that when we wake up three years later we'll have hit the bullseye. Every single time we put our heads in the sand, someone like Jira would pop up and just take over the world for work item tracking, and then Jenkins pops up and takes over the

world when it comes to continuous integration, and then GitHub pops up and takes over the world when it comes to source control. Man, we are completely behind. Three years is too long. Let's just try two years instead, right? Did two years work? No. All right, two years is too long; let's try one year. I know we can do it in one year; if we only disappear for one year, we'll nail it. No. It was miserable, and that's about the time Satya got on stage and said we're all going to start using this product. So then what, did we go back to two years? We actually didn't. We started shipping every three months, and we

still ship the on-prem product, because the Azure DevOps suite I just showed you comes in two flavors: one we host for you in the cloud that's updated every three weeks, and one that we ship to you that is now updated every three months. But we went from shipping every three years, right? For this, a lot had to happen inside the organization, and that's what I want to talk about: how the people changed, how the processes changed, how the products that we use had to change for us to be able to move at the speed at which we move today. And it wasn't just tooling, process, and people. Even our architecture had

to change, and that's something that a lot of people have to realize. When you have legacy software that was built as a monolith in 2005, shipping that every three weeks is really, really hard to do. But as you start teasing it apart into microservices (I think we're up in the area of 30 of them right now), you start to get a little bit more agility in the way that you're able to develop. Deciding what API to use at runtime means you no longer have a dependency on who has to go first. And then you start implementing things like feature flags, which allow you to

not even have to roll back your software should something bad happen. That's how we got to where we are. I'm saying all that right now so you don't think I'm going to say that if you get these tools in place, you're going to be able to go from three years to three weeks. We had to re-architect, and we're still re-architecting to this very day to make sure that we move at the speed at which we want to move. We are moving at three weeks per sprint. That's not fast enough; that's just where we are today. I like to share this story with people because we didn't do everything right. We're still

learning. We're eight years in now on this particular transformation, and we're still trying to figure out ways that we can go faster. Those two guys that never moved, monitoring? We get 7 terabytes of telemetry a day, trying to figure out how we can go faster tomorrow, right? That's how serious we are about it. So how did we get from there to here? Back in 2010 we started with sprint number one. We are a Scrum shop; again, this team is a Scrum shop. What was really cool about how we went about this is that we hired a whole team of experts to come in and train everybody, from the leadership down to the

engineers, and that's crucially important to being good at agile. Too often you'll send one person to go get certified as a Scrum Master, bring them back into an organization that has been waterfall for 20 years, and expect that one person, who's never done agile before, to come in and change the entire world. It fails every time, right? So what we did, because we knew better, is we hired all these battle-worn Scrum Masters to come in and teach us how to do this correctly, push back on people who had never been told no before, and make sure that we did it right. Now we don't have to do that. We have 50

feature teams all across the world that are really good at doing agile. If we hire someone new, we just drop them into a highly functioning agile team and they learn through osmosis. We are actually on sprint 148 or 149 right now. These are three weeks long. We never stopped, and we went from shipping you a box to shipping you an online service every three weeks. But we actually deploy about twice a day. The things we deploy twice a day are just hotfixes, bug fixes, performance improvements, not new features, right? Every three weeks we drop new features, but twice a day we

actually ship, which again is drastically different from every three years. If you are a Scrum shop, you should have something called the definition of done. It's very important that your definition of done be very clear, crisp, and transparent; everyone in your organization should know it. If you're brand new to agile, I would not start with a definition of done like this. This is probably, hands down, the most mature definition of done I've ever come across. It literally says that you are not done, you cannot say or claim that you're done, until we are getting telemetry from the

feature that you added as it is running in production. That is huge. Gone are the days of a developer just saying, "It's code complete, I'm done," and then going to grab something else, right? This has to be running in production, it has to be monitored, and it has to be sending us telemetry showing whether we did or did not verify our actual hypothesis. Now, this is where you hope to end up. Do not feel bad if your definition of done right now is code complete, or unit tested, or that your code coverage has to reach a certain level. As you go through your retrospectives, keep tightening up that definition of

done and strive toward something like this. Basically, I don't know where we take it from here; this is probably the best I've ever seen. But this is what that team lives and dies by. Telemetry is crucially important for us to make sure we're actually delivering value and not just shipping features. If you don't monitor it, you don't know if you're actually delivering value or not. If you're a Scrum shop, you should have something called a product backlog, which is a laundry list of anything and everything that you're supposed to do in this particular piece of software. The product owner's job is to

make sure that it's in priority order. Is it? We assume it is. We take their word for it, and we go off and do the first thing and ship it. And if you're not monitoring, are they right? We think they are, because we have no way to challenge them if they aren't. Or the one that I always love is when the marketing person comes running into my office: "Stop everything that you're doing. I just got back from this conference. We need to do this instead." We know we don't have enough time for everything, but the developers move heaven and earth to make sure that they ship that

feature that the marketing person said they needed to make sure they could sell and make us rich. Did it make us rich? It didn't make me rich, but I killed myself trying to turn this into a working piece of software for them. And they're going to keep doing that to me, sprint after sprint: come in at the last minute and tell me to change everything that I'm doing. But here's what you do as a developer the next time that happens. You say, "Okay, I'm doing this for you." What you're not going to tell them is that you put telemetry in that feature that tells you every single time that feature is used. And I'll go ahead and move heaven and earth

one more time. I'm going to ship that feature. But when you come into my office next time, I'm going to show you how little the thing you asked me for last time is being used. I'll be able to challenge my product owner when we see that the feature we just implemented, the one at the top of the list, got zero views, zero access. I'm not saying that we need to put them on the spot, but we need to re-evaluate our product backlog, because clearly it's not in the right order. We need to do less of that and more of something else, because you clearly don't know what's important. Now, just because nobody

used it doesn't mean it wasn't important. Maybe your marketing is no good on that particular item. Maybe the navigation's not intuitive. But now you actually have numbers, so you can go off and run an experiment: change the navigation, put out a promo code, get some awareness on that feature, and see if the number moves or not. If it doesn't, then that's not an important feature and we need to go do other things. Telemetry is crucially important for everything that you do, so make sure that you start putting it in there. Don't put it everywhere; know what question you want to answer

and then put in the telemetry that will answer that question. If you put it everywhere, you will just get lost. So how are our teams structured? I just said that we have Scrum teams. Historically we would have a program management team, a development team, and a testing team. A lot of companies still have this setup, and a lot of companies are doing pretty well with it. We weren't doing very well with it. Why? Because we had our developers throw untested code over the wall to our testers. They would go off and celebrate with a pizza party because it's code freeze. Everybody's out there eating ice cream and cake while the test team is

over there frantically testing as fast as they possibly can. And then eventually this waterfall of technical debt comes back in the form of bugs, and we have to go through a bug bash and then ship on whatever date we were supposed to ship, right? It was not very consistent, not very good. We also had a lot of automation engineers over here. Some of the testers were engineers who wrote automation to test our application: it would fire it up, click buttons, and verify that things work the way they're supposed to. And we realized that they took longer and longer and longer to generate those tests. We didn't

understand why. So Buck Hodges, our director of engineering, went and had a chat with them and said, "What's taking so long generating this automation?" "Well, the code is really hard to get high levels of code coverage on because of the way the code is written." "Well, you're an engineer. I mean, you're actually writing code to test code. Why aren't you going in there and fixing it?" "That's not my job. My job is to write the automation; it's their job to fix the code." So there was this huge disconnect already, just like between ops and dev, between our testers and engineers. They were not working together. So Buck kind of said, that's

enough of that. Because if the developer who wrote the code found it difficult to test the code, they would simply rewrite the code to make it easier to test. Makes perfect sense. But when you have this divide between the two, the tester and the developer, all of a sudden that doesn't happen. So he said, you are responsible for quality, and you are also responsible for engineering. We actually merged these two together and made one engineering org. You are responsible for quality from day one. This means those unit tests had better get written, right? Those automation tests had better be easy to write. Because we

at one point had 27,000 automated UI tests. You know how many times we ran them and they all went green? Zero. So why are we spending all this time maintaining 27,000 UI tests that have never once run and completely gone green? The developers aren't even listening to this signal anymore. I check in code and some tests fail. Yeah, the tests failed yesterday; they're going to fail tomorrow. The tests fail every day, so I don't know if it's what I just added or something that was already there. I'm not even listening to this noise anymore. And that was a complete waste

of time. People need to trust those signals. So we went from having 27,000 automated UI tests to this: when we combined the engineering teams, we re-architected our code, and now we have over 86,000 unit tests that we run to get a higher level of code coverage, and we do it in about 8 minutes or so. We've completely transitioned the way that we test by re-architecting our code so that we can move at speeds that we never dreamed possible before. But we had to combine our testing and engineering into basically one team. After that, we got rid of

all of our manual testers, too. I do not recommend all companies do that. Unless you use the product that you produce, you cannot fire your manual testers. But when you have 500 developers who every day have to use the product that they produce, they are manually testing that app every single day. And we do something called safe deployment. Safe deployment is where we actually have six different production environments. Each production environment has a larger group of customers on it than the previous one. Ring 0, which is the very first environment we deploy to, is where the Azure DevOps

team actually works. So every 3 weeks the software that we're using wakes up and deploys itself on top of itself while we are using it, and then for 48 hours the code sits there while the teams in Hyderabad, India, in Raleigh, North Carolina, here in Redmond, Washington, and in San Francisco have to do their jobs on it. If that code breaks at any point, we stop, we figure out what's going on, we deploy a fix, and it never reaches our customers. If we survive 48 hours and everything is good, we then deploy to the next ring, which

is ring 1. Ring 1 has nothing but friendlies in there. We have some of our regional directors, our MVPs, people who are friendly to Microsoft, who know that they're getting early code and want to get early code. They get it for 24 hours. If they don't report any errors and we don't have any telemetry issues, we continue on, and it takes about 10 days to get through all six rings, assuming everything goes great. And then finally all the customers in the world have the new features. That's called safe deployment, and our engineering team and ops team work together to make sure that happens. All these

people here: we take all these people and we make what we call feature teams, which are just scrum teams. We have 50 feature teams across all the locations that I just mentioned. If you look at the five services that I mentioned, all those services have different groups of feature teams. So one team owns the Kanban board, about five teams own work item tracking, some five or six teams own source control, and things like that. So these teams own features vertically, not horizontally. We do not have a database team and a middle-tier team and a UI team. To

really be good at agile, you can no longer slice your applications horizontally; you have to slice them vertically so you can actually show value every single sprint. Because if your customer doesn't know anything about databases and you do three months' worth of database design ahead of time and show them ER diagrams, they have no idea what they're looking at. But if you show them working software after one sprint, they know what that looks like, and whether or not it looks like what they had in their head. Because what's so funny about waterfall is that what you hear and what they meant to say are

usually drastically different, right? You write down all the requirements for 5 or 6 months, you go off and you code for a year, and then you show them software a year and a half later and it looks nothing like what they thought it was going to look like. That's what agile is here to fix. In 3 weeks I'm going to show you what I think I heard, and you get to tell me in 3 weeks if I heard it correctly or not. And what's really cool, even if I heard it correctly, is that when they see it for the first time, all these new ideas pop into their heads: wow, I didn't realize it would look like this, we can do this and

we can do that. Great. So cut your application vertically, not horizontally. These feature teams own the UI, they own the app service tier, they own the database schema, they own everything, so they can actually ship the entire feature themselves. They have direct contact with our customers. I'll give you my Twitter handle again. You can literally tweet at me and I will add these individuals to that conversation. If I don't know the answer, you can reach right inside of Microsoft and get answers to the questions that you're dying to get answers to.

We no longer have a barrier between our customers and our feature teams. The teams all sit together, with very, very few exceptions where we might have a program manager who's remote, but we try to reduce that a great deal. There are usually 10 to 12 people in a room like this. We are in Seattle right now. If you were to go to the Redmond campus and go to building 18, you could find this exact room. This is a real room in building 18. The people are all going to be different, because we rotate the people around the room so they're not always looking at the same wall all the time and

they get that kind of fresh look at stuff, and I'll tell you another reason why a lot of them change as well. But this room is an entire feature team. It only seats about 12 people. It is not a big Facebook-type area, because if you go to Facebook it's just one big floor and there are no walls anywhere. We didn't want that, because if you're listening to conversations where the person next to you has nothing to do with what you're doing every day, that could be a big distraction. If you're listening to a conversation in here, it pertains to what you do every single day, which is really powerful. Now, what we don't

want is for people to be distracted by too many conversations, so the team rooms have doors on them, whiteboards, and conferencing software. So if I'm in Raleigh and I need to talk to someone in Hyderabad, I can have that conversation. But if I need to talk to a peer of mine, they are right here in this room. I don't have to find them on Slack. I don't have to schedule a meeting with them. I don't have to hope they are there. I can swing my chair around, have a conversation, and get back to work again. When it's time for a daily standup, everyone just stands up, because we're all within earshot of each other. You answer the

questions, you put your butt back down, and you get right back to work. It's really amazing the way that it helps us collaborate. Another thing that's really interesting about each one of these rooms is that the teams are autonomous. They can run their three-week sprint however they want to run it. We call it aligned autonomy. The alignment is the fact that you had better be ready to ship in three weeks. If you want to do pair programming, knock yourself out. If you want to practice test-driven development, go for it. We're not going to tell you that you have to or don't have to do one of those

things. But in three weeks you had better be ready to ship. And each one of these rooms is interesting to me because they have their own culture. Some rooms don't want any food in the room at all. In some rooms people wear headphones; some rooms don't allow that either. They're silent like a library, and they want you to be available to talk without having to tap you on the shoulder and frighten you because you're listening to your music. Other rooms, we couldn't take a picture of. The room that I was in, we would never be able to put up here, and it was an awesome room.

I remember the first day I got introduced to the room. I'm walking in and there's just stuff everywhere on the floor, and I'm like, this is really weird, and I'm sort of recognizing these items that are on the floor. So I reach down and I pick one of them up, and then I realize it's a Nerf bullet. You know, Nerf guns? I thought, that's weird. What is a Nerf bullet doing here? And then I look up and I scan everyone's desks, and on their desks are Nerf guns, the really nice, semi-automatic Nerf guns, everywhere. What's going on? It turns out that on that team, to blow off steam, if someone got frustrated they

would just start firing at each other. It was like war games in there. Trust me, you did not want to be the person without a gun, because you were the one getting shot, right? So I had to go out and buy a Nerf gun to make sure that I was safe and protected in that particular room. Just crazy, but it's cool, because every team had its own culture and the teams knew how to work really well together. And the teams stay together. Every 12 to 18 months we give all the engineers an opportunity to go pick a different team to work on. This is extremely empowering, because I remember working on software for 3 or

4 years. Eventually you just do not want to see that code base again. You just don't want to see the same lines of code one more time. It just drives you nuts. So what we do is say, listen, if you've been working on work item tracking for the last 18 months or two years and you want to go work on source control instead, knock yourself out. If you want to go and start working on git clone and rebase and all our lower-level technology, go for it. You can move from team to team, which is really nice. So we have what we call a yellow sticky exercise, where the 50 feature team leaders

get up and pitch: you want to come work on work item tracking for the next 12 to 18 months, because it's awesome. We're going to have so much fun. We're going to help our customers, blah blah blah. And then the engineers take stickies with their first, second, and third choices and put them on a whiteboard, and then we go back in and we balance the teams. 80% of the people go back to the team that they were already on. Why? They already know the culture, or they've already bought the gun, right? Like, I already have an investment in this team. I know how it works and how we communicate. I'm comfortable. Luckily for us,

though, about 20% of them move, and that 20% adds a ton of value to the way that we write software. Those experiences are now being spread across the entire organization, which is very powerful. But if you move, you actually move. You have to go sit with that team now. It's not just that you changed teams; you are now physically part of that team, and you will be in that team room. So it's a really cool experience there too. The cross-pollination is fantastic. Let's talk about sprint length. As a Certified Scrum Master, I used to go off and help people figure out: how long should your sprint be?

Do not do a three-week sprint because you heard Microsoft is doing three-week sprints. That's not the right answer. You should evaluate your internal and external dependencies and determine what the proper sprint length is. I wrote software for a stock trading company. Our sprint length was one week, because you can't make a four-week bet in the stock market. The stock market looked good last week; I woke up today and almost started crying, right? You can't make a bet four weeks in advance. So we had to have a really short sprint length so that we could react to external dependencies. I also worked for a

local company once where our stakeholders were physicians, right? So for them to come to a sprint review, they had to stop billing, take time out of their daily schedule, and come over to our office so they could watch us show software. They weren't going to do that every week, right, but we could get them to do it once every 4 weeks. So that kind of dictated what our sprint length was going to be. As for the 3 weeks, I asked Aaron Bjork, who was there: so how did you get to three weeks? I know the exercises I go through with customers, but what was the exercise here? And he started chuckling and said, we call it the Goldilocks

syndrome. We tried two weeks and it felt too small. We tried four weeks and it felt too big. And when we got to three weeks, it felt just right. Great, I get it. And we've been there ever since. Because if you do agile correctly, there are a lot of rituals, a lot of ceremonies: daily stand-ups, retrospectives, sprint reviews, planning meetings. Sometimes you break those planning meetings up into two parts. In a 1-week or even a 2-week sprint, you feel like you're in more meetings than you are actually developing. 4 weeks, on the other hand, feels like I can't estimate effectively, because what I

can do four weeks from today is going to be drastically different. I don't know how week to week is going to change what I can get done four weeks from when you asked me. And a lot of us try to force a 4-week sprint into a calendar month, but it never fits right, so you're constantly struggling with it. 3 weeks breaks all that. The ratio of work to ritual is right, and it never forces us to try to think in month-long sprints. Our sprints overlap: those darker areas are the automated deployment of Azure DevOps on top of itself while we're still sprinting. So we don't stop and then deploy and then

start sprinting again. We sprint, sprint, sprint, sprint, non-stop. There are pluses and minuses. The plus is that you get into a muscle memory, right? You just get into a rhythm and you start firing on all cylinders. But you've got to remember to take time to celebrate. You've got to let your people relax. You've got to let them unwind and tell them they've been doing a good job, because it feels like a death march sometimes if you don't take time to say good job. In building 18, sometimes you walk in there and there's just food and ice cream and cake everywhere, and it's just time for us to go ahead and celebrate that we've been

doing a good job. So do not forget to celebrate. Let's break down what this looks like. We finalize our sprint planning in the first two days, we sprint for 3 weeks, and 3 weeks from then we start a deployment, and the deployment goes for approximately 10 days. What we do is we take two or more individuals on the team and we make them what we call our shield team or our SWAT team. Their capacity is taken out of the sprint and they babysit that deployment for the next 10 days. What you don't do is task your entire team, every individual, to the hilt and then still try to fix production issues. That's

unplanned work that you cannot plan for. You don't know how much of it there is. So you need people that have the capacity to go fix those problems for you, and we designate those people as a deployment is going out. We all work out of master. So we have 500 developers, 50 feature teams of 10 to 12 people, merging into master every single day. We have no long-lived branches. If you're new to agile or scrum, a lot of people do that. They go: this feature is going to take three sprints, so it becomes feature one, and it's going to live over here by itself in isolation for 3 weeks while master keeps changing, and then three sprints later you try to bring the two back

together and you get what we call a merge bomb. It just explodes, because the number of changes is way, way too big. It takes longer to merge the code than it did to write the code. So what we do at Microsoft is use feature flags, which allow developers to merge back into master every single day, so their branch and master are always in sync. So 3 weeks from now, when it's time to ship, their code is already sitting inside of master. There is no big merge conflict; we just turn on the flag once it's in production. Using feature flags also allows you to separate deployment from releasing. If you don't use feature flags, as

soon as you copy those files to that server, that feature is released, right? Because there's no way to hide it from those individuals. But when you have feature flags in place, which are essentially an if statement that says either run this code or don't, controlled from an external entity, I can actually ship the code without releasing the code. So I can ship a lot of code and then turn it on for some individuals and turn it off for other individuals. That gives us a lot of power, and it also allows us to mitigate really quickly, which I'll talk about a little bit later as well. At the end of our sprint,

what we need to do is send out what we call a sprint mail. I'm going to show you the new way and the old way. What's interesting is that we've actually changed the way that we do this internally at Microsoft. What we used to do is each team would send an email at the beginning of the sprint saying, this is what we intend to do for the next 3 weeks. At the end, I'd get another 50 emails that said, this is what we were able to do. Inside of it would be a video that you could watch that would show you that code working, as if I had sat in on your sprint review. It was really nice. So now you have 50 teams and

you stay in sync through the emails that you get from them. But 50 emails is a lot of emails to get, really a hundred emails, because there's one at the start and one at the end, so you have to make sure that you don't just go crazy with it. What it used to look like is this. This is just a work item query out of our Boards feature. I can click on these links and see all the details. And then there's a video in the email. The video is really cool. We did it because I simply could not sit in 50 sprint reviews because of the time zone issues, nor would I probably want to sit in 50. It's too much. So we asked them

to put a video in there of what I would have seen had I been in your sprint review. This actually had a really cool side effect that we did not expect. You cannot use Photoshop or After Effects or any trickery in that video. The software has to work, which means that for me to be able to send that out on the last day, the developers have to be done before the last day of the sprint. If you're new to scrum, that last day can be a pretty frantic day. The code churn is enormous as you try to get all that work done that you've been getting half done. But now, to get in the video, it has to be

done so that we can actually record it in the video. It also gives your program manager an opportunity to review the code before they give it out to the public, to make sure that it looks like they expected it to look. They get a couple of days to go and polish it and maybe re-record a portion of the video, but the video has to be real, and forcing the team to actually get things done early is really powerful as well. Now, instead of every team sending an email, we took it up a level to their lead. So for example, work item tracking has five teams underneath it, so we took those five emails and you get this one email instead. Aaron

Bjork actually runs all of work item tracking. We are now tracking the actual epic, not the work items underneath the epic individually. So now I get far fewer emails. I get a much more concise and easy way to analyze what the entire work item tracking team is working on, and I still have the video at the end that shows all the value that was added. This is just an example of how we're still trying to figure out ways that we can be more efficient. We don't just assume that we got it right and keep doing that. Every time we see that we can be more efficient, we do so, and

changing the way that we send out our epic emails has been a great way to kind of speed us up a little bit. How do we do planning? Planning is another one that we recently changed. We used to plan 18 months in advance, but we realized that we were never, ever getting to the stuff that we thought we wanted 18 months from now. So now we only plan 12 months in advance. This is planning and strategy, and we do not hold ourselves to this 12-month plan. So if I were to say on January 1st, where do I want to be on January 1st a year from now, and I've only done 60% of that a year from now,

that's still a success, because the world changes, our competition changes, the landscape that we're working in changes. So what I think I need a year from now might not be the same five weeks from now; two months from now it can be drastically different. So we're constantly re-evaluating what that 12-month strategy is and comparing it to what the industry is saying today, so that we're not stuck in the mud saying, no, that's what we said we're going to do, so we're going to do it. No, we're going to evaluate it and make sure that what we think we need to do is still accurate and still relevant in the

landscape that we're working in today. We do break it down into two 6-month semesters. The semesters land at big conferences. Historically one was Microsoft Build, which is a very popular Microsoft conference, and the other one is Microsoft Connect. They're about six months apart, and the teams are like, what do we want to announce? What big things do we want Scott Guthrie to get on stage and say, this is what we're announcing today? And we kind of guide towards that to get the big giant items that we want. And then we basically do 3-week sprints, and then every four sprints we come together and say,

how are all the rest of the teams doing? I got to sit in on Aaron Bjork's team doing one of these quarterly meetings. He gets all of his leads together, and the leads sit in a room and have pizza, and they talk about what they did for the last quarter and what they're going to do for the next quarter. And what was interesting is I heard one of the team leads say, well, one of my engineers is going to go off and produce this widget, this control that will allow our customers to do X, Y, and Z. It's really going to be a cool experience for blah. And another lead said, I have an engineer that wrote something just

like that, because we needed it to do blah blah blah, and I think it's going to fit your needs as well. Why don't you have your engineer talk to my engineer and make sure that we don't need two controls? Had that meeting not happened, we'd have two controls that are almost exactly the same, probably with slightly different usability issues that cause the user to be confused about why there are two that look like they should be the same but aren't exactly the same. That's what happens if you don't have the teams actually come together and talk about what their future plans are and what they've already done, so you can actually reuse

that code. It's an amazing meeting for that reason alone, let alone for making sure that everybody on the same team is driving in the same direction, right? So there are a lot of cool benefits, and I highly encourage you to make sure that people take that time to sync up with the other people that they're related to as well. Remember I said aligned autonomy? They have complete autonomy here as a team to make sure that they deliver on their sprint goals and make sure that their quarters are right, and then we have our leadership basically aligning everyone on what our strategy and our semesters are going to look like.
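
The cadence described here (a 12-month plan, 3-week sprints, and a cross-team sync every four sprints) is simple calendar arithmetic. Here is a minimal Python sketch of it. The start date, the 52-week approximation of the plan, and the names `plan_cadence` and `sync_after` are assumptions made for illustration; nothing in the talk says the team uses code like this.

```python
from datetime import date, timedelta

SPRINT_WEEKS = 3       # sprint length mentioned in the talk
SPRINTS_PER_SYNC = 4   # cross-team sync every four sprints
PLAN_WEEKS = 52        # assumption: treat the 12-month plan as 52 weeks


def plan_cadence(start):
    """List the full sprints that fit in the plan, flagging sync points."""
    sprints = []
    for i in range(PLAN_WEEKS // SPRINT_WEEKS):
        begin = start + timedelta(weeks=i * SPRINT_WEEKS)
        # A sprint ends the day before the next one begins.
        end = begin + timedelta(weeks=SPRINT_WEEKS, days=-1)
        sprints.append({
            "number": i + 1,
            "begin": begin,
            "end": end,
            "sync_after": (i + 1) % SPRINTS_PER_SYNC == 0,
        })
    return sprints


if __name__ == "__main__":
    for s in plan_cadence(date(2024, 1, 1)):  # hypothetical start date
        marker = "  <- cross-team sync" if s["sync_after"] else ""
        print(f"Sprint {s['number']:2d}: {s['begin']} to {s['end']}{marker}")
```

Four 3-week sprints come to 12 weeks, which is why the sync meeting lines up with roughly a quarter. Under the 52-week assumption, 17 full sprints fit in the plan with one week left over.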

Let's talk about quality real quick. Historically we were like everyone else. We had a code freeze. Code freeze is a polite way of saying, please stop typing. Because if you keep typing, you keep writing bugs, and we need to count the bugs you already have. So please go have a pizza party, go do something else, just go get distracted, but do not keep writing code. You throw all this code over to the QA team, and the QA team works as hard as they can, and then you get all that technical debt back. That's how we worked as well. The problem with working like this is that there's no consistency in how much technical debt or

how many bugs you get. It's really hard to plan when you're going to release when your charts look like this. You might have thought, hey, we're doing pretty good, these two weren't too bad. Then all of a sudden you have this huge peak and valley here. What the heck just happened? Maybe you tried a new technology for the first time. The reason you don't know what happened is that the developers aren't testing anything. They're not writing unit tests, they're not writing manual tests, they're not even thinking about testing. They're getting to code complete and throwing this untested code over the wall

to the QA team, and what's coming back is this huge mountain of technical debt. So he said, this is enough. This is ridiculous. I always want to be one week away from shipping, always. And he did some math and said, historically our engineers can clear one bug a day. So if I cap it at five bugs per engineer, I should be able to get to shipping in a week, right? If I know that none of my engineers are carrying more than five bugs at a time, I can say on Monday that I want to ship on Friday. Everyone stops what they're doing and only fixes bugs. We should be bug free by Friday, right, and

ready to ship. That was his logic. So we went from this to something like this. To know how many bugs I have, I have to be testing earlier. So I had to start doing integration testing earlier. I had to start doing unit testing earlier. I had to have better automation tests, back when we were still doing automation tests. We had to test more frequently, earlier, and better than we were before, and we track this very closely. Remember I said measure everything? We have something called a bug bar, and what we do is we measure, on average, how many bugs our teams are actually carrying throughout the

previous sprint. This is an opportunity for us to inspect and adapt our process. Now, I cannot stress this enough: you cannot punish your team for these numbers. I have to say that again. You cannot punish your team for these numbers. Sometimes I show people this and you can see their eyes light up, and they start wiggling their fingers like this, like they're going to go crush all the developers and make them sad. Do that and you're never going to get a number you can trust, because engineers are smart people. If I get in trouble because that number is 6.5, that number will never be 6.5

again, even if it's supposed to be, right? Because I'm an engineer and I'll make this number exactly what you want this number to be. But this is a number that we can actually learn from. The fact that it is higher is an opportunity for me to inspect how my team is working and adapt our process so that they can stay below that. This is not a punishment number. As a manager at Microsoft, and I am a manager at Microsoft, if my team is failing to meet the bug bar, it's a failure on me, because I didn't provide them what they needed to be successful. What I need to do now is go figure out what it is that

you needed. Did you need more time? Did you need more training? Do we need more capacity? What do we need so that you can stay below this number? Because this is extremely important for us so that we can ship when we're supposed to ship. But if you punish them with this, these numbers will be useless. They're going to be suspiciously low, like this 0.73. I don't even know if anyone was coding that sprint, right? That's a really good number. It's a suspiciously low number, in my opinion, so I'd have to go have a conversation with that team as well. But maybe they are that good. We also do an engineering

scorecard. Azure DevOps runs 24x7, seven days a week, 365 days a year. There's never an opportunity for it to be down. It can't even be taken down to be upgraded, right? We upgrade it while you're using it, and the only thing that you should ever notice is that something that took three milliseconds the first time might take 0.6 seconds this time. That's the only difference; it should still work. We had to do a lot of architecture work so that the application can be upgraded while it's being used without ever dropping any packets, and making sure that our users weren't

negatively impacted. LSI stands for live site incident. An LSI means that the code is not performing the way it's supposed to be performing, and we now have a live site incident. We measure all of this stuff. But here's what's really interesting about the engineering scorecard. Remember, we don't have dev and QA and ops; we have engineers. Now everyone owns this number. I remember back when I worked at Compaq, I'd write all these cool bugs and they'd be out in production, and I'd sleep through the night like a baby. But someone was fighting those bugs, right? They were getting

service calls, and some poor person was on the phone trying to figure out why the stuff didn't work the way it was supposed to. If you make the people who broke the code fix the code, the code gets good real fast. That's exactly what we do at Microsoft. We actually have people on rotation, the engineers that wrote the software, who every time we have one of these incidents get on that bridge and start troubleshooting why our system isn't working anymore. The people who were

making the bad decisions now pay the consequences for those bad decisions, and it has drastically changed the way that we do our job. It's imperative that you have this. The mantra that we use is: if you wrote it, you run it. Right? No longer do you throw this over the wall to someone else to deal with. You are now responsible for it. Luckily it rotates; it's not the same person all the time. That would be a miserable, miserable job. What we do is we have DRIs, which stands for designated responsible individual, who are on call. You're on call for about a week, and then it takes about

two and a half to three months before it comes back around to you. So it's not just this death march where all I do is stay on call, because it affects your personal life. You can't go to your son's soccer game if you're on call, because you have to be able to get to a machine and start troubleshooting within 5 minutes of us detecting that we have a live site incident. Then you have to mitigate that problem, and that's what these numbers are doing here: telling us how long it took us to detect and how long it took us to mitigate. And then after we figure out how to mitigate it, we have to go do

a root cause analysis, and then we have to go ahead and put in our product backlog the long-term fix that makes sure this never happens to us again, right? So this is a process that we go through every single time, and we are completely transparent. You can go out right now and search for Azure DevOps LSI and you will get hits for all the reports that we've written and given to the public, saying this is what happened and this is how we're going to make sure it never happens to you again. One of our data centers in San Antonio got struck by lightning. It took out all the air conditioners in the

entire freaking data center. You can read all about it. We have this website that tells you if we're having a live site incident or not. It was in that data center and only that data center, so we couldn't even tell you that we got struck by lightning. How embarrassing is that? That's now replicated across multiple regions, as you can imagine. We learned from that, so that if one gets taken down we can still tell you instantly why it got taken down, which we couldn't before. We learn from all this stuff. Again, you turn the red ones yellow and the yellow ones green, but you do not punish your team for these

numbers. So this is, really quickly, what we went from. We went from having milestones every 4 to 6 months to shipping every 3 weeks. We went from thinking of our code as horizontal to thinking of it as vertical, and that's a people problem. That is not a technology problem. Do we have any DBAs in the room? Kind of? Okay, I'll pick on them. DBAs historically are the hardest for me to convince that we need to cut our application vertically instead of horizontally. I remember one telling me, no, man, I need to go off and figure out every join,

every stored procedure, every clustered index, every non-clustered index. I need to know everything about the schema of this database before we write a single line of code. That's ridiculous. So I said, okay, go design this entire database schema for the entire solution as if you know it, but you've got to make me one promise. What's that? You can never change it. That's ridiculous, I'm going to have to change it. Exactly. If you're going to change it anyway, why would I want to give you three months to go off and do it, right? What I want to do is show value

to our customers immediately. I want to show them that I can log into a website and go to their homepage. How many tables in a database do I need for that? One. How many columns do I need? Two: a username and a password. You mean to tell me you're going to spend three months designing the entire database instead of giving me one table with two columns in it? Go do that for me right now. Let's go show our customers some progress, and in the next sprint we're going to go change it again. And the technology has finally caught up, so our databases can now be as fluid as the code in the application that

we write. As a matter of fact, I'm going to be speaking in a webinar with Redgate tomorrow on the state of database DevOps, about that exact thing, because we have the technology now where we can treat every part of your application, the database, the schema, even the infrastructure, as code and move it as fluidly as we do everything else in your pipeline. So do not cut your application horizontally. You've got to start cutting your teams vertically instead. Again, team rooms are in there, which is extremely important. We made sure that our teams went from these ginormous 20-person teams down to 8- to 12-

person teams, right? That allowed our daily stand-ups to be more efficient and our teams to be more efficient, and let us move very, very quickly as an organization. So this is just a bullet list of all the kinds of things that we've learned so far. We're going to continue to keep learning to make sure that we move as fast as we can. These are things that you can try, but again, do not take this as "we need to go do all the things that the Azure DevOps team did." Just learn that we are constantly trying to improve, and you should be doing the same thing as well. So quickly, what I want to do before I leave here

in the next minute or so is talk about the people who are going to help you do that. This is my team. We got nicknamed The League of Extraordinary Cloud. The quickest way to get a hold of us is that hashtag. So if you're not on Twitter but you have a question, open up a Twitter account, use that hashtag, ask a question, and the entire team will come running to that question for you. We will all either read it or go back and try to find an answer for you. One of the best developers you're ever going to meet in your life lives here in Seattle. This is my DevOps conscience. He lives in

Australia. This guy comes from Chef and PowerShell; he's my Windows guy. And she's my Kubernetes, Linux, open-source guru, who lives in California. If you tweet at us, we will come running, sort of like a bat signal almost, right? If you ask a question using that hashtag, we will literally just see it and then come running, which is really, really awesome. Another thing that I want to talk about really quick, and this should have loaded much quicker than it did, but it didn't. Oh, she's awesome. She does all this crazy CrossFit stuff; when I met her she was doing handstand

push-ups on stage. She's one of the best speakers you'll ever see on stage, which is really cool to see if you ever get a chance to see her speak. She's in Hong Kong right now, speaking on a tour there, and she's fantastic. Actually, all of them are really good speakers, but she is for sure a fantastic person. So what am I doing now? Because we are actually at time; I think we're almost exactly at an hour. I am going to spend the next 30 minutes at that Q&A, so you can come over here and ask me whatever questions you want about how y'all

Microsoft inside a box and all that kind of stuff. So the last thing I want to do is say, thank you so much for having me, and I hope you have a great event. Thank you.

Get access to all videos “CMG’s international IMPACT Digital Transformation Conference”