MLconf Online 2020
November 6, 2020, Online
The COVID Scenario Pipeline: High Stakes Data Science

About the talk

In March of 2020, DJ Patil assembled a team of data scientists and software engineers from Silicon Valley to assist with the data and modeling efforts for the State of California's response to the COVID-19 pandemic. I was part of this group and worked for three months with a team of epidemiologists at Johns Hopkins and solutions architects from AWS to run large-scale forecasting models of infections and hospitalizations to aid decision makers at both the state and federal level with disaster response planning. I'll share the story of kicking off the project during the critical 48 hours before California shut down, the work to operationalize and scale up our scenario generator to run as fast as possible on a single node, and the massively parallel, MCMC-enabled forecasting engine we built that ran on hundreds of machines and launched just two weeks before the second wave of infections began during the summer.

About the speaker

Josh Wills
Developer Without Affiliation

Josh Wills is a Software Engineer at Slack. Previously, Josh was the Head of Data Engineering at Slack. Prior to Slack, he built and led data science teams at Cloudera and Google. He is the founder of the Apache Crunch project, co-authored an O'Reilly book on advanced analytics with Apache Spark, and wrote a popular tweet about data scientists.


Hi everyone. Next up, I am excited to introduce Josh Wills, a friend of mine and a longtime MLconf presenter. Josh has presented for us from multiple companies over his career, and now he's an unaffiliated developer and data scientist. I'm excited that he's going to share this with us today, so please help me welcome Josh. Over to you, Josh.

Hello, can people hear me? Is this working? I don't really know much about computers, so I'm not sure I'm doing this right. I don't actually know anything about computers, sorry. Courtney says yes, so I think we're good to go. Thanks, everybody. So, let's see. My name is Josh Wills. I refer to myself as a developer without affiliation because I don't have a job right now and have not had an official job for the last year. I used to work at Slack, doing data engineering and search. Before Slack I was at Cloudera doing data science, that sort of thing, and before that I worked at Google doing all sorts of things too.

What I want to talk about today is some work I did over the course of this year with a group of folks: a very dedicated team at the Department of Public Health in California, a team of epidemiologists at Johns Hopkins, and a bunch of folks from AWS, during the early stages of the COVID-19 pandemic in California. This chart, from Gavin Newsom's Twitter account, shows the early projections — we were supposed to call them planning scenarios — around COVID-19 in California back in April. These were generated on demand, very quickly, under very stressful circumstances, back in March and April and May and June — a good chunk of this year.

There are a few things I want to say before I get into the rest of the talk. The first is that I am not an epidemiologist, and although I've been working in the tech industry for like twenty years now, I feel like I'm just barely starting to be sort of okay at software engineering, let alone an expert in epidemiology. So anything I say that sounds related to epidemiology, y'all should probably take with a grain of salt, or double-check with actual epidemiologists. I have no pretense of being an expert in this stuff. The second thing is that, like everybody else, I have a lot of personal feelings about lockdowns and public health responses and how the government has behaved. They're complicated, I'm not really going to talk about them today, and they're not really relevant to the stuff I want to talk about. What I do want to talk about is my experience, and what I learned about building very data- and compute-intensive workflows in the service of science under the most stressful conditions I can imagine, because I think that experience will be useful to everybody who does machine learning, who does data science, who does any of this kind of stuff. I think it will resonate with y'all when you think about your own experiences and your own careers.

The work I'm talking about here started the week of March 14th, 2020. I saw a joke earlier that there were really only two days in 2020: March 14th and November 3rd. It looks like maybe November 3rd is coming to an end, which I'm grateful for — ideally it will end while I'm giving this talk. That's roughly when this started, and the work I'm going to talk about ran through roughly June of 2020. In early March, like everybody else, I was panicking and kind of freaking out, and I got a call one day — a Saturday, actually — from a guy named DJ Patil, who works at Devoted Health now but used to be the Chief Data Scientist of the United States. I think a lot of people know DJ; he's been around for a while. DJ had been tracking the COVID-19 pandemic for quite a while, and he was going to Sacramento that day to start helping out with the State of California's response. He called me to say that he thought he might need some software engineering help, some data engineering help — was I available? He knew I wasn't really doing much of anything else, other than obviously stocking up on toilet paper and pasta, so basically I said: hell yes, anything you need, let me know.

So DJ takes himself up to Sacramento, Monday rolls around, and he starts working with the folks at the Department of Public Health. At the time, the department was working with the Infectious Disease Dynamics group at Johns Hopkins, who were running models called planning scenarios for the state of California — for every county of California: how many cases of COVID-19 are we going to have, how many people are going to be in the hospital, all that sort of stuff. The epidemiologists had been working on this problem for, I believe, about four weeks at that point — this is mid-March — and they weren't really sleeping much, because they were doing this for a whole bunch of states and running all of their models on a single, very large machine that they had physically at Johns Hopkins.

So what DJ and my friend Sam Shah and I were asked to help with was basically providing relief: pick up some of these model runs and run them on much bigger machines in the cloud, so the epidemiologists could take a nap every once in a while. Now, like I just said, I'm not an epidemiologist, but unfortunately I have to talk about epidemiology a little bit for this work to make sense. The model we were running is what's called an SEIR model — what epidemiologists call a compartmental model.

The idea is that you have a population of people, and each member of the population belongs to a compartment. There's the S compartment, the susceptible group: people who could potentially get the virus but haven't caught it yet. E is the exposed group: people who have been exposed to the virus in some way. I is the infectious group: people who are now capable of giving the virus to others. And R is the recovered group: people who have recovered from the virus and can no longer infect other people. Within the infectious group, you're going to have people who get sick enough to need the hospital, to need the ICU, to need a ventilator, and obviously you're going to have some people who die. The idea of these models is that there are parameters that describe the transition probabilities between these different compartments. What the epidemiologists do is estimate from the data what those parameters are, and then run the models forward in time to get a rough idea of how many people are going to be in the different compartments over time. Usually when you see these models, it's a big exponential curve, because that's what the underlying dynamics produce.

The reason this stuff is computationally intensive — the reason it requires software engineers — is that we don't really know what these parameters are. We have guesses based on the data, but the reality is that we run thousands and thousands and thousands of simulations, doing random draws from the parameter distributions, in order to project forward. So whenever you see one of these models up at the model hubs — like the one FiveThirtyEight hosts, which includes the Johns Hopkins University model, the one I worked on — there's always a range of possible outcomes. It's not simply a question of the mean; the mean is not actually the usual predictor here. The median is actually a much better estimate of what will happen, but the possibility of out-of-control growth in infections, because of the exponential nature of the distribution, is the thing that tends to push the means up. So it's important to know that we're always doing thousands and thousands of simulations to account for the uncertainty around the parameters of the model.

The other thing that was really important about this model in particular, at least for the government of California, is that it was a geospatially aware model. If you think about responding to a pandemic, it's not enough to know how many people are going to get sick in the state of California as a whole. What you really want to know is, at a county-by-county level, how many people are going to be sick and how many resources we are going to need, so you can get your ventilators, your hospital beds, whatever it is, to the places where people are most likely to get sick. So the compartmental models we were working with had a geospatial component: each county was a little node, and we had transition probabilities for people moving across counties, based on historical commuting patterns and things like that from census data. That was the other very cool aspect of this model.

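The actual pipeline code is open source on GitHub, but as a rough illustration of what a compartmental model plus parameter uncertainty looks like in practice, here is a minimal, hypothetical SEIR sketch in Python — the rates, ranges, and population sizes below are made-up placeholders, not the values the Hopkins team used:

```python
import numpy as np

def run_seir(beta, sigma, gamma, s0, e0, i0, r0, days):
    """Minimal discrete-time SEIR simulation.

    beta: transmission rate, sigma: E->I rate, gamma: I->R rate.
    Returns an array of (S, E, I, R) compartment sizes per day.
    """
    n = s0 + e0 + i0 + r0
    s, e, i, r = float(s0), float(e0), float(i0), float(r0)
    history = []
    for _ in range(days):
        new_exposed = beta * s * i / n   # S -> E
        new_infectious = sigma * e       # E -> I
        new_recovered = gamma * i        # I -> R
        s -= new_exposed
        e += new_exposed - new_infectious
        i += new_infectious - new_recovered
        r += new_recovered
        history.append((s, e, i, r))
    return np.array(history)

# Because the true parameters are uncertain, run many simulations with
# parameters drawn from (hypothetical) prior ranges and summarize the
# range of outcomes rather than reporting a single curve.
rng = np.random.default_rng(42)
peaks = []
for _ in range(1000):
    beta = rng.uniform(0.2, 0.6)     # assumed plausible ranges only
    sigma = rng.uniform(0.15, 0.35)
    gamma = rng.uniform(0.05, 0.2)
    traj = run_seir(beta, sigma, gamma,
                    s0=999_990, e0=0, i0=10, r0=0, days=180)
    peaks.append(traj[:, 2].max())   # peak size of the I compartment

print(f"median peak infectious: {np.median(peaks):,.0f}")
print(f"95% interval: {np.quantile(peaks, 0.025):,.0f}"
      f" - {np.quantile(peaks, 0.975):,.0f}")
```

This is why the median, not the mean, is the headline number: the distribution of peaks across draws is heavily right-skewed, and a few explosive parameter combinations drag the mean upward.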
And the final thing we — again, really the epidemiologists — had to do a lot of guesswork around is understanding the impact of non-pharmaceutical interventions: how are people going to respond to the pandemic? The funny thing about these planning scenarios — and the reason you can't really call them forecasts — is that showing the planning scenario to people changes their behavior. A planning scenario that shows exponential growth in cases is only going to come true if no one actually believes there's going to be exponential growth in cases. If I show you this thing and it's a huge explosion of cases, everyone will change their behavior in a way that means there won't be an exponential growth in cases.

The funny thing — or maybe good thing, I suppose, depending on your perspective — is that we only really get global pandemics about once a century. So a lot of our training data, our labeled data, about how people behave in response to pandemics is from 1918, from the great influenza pandemic. The data we have on how cities like Philadelphia, St. Louis, and Kansas City responded to the pandemic back then were actually very useful in informing our understanding of how people would respond to things now. And the data we're generating now, from the current COVID-19 pandemic, will be — I can only imagine — incredibly useful in predicting the course of the next pandemic we have at some point in the next century.

So those were all the factors to consider: the transition probabilities, the geospatial component of the spread of the virus, and the impact of these non-pharmaceutical interventions. It's a huge parameter space we were trying to understand and sample from when we ran the simulations to get an idea of what was likely to happen.

Okay — joining projects as a sort of nominally senior person. There's a great book, which I highly recommend, called The First 90 Days. It's a book about joining a new company or organization as a leader, and the things you need to do over your first 90 days to set yourself up for success.

In this context, I didn't have 90 days — we didn't have 90 days. We had about nine hours, basically one day more or less, to establish credibility with the technical team at Johns Hopkins and show that we could be useful. It's such a weird thing — one of the virtues of being a senior software engineer is that I have joined a lot of projects and had to come up to speed a bunch of different times, so the skill I've gotten good at, in some ways, is coming up to speed on a new project relatively quickly.

This was an especially strange situation, because we were brought in by the state's Department of Public Health. It's not as if the epidemiologists at Hopkins had asked to add us to their team; we were kind of dropped on them to help, with a promise that we wouldn't waste all of their time. So I felt a strong sense of needing to be clear early on about what I actually was capable of doing, what I knew how to do and what I did not know how to do, and what I thought was necessary to do — and also to be very clear about the resources I was capable of bringing to bear to help resolve the problem — all with a tremendous degree of humility. Again: not an epidemiologist, never run this stuff before, but I know a lot about running data pipelines, I know a lot about data- and compute-intensive workflows, this is what I know, and I'm happy to help with this kind of stuff. That was the immediate challenge of the project.

Now for what I guess is a bit of good news: the fact that it was even possible for us to join this project at all was made possible by a common set of tools. The SEIR modeling was done primarily in Python, using pandas and Numba for the very compute-intensive aspects of the simulation, and the hospitalization resource analysis was done in R, primarily using Hadley Wickham's tidyverse, for all intents and purposes. The code for doing all of this was available on GitHub; it was shared publicly — you can look it up right now, it's actually open source, you can go read everything they wrote. It really was kind of remarkable, and I feel very grateful for having these tools as a common lingua franca for government, for industry, for academia. We speak the same language, we share the same tools, and this allowed us to come up to speed and align very quickly on things that otherwise would have been much more challenging. If we had had to negotiate a software license to use the tooling the epidemiologists were using, or had to argue about tooling choices, all of these things would have really slowed us down.

We could not have gotten to work nearly as quickly as we did without that, and it was very, very helpful for us. Our immediate project in the first 48 hours was essentially a lift-and-shift operation. There was this machine at Johns Hopkins that was running all of these simulations. It was a development machine, very big and beefy, but there was no real clarity about what dependencies the Python code and the R code needed in order to run. The code used lots and lots of packages — lots of R packages, lots of Python modules — and none of that was documented. So our very first job was just figuring out what exactly we needed in order to run this code. That was most of our first day: figuring out the dependencies, getting everything off that machine, putting it into a Dockerfile, and then building a container image we could use repeatedly all over the place. We didn't bother changing any of the optimization in the SEIR code itself — that wasn't important. The whole point was getting very explicit about exactly what the code needed to run.

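As an illustration of that "freeze the dependencies into an image" step, a hypothetical Dockerfile might look like the following — the base image and package names here are stand-ins, not the project's actual image definition:

```dockerfile
# Hypothetical sketch: one image that can run both the Python scenario
# generator and the R hospitalization analysis on any machine.
FROM rocker/tidyverse:4.0.2

# Python side: pandas + Numba for the compute-intensive SEIR code
# (package list is illustrative, not the project's actual manifest).
RUN apt-get update && apt-get install -y python3 python3-pip && \
    pip3 install pandas numba pyyaml

# R side: whatever packages beyond the tidyverse the analysis needs.
RUN R -e "install.packages(c('data.table', 'jsonlite'))"

COPY . /app
WORKDIR /app
```

Once the dependency list is explicit and baked into an image, "run it on a bigger machine in the cloud" becomes a scheduling problem rather than an archaeology problem.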
And then we created a mechanism to run this thing on any sort of machine we needed. That was our very first effort; that was our entire focus. I do want to give a special note of thanks, briefly, for the two other things we were doing during this time to bring resources to bear on the problem. Obviously, step one in creating a new project is making a Slack channel for it, so we reached out to the folks at Slack, who helped set us up with basically a free professional account, so that we could use shared channels with the epidemiologists at Hopkins and the folks in California and all work together in one workspace. Simultaneously, we spent a lot of time calling friends at AWS to get more resources brought to bear on the problem: to get more vCPUs, to get more machines, to get account limits lifted. I'm incredibly grateful to the folks at AWS, who really, really came through for us when we were just a couple of randos basically saying, hey, we're helping out with this and we need some help. They were there for us, and I'm incredibly grateful to them.

Once the state — once the governor — made the decision on March 19th to lock down the state, things got a lot less hairy. We were at the point where we could run things in the cloud and consistently reproduce the models they had been running on the Hopkins machine, so the epidemiologists could rest a little; we could run stuff for them. That was great. Our next big challenge was really just the nuts-and-bolts, boring, standard, fundamental, but also kind of

great and magical work of profiling the code, finding the bottlenecks, fixing them, and moving hard-coded parameters into configuration files — really just making the code much easier to run, making it easier to reproduce, adding tooling to support reproducibility, all that kind of good stuff. The yeoman's work of being a software engineer, more or less, was how we spent most of April: slowly but surely shaving down the runtime so that we could run models in a matter of minutes instead of a matter of hours.

It was great for me. I'll just say — after leaving Slack, before working on this project, I really had not opened the computer to write code in about four months. I was super burned out and tired, and I loved this experience, because I got to rediscover how much I enjoy this process just for itself. I just like doing this stuff. I like making things go faster. I like fixing bugs. I just enjoy doing it — and apparently I would do it for free, because no one was, obviously, paying me to do this.

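The profile-find-fix loop described here is mechanical with Python's standard library; a minimal sketch (where `slow_pipeline` is just a stand-in for a real model run, not the pipeline's actual entry point):

```python
import cProfile
import pstats

def slow_pipeline():
    # Stand-in for a model run; in practice you would profile the
    # real entry point of the simulation code.
    total = 0.0
    for i in range(1, 200_000):
        total += i ** 0.5
    return total

# Profile one run, then print the functions where time actually goes,
# sorted by cumulative time -- those are the bottlenecks to fix first.
profiler = cProfile.Profile()
profiler.enable()
slow_pipeline()
profiler.disable()

stats = pstats.Stats(profiler)
stats.sort_stats("cumulative").print_stats(5)
```

Fix the top entry, re-run the profile, repeat — unglamorous, but it is how "hours" becomes "minutes".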
It was all volunteer stuff. So yeah — it's hard to talk about this stuff because it's so boring and common, but it's also just so important, and it fills me with so much joy to do. Anyway. One of the things Sam and I were doing was running a lot of these model runs basically by hand: we would spin up a whole bunch of machines, and each machine would execute a set of commands to kick off the Python code, kick off the R code, and stitch everything together.

I feel very strongly about reproducible computation, and so I went and found a tool some of y'all may know called DVC — Data Version Control — for creating reproducibility and clarity around the actual commands we were running: reproducible research, basically. We could show the epidemiologists, for any given run: this is the code we used, this is the data we fed into it, these are the exact commands we ran — all nicely version-controlled, all nicely checked into Git, and easy to reproduce. I'm really, really grateful to the folks at DVC for building such a nice tool that I could get started with quickly and use to do reproducible research spur-of-the-moment, on short notice. It was incredibly helpful.

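The DVC workflow being described — pinning the exact command, code, and data for each run — looks roughly like this; the stage, file, and script names are hypothetical, and this uses the `dvc run` syntax from DVC's 2020-era releases:

```shell
# Track the input data and initialize the repo for DVC.
dvc init
dvc add data/county_cases.csv

# Record the exact command plus its dependencies and outputs as a stage.
dvc run -n seir_scenarios \
    -d simulate.py -d data/county_cases.csv \
    -o outputs/scenarios.parquet \
    python simulate.py --config config.yml

# Later (or on another machine): re-run only what changed.
dvc repro

# The stage definition and data hashes are small text files
# that get committed alongside the code.
git add dvc.yaml dvc.lock data/county_cases.csv.dvc
git commit -m "Scenario run: pinned code + data + command"
```

The payoff is exactly what's described above: for any run, you can point at a commit and say "this code, this data, this command."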
While we were doing this work, we also had a team of solutions architects at AWS who were working on automating some of the manual work Sam and I had been doing by hand — running these models on machines — and they ran everything on AWS Batch. If y'all do not use AWS Batch, it's a really, really great tool; I highly recommend checking it out. It's an easy way to containerize and orchestrate massively parallel jobs across lots of monster machines, and it was incredibly helpful for us. One of the consequences of having it was that the epidemiologists got excited about the idea of running these models not just on a handful of machines, but on hundreds or thousands of machines. So they decided to try out a much more sophisticated model: instead of generating the parameters randomly and just running them forward every time, we would do a more computationally intensive process of generating a set of parameters, running the models forward, evaluating the quality of the model's predictions against the actual data we had on infections and ICU patients and all that kind of stuff, and then updating the parameters to come up with a new candidate model that would do a better job of fitting the data — using a variation on Metropolis-Hastings, a standard MCMC process.

To do this, we had to run much, much, much more computation — about a hundred times as much computation as before. To support this model in a cost-effective way on AWS, we had to take advantage of spot instances, which meant breaking up these very long-running compute jobs into chunks, or blocks, of iterations. You can think of it essentially like a gradient descent process, for all intents and purposes: you have a function you're trying to optimize, you're running a set of iterations, and you're trying to take positive steps forward with each step. But you're running these things on spot instances, which are unreliable and can't be counted on to stick around for the hours and hours of computation you need. So you break things up into little blocks, and you checkpoint the blocks as you go. If you need to run, say, 300 iterations to get convergence, you run 30 iterations, checkpoint, run another 30 iterations, checkpoint, and so on, so you can restart from where you left off if things get messed up.

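The checkpointed-blocks pattern can be sketched as follows — a toy Metropolis-Hastings sampler over a stand-in target density, with a hypothetical checkpoint file; the real pipeline scored SEIR parameters against observed infection and ICU data rather than this toy target:

```python
import json
import math
import os
import random

def log_target(x):
    # Toy target: a standard normal log-density (up to a constant).
    return -0.5 * x * x

def run_block(state, n_iters, step=0.5, seed=None):
    """Run one block of Metropolis-Hastings iterations from `state`."""
    rng = random.Random(seed)
    x = state["x"]
    samples = []
    for _ in range(n_iters):
        proposal = x + rng.gauss(0, step)
        log_alpha = log_target(proposal) - log_target(x)
        # Accept with probability min(1, exp(log_alpha)).
        if log_alpha >= 0 or rng.random() < math.exp(log_alpha):
            x = proposal
        samples.append(x)
    return {"x": x, "done": state["done"] + n_iters}, samples

CHECKPOINT = "chain_checkpoint.json"  # hypothetical path

def resume_or_start():
    if os.path.exists(CHECKPOINT):
        with open(CHECKPOINT) as f:
            return json.load(f)
    return {"x": 0.0, "done": 0}

# Run 300 iterations as blocks of 30, checkpointing after each block,
# so a preempted spot instance can pick up exactly where it left off.
state = resume_or_start()
all_samples = []
while state["done"] < 300:
    state, samples = run_block(state, 30, seed=state["done"])
    all_samples.extend(samples)
    with open(CHECKPOINT, "w") as f:
        json.dump(state, f)

print(f"completed {state['done']} iterations, last x = {state['x']:.3f}")
```

The key design point is that the checkpoint is the chain's entire state: any machine that can read the file can continue the run, which is what makes unreliable spot instances usable for hours-long computations.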
While we were doing this work, we had a hard time getting these models to converge at first. It was very confusing as to why, and it ended up being a very bad interaction between the checkpointing code that I wrote and DVC. Unfortunately, when we initialized DVC for a new step in one of these blocks, DVC would accidentally overwrite the data we had checkpointed with our new starting points. It took a couple of days to figure out, and I want to call it out as a mistake on my part: I prioritized the reproducibility of the runs over visibility into what they were doing. When I moved this thing from code running on a single node to a checkpointed, block-based process across hundreds of machines, I didn't actually have logging in place right away to verify that it was doing what I thought it should. So, yet another reminder from yet another experience: visibility is more important than reproducibility. Be sure you have your visibility infrastructure — your monitoring, your logging — in place before you start worrying about reproducibility quite as much as I did.

What did I learn from all this? First and foremost: when

the call comes, answer it. I loved doing this work. This was the best working experience of my life, an absolute honor and privilege, and if you ever have the chance to help out in this way, please take it — if only for selfish reasons. It's an amazing thing to do. It really is.

Second: sharpen your axe. All of the tooling, all of the things I knew how to do to help out with this — I learned that stuff doing things like documentation analytics, analyzing marketing spend, and profiling performance at Slack and at my other jobs. That's where I learned how to do this stuff, and it was an absolutely amazing feeling that the work we do could be so helpful to so many people in this kind of crisis. Sharpen your axe; hone your craft.

And finally, for the folks making tools for machine learning: make those axes easier to sharpen. The tools you create are incredibly useful for everyone — for scientists, for researchers, for everybody. Think of me, sleep-deprived, trying to get your ML monitoring tool up and running in a crisis situation, like I was in March, when you're thinking about the developer experience for your tooling. Make this stuff accessible, make it easy to use — the tools that do that are just the best.

And last but not least: thank you to these people. This was the team of folks I worked with from across the Department of Public Health, Johns Hopkins, AWS, and assorted folks from Silicon Valley. Thank you all so much. It was an absolute privilege to work with you. And with that, if we have any time for questions, I'm happy to take them. Otherwise, thank you so much. I really

appreciate it.

We have time for one question, from Carlos: how many simulations were you running?

Yeah, great question. We were running on the order of ten thousand different simulations, using a combination of assumptions around how people would respond and what the NPIs would be. We didn't have a good idea of how deadly the disease was back in March, so we had to make a bunch of guesses around how many people who got infected would be hospitalized, all that kind of stuff. So it was roughly 10,000 runs — 10,000 simulations — across, I think, something on the order of a hundred and fifty counties; we had to simulate about 150 geospatial units in California. So call it — if I do my math right — 1.5 million simulations.

Thanks, Josh — we're at time, so on to the next talk.

It was my pleasure. Thank you.
