About the talk
MLflow is an open-source platform to streamline machine learning development, including tracking experiments, packaging code into reproducible runs, and sharing and deploying models. MLflow offers a set of lightweight APIs that can be used with any existing machine learning application or library (TensorFlow, PyTorch, XGBoost, etc), wherever you currently run ML code (e.g. in notebooks, standalone applications or the cloud). In this talk, we'll discuss MLflow's components and run through a quick demo.
Ben Sadeghi is a Partner Solutions Architect at Databricks, covering Asia Pacific and Japan, focusing on Microsoft and its partner ecosystem. Having spent several years with Microsoft as a Big Data & Advanced Analytics Technology Specialist, he has helped various companies and partners implement cloud-based, data-driven, machine learning solutions on the Azure platform. Prior to Databricks and Microsoft, Ben was engaged as a data scientist with Hadoop/Spark distributor MapR Technologies (APAC), developed internal and external data products at Wego, a travel meta-search site, and worked in the Internet of Things domain at Jawbone, where he implemented analytics and predictive applications for the UP Band physical activity monitor. Before moving to the private sector, Ben contributed to several NASA and JAXA space missions. Ben is an active member of the open-source Julia language community. He holds an M.Sc. in computational physics, with an astrophysics emphasis.
I am cooperating with you guys. Our next speaker is Ben Sadeghi. He is also a solutions architect, and he is now working at Databricks. He'll talk about the life cycle of ML. Thank you, Dan. Hello everyone. Glad to be here, so I'll get straight to it. This is MLflow. It's basically a machine learning lifecycle management platform. I'll talk about why it exists, why people like you have invested so much time building it: primarily because ML work, especially pipeline development,
is complex, right? Those of you who are practitioners would know that it's a repetitive process. It really is a cycle, right? You have the data preparation phase, followed by model training and model evaluation, then on to deployment, and you're constantly capturing these metrics and seeing whether your model is drifting or not. And as you have new raw data coming in, you're going back and running through your whole training and deployment cycle again.
There are dozens of open source tools out there for each of these various phases, right? Primarily, we're talking about the R and Python ecosystems. I'll talk about Spark a bit more in a second. But as a whole, each of these steps is itself iterative, right? You go through data preparation, you loop in there with new features, right? Same goes for model training: you're going to iterate on that piece as well. So yeah, a lot of parameters to be tracked, and that in itself can be a challenge.
Then you need to sort of scale, and not just to many servers but to large groups, right? I'm talking about bringing in siloed teams. It's rarely one person running this flow by themselves. For data preparation you might have data engineers involved; for the model training it will be data scientists; for the deployments and the collection of the new raw data, that might be the DevOps folks, right? So this life cycle needs to scale out to all these various teams. You want to impose some sort of governance.
Ideally, right? This is rarely done, but MLflow will assist in changing all this. And there is also model exchange: the ability to, say, train a model with one open source framework yet deploy it using another one, right? Training in TensorFlow, deploying in PyTorch, for example. So that's the problem to address, and MLflow is here to the rescue. So what is it? It's an open source project. It was started by Databricks and open sourced in June of 2018. It's a set
of conventions, specifications, tools, libraries and community. Currently all development is on GitHub, and yeah, a lot of different folks are involved. It's already been integrated into three or four commercial software products. So, quick design philosophy. API first: everything is an API. It's meant to be really easy to set up programmatically, right? It is about automating this life cycle, right? So everything needs to be doable programmatically if needed. It's modular; I'll talk about its various pieces in a second. But again, you can
take what you need and discard the rest. It's easy to use; I'll demo that in a second. And yeah, right now it's actually available from within pip and within conda. Good, good, good. Oh, by the way, there are APIs for Java, Python and R. And yeah, this is open source because it's a big problem. So we're really trying to get as many contributors as possible to assist, given their own problems, right? We want to have this tool address everyone's needs, and hence we need more input, more contributions from
others. The actual components: there are four. One is brand new and I'm not going to talk about it too much; that's the registry. But the major ones are Tracking, Projects and Models. Tracking is about tracking everything: the code used for the data preparation, the code used for the modeling, all the parameters used within the machine learning algorithms, the corresponding model performance results, any sort of environment configuration, and whatnot. All of that is tracked. Then you have the Projects component, which basically bundles all of that, all those artifacts,
into a package, which can then be redeployed anywhere, such that you can go ahead and reproduce the exact same results. So Projects aims for reproducibility. For Models, you basically have two open source components integrated with MLflow, namely MLeap and ONNX, which are basically converters from one machine learning framework to another. So again, using MLeap or ONNX, you can go train in, say, TensorFlow, but then deploy that model, converted
to PyTorch, deployed as a PyTorch model. So, Tracking: a few concepts. You're tracking parameters, which are key-value pairs, and metrics, the actual performance metrics for the models. Artifacts could be just any generic files: if you've generated some image for the results of a model, you can bundle that in, and of course the source code. Projects themselves: again, this is a bundling of the code, the configuration and the data sets such that you can readily
rerun and recreate the entire environment and reproduce the same result, whether it's done remotely or on your local setup. So here's an example of what a project would contain. You always have some sort of YAML config file, your main modeling script and all that. You can actually just do an mlflow run; it will find main, go get what it needs from the YAML and set up your environment for you. And yeah, you're up and running with that entire workflow reproduced. As I mentioned, with
Models you have the integration with MLeap and ONNX, the ability to convert the model. You can have it set up such that you have the native model, say in this example TensorFlow, right? That's saved as is. And then you can have a converted one, say to the generic Python function, which can then be run by any Python environment, say in Docker or in Spark. Deployment environments: basically anything you can imagine. So Java should be included, and R, my apologies. But we often have deployments done on
Docker containers. I'll go demonstrate this in a second; we'll do some batch deployment using Spark. And there are even some other cloud services out there, namely Microsoft Azure's Machine Learning service and AWS SageMaker. Okay. So, yeah: it's a lightweight, open platform. It integrates well with existing frameworks. And it has its own server, by the way, running in the background, keeping track of all this logging activity. And within Databricks, you have a managed version of MLflow available. Let me see if I'm
okay to jump in. How am I for time? Right, close. Demo time. Sure. So here we are. This is Azure Databricks, a managed Apache Spark environment available on Microsoft's cloud, Azure, and I'm on the Azure side right now. I have a little Spark cluster going. By little, I mean minimal: it has one worker. Yay. Let's take a look at libraries. I do have two libraries installed: mlflow, which I pulled from PyPI, and Koalas, which I'll talk about tomorrow if you guys are around. That's basically a pandas API for Apache Spark. So
this library, MLflow, is installed on this Spark cluster. And I'm going to jump into a notebook. It's a Databricks notebook. If you're familiar with Jupyter or Zeppelin, it should be very much the same thing: an HTML-based interface. Those of you who are practitioners, or who have studied machine learning a bit, are probably familiar with this data set, the Iris data set. Some call it vintage, because it's from the thirties, and I really like it because it's straightforward. So let's go and connect to that cluster. And this is machine learning.
I keep track of all these experiments, right? Let's go ahead and load this Iris data set. So I'm going to read it into Spark, with a little bit of renaming, and ultimately just display the first 10 results. The first execution on the cluster takes a while. Give it a second, and we can get started with putting together our pipeline, basically getting the data prepped for machine learning work. I hope it's not the internet here. Okay,
so as soon as that goes through; in the meantime we'll continue on to the next stage. Once we have the data set in memory, we can do a couple of things. First off, you'll see that one of the fields there, the label, species, is a string, and we have to address that, because what we'll be using later on for the machine learning work, the Spark machine learning library, demands that all data be in numerical format. So we'll go ahead and convert that, that is, map those strings
to integers. Okay, it's going. That's done using the StringIndexer. And there's one other step that needs to be done, the VectorAssembler: basically taking all the features that we see and bunching them up into one vector. And that's basically it; we'll have our data prepped. There we go. For those of you familiar with the Iris data set, this should be a very familiar sight: basically you have four fields, the width and length fields of the sepals and petals.
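The two preparation steps described here, mapping species strings to numeric labels and bunching the feature columns into one vector, can be sketched in plain Python. This is not Spark's StringIndexer or VectorAssembler, only an illustration of the idea. The column names and the helper names `fit_label_index` and `assemble_features` are assumptions, and ordering labels by descending frequency with alphabetical tie-breaking is a choice made for this sketch.

```python
from collections import Counter

# Assumed column names; the talk only says the columns were renamed.
FEATURE_COLUMNS = ["sepal_length", "sepal_width", "petal_length", "petal_width"]


def fit_label_index(values):
    """Map each distinct string to a numeric label (most frequent first, ties alphabetical)."""
    counts = Counter(values)
    ordered = sorted(counts, key=lambda value: (-counts[value], value))
    return {value: float(index) for index, value in enumerate(ordered)}


def assemble_features(row, columns=FEATURE_COLUMNS):
    """Collect the chosen columns of one row into a single feature vector."""
    return [row[column] for column in columns]


# Three illustrative rows, one per species.
rows = [
    {"sepal_length": 5.1, "sepal_width": 3.5, "petal_length": 1.4,
     "petal_width": 0.2, "species": "setosa"},
    {"sepal_length": 7.0, "sepal_width": 3.2, "petal_length": 4.7,
     "petal_width": 1.4, "species": "versicolor"},
    {"sepal_length": 6.3, "sepal_width": 3.3, "petal_length": 6.0,
     "petal_width": 2.5, "species": "virginica"},
]

label_index = fit_label_index(row["species"] for row in rows)

# Each prepared row carries the two columns a classifier needs: label and features.
prepared = [
    {"label": label_index[row["species"]], "features": assemble_features(row)}
    for row in rows
]
```

The result has the same shape the talk describes: one numeric label column and one features vector per row.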
So if I'm not mistaken, the big ones are petals and the small ones are sepals. And back in the 30s, a botanist/statistician went ahead and made these length and width measurements on these petals and sepals so he could identify the species of these flowers, and he went ahead and constructed this data set. So it has three species: setosa, virginica and versicolor. So we're going to build a model, basically an expert system, which is fed these lengths and widths
and with that it will just predict the species of the flower here. And that's what we're doing. As I mentioned, we're going to do this mapping of our species to an integer; that's the label. So these species have now been mapped to a label, and our original features have been vectorized into that vector. Okay. And it's these label and features columns that we're going to feed into our machine learning algorithm. So, a very quick run-through of the
process. I just want to talk about the act of splitting your data set into training and test sets. That's basically to make sure you have a sensible way of evaluating model performance. So typically you hand over the majority of your data set for training purposes, and what remains, what's held out, is used for testing. In Spark there's a very simple function for that, called randomSplit. In this case, I'm going to pass two-thirds of the data set over for training.
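The split described here, with the rest of the rows held out for testing, can be sketched in plain Python. This is an illustration of a weighted random split rather than Spark's own function; `random_split` is a hypothetical helper, and the seed is an arbitrary choice so that the result is repeatable.

```python
import random


def random_split(rows, weights, seed=None):
    """Assign each row independently to one part, with probability proportional to its weight."""
    rng = random.Random(seed)
    total = float(sum(weights))
    parts = [[] for _ in weights]
    for row in rows:
        draw = rng.random() * total
        cumulative = 0.0
        for index, weight in enumerate(weights):
            cumulative += weight
            if draw < cumulative:
                parts[index].append(row)
                break
        else:
            # Guard against floating-point rounding at the upper edge.
            parts[-1].append(row)
    return parts


rows = list(range(150))  # stand-ins for the 150 rows of the Iris data set
train, test = random_split(rows, [2, 1], seed=42)
```

Because every row is assigned independently, the parts come out at roughly two-thirds and one-third rather than at exactly 100 and 50 rows.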
And the remaining third we will keep for testing. Good, good, good. Here comes the MLflow bit. Okay, I'm just going to pull that in: I'm going to import mlflow along with an extension from it for the Spark pieces. While we're there, I will pull in a couple of things from Spark's machine learning library, namely a decision tree classifier and also a multi-class evaluator. Okay, I'm building this little helper function here called train and evaluate, and it's going to take in two model parameters for the decision tree: max bins and max depth.
Okay, so here we go. I'll start off with an MLflow start run. Everything following this, within this indentation, is going to be logged by MLflow. Okay, so I'm going to construct this decision tree classifier. I've passed it these max bins and max depth parameters that were fed into my helper function. Correct. I'm going to then train my model, fed, that is, the training data frame, and out will pop this decision tree classifier model. Do a little print. Okay?
And straight afterwards, we're going to use the same model to make predictions on the test set. That's the transform function here; you can think of it as predict, as you would in, say, scikit-learn. Predictions, that is. What are we predicting again? The species of the flower, given the length and width of the sepals and petals, correct? The model makes predictions, but we need a way to basically gauge its performance, right? I'm going to use two separate metrics, one being accuracy, the other being F1 score. So that's basically comparing
the actual species versus the predicted ones; different techniques for measuring how well a model is performing. And then comes our logging. Okay, MLflow: I'm going to log parameters. As I said, it's just a key-value pair. So I'm going to log those model parameters, max bins and max depth. I'm also going to log the two metrics that I'm generating, accuracy and F1 score. Note that I could have logged any sort of other artifact that I would like: code, any images I've
generated, whatnot; it doesn't matter. You can log all that. I'm also logging the actual model itself. In this case, it's this pipeline model, and I'm just giving it a name. Okay, so that's my little helper function. And by the way, it just returns the model at the very end, okay? Now let's go ahead and put this guy to use. So here we go. I'm going to go and use train and evaluate. I'll pass it a max bins value of 15 and a max depth, and have it build a model for me. Okay.
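The metric and logging steps of the helper described above can be sketched as follows. This is a hedged illustration, not the speaker's code: the Spark training step and the model logging are left out, the labels are toy values, and `accuracy`, `weighted_f1` and `log_run` are hypothetical names. The F1 shown is the support-weighted average of per-class F1, which is one common multiclass variant. `mlflow.start_run`, `mlflow.log_params` and `mlflow.log_metrics` are MLflow tracking calls.

```python
from collections import Counter


def accuracy(actual, predicted):
    """Fraction of predictions that match the actual labels."""
    correct = sum(1 for a, p in zip(actual, predicted) if a == p)
    return correct / len(actual)


def weighted_f1(actual, predicted):
    """Per-class F1, averaged with weights proportional to each class's support."""
    support = Counter(actual)
    total = 0.0
    for label, count in support.items():
        true_pos = sum(1 for a, p in zip(actual, predicted) if a == label and p == label)
        false_pos = sum(1 for a, p in zip(actual, predicted) if a != label and p == label)
        false_neg = count - true_pos
        denominator = 2 * true_pos + false_pos + false_neg
        f1 = 2 * true_pos / denominator if denominator else 0.0
        total += f1 * count
    return total / len(actual)


def log_run(params, metrics):
    """Record one run's parameters and metrics with MLflow."""
    import mlflow  # imported here so the metric helpers above work without MLflow installed

    with mlflow.start_run():
        mlflow.log_params(params)    # key-value pairs such as max_bins and max_depth
        mlflow.log_metrics(metrics)  # numeric scores such as accuracy and f1


# Toy labels for illustration only; these are not results from the talk.
actual = [0, 0, 1, 1, 2, 2]
predicted = [0, 0, 1, 2, 2, 2]
scores = {"accuracy": accuracy(actual, predicted), "f1": weighted_f1(actual, predicted)}

# With MLflow installed, one run could then be recorded like this:
# log_run({"max_bins": 15, "max_depth": 3}, scores)
```

Everything logged inside the `with` block belongs to the same run, which is what lets the runs be compared side by side later.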
Good, good, good. There we go: accuracy is 0.90, F1 score is 0.911. So, a pretty good model. Can we do better? I'm just going to go ahead with a max depth of 3, just alter that one value, that one model parameter. Yeah, much better: 98%. Okay, I can continue this. Here I've altered the max bins; looks like that didn't make too much of a difference there. But you know what? I've already forgotten the first few scores. But hey, that's okay, because that's what MLflow is doing in the background, tracking all that activity. So here are two things. Within Databricks, you have a little
sidebar for MLflow that actually keeps track of the three runs I just ran. So actually the last two were pretty similar. But better yet, there's a whole UI. Give it a second. Again, this you would run yourself on the CLI: it would be mlflow, space, ui, and you get this environment up and running. And here you can actually come in and compare results, right? If I want to, say, sort these guys by accuracy, I
could do that. Okay, good so far. But I want to go a little crazy. So far I've just done very simple changes of one parameter. Typically, in real machine learning work, you might have dozens of parameters that you want to search through to find the best model available, right? So you wind up having a multi-dimensional parameter space that you'd like to explore. In this case, I'm just going to go with a two-dimensional one. So let me
close that. Okay, this is something silly. So there's a right way of doing this and a wrong way. I'm going to do it the wrong way for the sake of simplicity: I'm going to do a brute-force search. So we'll do a couple of for loops, which I'll type out. For max bins in the range of, say, 5 to 16, every two. For max depth in the range of 2 to 5. Let's go ahead and run our helper function. Okay, so I'm actually running through, how many is that? That's going to be five by five, 25 runs here, roughly. So it's
tracking away. I don't have to pay too much attention here because, again, it's all being tracked. We'll use the whole user interface for exploring the results in just a second. So give it a second; it's a pretty aggressive sweep. The right way of doing this is to do cross-validation while doing parameter tuning using Spark ML. MLflow would capture all that activity as well, within your cross-validation runs. Look at that, man. Might have a champion right there.
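The brute-force sweep amounts to calling the helper once for every combination of the two parameters. A minimal sketch, assuming the ranges translate to Python's `range(5, 16, 2)` and `range(2, 5)`; `sweep` and `placeholder_score` are hypothetical names, and the placeholder returns an arbitrary number in place of training a model.

```python
from itertools import product


def sweep(train_and_evaluate, max_bins_values, max_depth_values):
    """Call the helper once per (max_bins, max_depth) combination and collect the scores."""
    results = []
    for max_bins, max_depth in product(max_bins_values, max_depth_values):
        score = train_and_evaluate(max_bins, max_depth)
        results.append({"max_bins": max_bins, "max_depth": max_depth, "score": score})
    return results


def placeholder_score(max_bins, max_depth):
    """Stand-in for real training: returns an arbitrary number, not a model metric."""
    return round(1.0 / (1 + abs(max_bins - 9) + abs(max_depth - 3)), 3)


results = sweep(placeholder_score, range(5, 16, 2), range(2, 5))
best = max(results, key=lambda run: run["score"])
```

Python's `range` excludes its upper bound, so under these assumed ranges the sweep covers six values of max bins and three of max depth. In the demo, each call would also record a tracked run, so the full grid ends up in the tracking UI without any extra bookkeeping.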
Okay, that's done. Good, good, good. Let me jump back into my UI and do a refresh real quick. The range was as I gave it: 5 to 16, in increments of two. So now we have a bunch of runs, right? Look at that. Nice. I want to compare all these guys together now. Okay, let's compare. So you can actually go in and, say, look at just one dimension at a time, say max depth versus accuracy. That's fine. We see that a max depth of four is faring better. But we were actually searching a two-dimensional space, so it makes more
sense to actually go ahead with a contour plot. Okay, so this is max bins and max depth against F1 score. Okay. So where are we getting high scores? In some regions we're doing pretty well. This is max bins on the x-axis. So we're trying to get to the lighter shade, right? The lighter shade meaning an accuracy of one, so lighter is better. So it looks like these pockets are actually doing pretty well. For some reason, this whole bit around max bins of 13 isn't faring well. Okay, good to know. We'll avoid that when I do a final
model run. But yeah, we've identified certain corners within our multi-dimensional parameter space that are good for this specific task: that's going to be a max depth of four, and max bins, anything above, say, 13. Good. Okay, that's a little demo. So with or without the managed version within Databricks, you still get all this, the MLflow UI. You'd just have it running on localhost; I think that's port 5000. And yeah, that's it. Thank you very much. And by the way, you can find me on
LinkedIn, GitHub and Twitter. I have posted all the slides and the demo code on GitHub. So if you go to my GitHub, there's a FOSSASIA 2020 demos repo; the slides and the code are all there. Thank you very much. Feel free to.