MLconf Online 2020
November 6, 2020, Online
Learning with Limited Labels & Weak Supervision

About the talk

As deep learning-based models are deployed more widely in search & recommender systems, system designers often face the issue of gathering large amounts of well-annotated data to train such neural models. While most user-centric systems rely on interaction signals as implicit feedback to train models, such signals are often weak proxies of user satisfaction compared with, say, explicit judgments from users, which are prohibitively expensive to collect. In this paper, we consider the task of learning from limited labeled data, in which we aim to jointly leverage strong supervision data (e.g. explicit judgments) along with weak supervision data (e.g. implicit feedback or labels from a related task) to train neural models. We present data mixing strategies based on submodular subset selection and additionally propose adaptive optimization techniques that enable the model to differentiate between a strong-label data point and a weak supervision data point. Finally, we present two different case studies, (i) user satisfaction prediction in music recommendation and (ii) question-based video comprehension, and demonstrate that the proposed adaptive learning strategies are better at learning from limited labels. Our techniques and findings provide practitioners with ways of leveraging external labeled data.

About the speaker

Rishabh Mehrotra
Senior Research Scientist at Spotify

Rishabh Mehrotra is a Senior Research Scientist at Spotify Research in London. He obtained his PhD in the field of Machine Learning and Information Retrieval from University College London where he was partially supported by a Google Research Award. His PhD research focused on inference of search tasks from query logs and their applications. His current research focuses on multi-objective machine learning for marketplaces, bandit based recommendations, counterfactual analysis and experimentation. Some of his recent work has been published at top conferences including WWW, SIGIR, NAACL, CIKM, RecSys and WSDM. He has co-taught a number of tutorials at leading conferences (KDD, WWW & CIKM) & has taught various courses at summer schools.

Transcript

I will go ahead and introduce our first speaker: Rishabh Mehrotra, a research scientist at Spotify, is starting us off. Over to you.

Thanks. I do research at Spotify, and I miss the days when we used to have these conferences in person. I will talk for roughly 20-25 minutes. When we talk about these large-scale deep learning models, training data is the new currency: a lot depends on what data we collect and feed into the models, what we use to train them, and what kind of feedback we use to update them. Some of these signals are readily available; others are much less readily available but can be very strong.

That raises the question of how we learn from data we may not have much of. Often we are in a scenario where weak signals exist in the millions or billions while strong signals are scarce. In this talk we focus on two types of supervision signals. The strong signal is explicit judgments from users, for example asking them directly how they felt about a recommendation we gave them. This data is not available in large quantities, because collecting explicit feedback from users is very expensive, but it is far more reliable than anything we can infer from behaviour.

The weak supervision data is a second form of supervision and can come from a number of different sources: search systems rely on clicks and dwell time as satisfaction metrics, and recommender systems rely on engagement and streaming time. These are the signals most large-scale systems are trained on today, but they are not very strong indicators of user happiness.

What types of weak supervision are we talking about? It could be implicit feedback, labels transferred from a related task, or noisy labels produced by heuristics or simple classifiers. Increasingly we have a lot of tooling and machinery available to collect more and more forms of weak supervision signals.

I can also use explicit judgments as training signals, but they are very, very rare. If I want to train a large-scale end-to-end encoder-decoder architecture to predict user response, I cannot train a network with millions of parameters on a few thousand labels. So the question is: how do we combine the two together? We worked on developing data mixing strategies and adaptive optimization strategies to better leverage these signals.

Why does it matter how we mix the data? If you look at the weak examples, some of them (the red points in the figure) push the gradient in the wrong direction, and that messes up training. So I want to be selective about the data I use to train my models. I can define a similarity function between a weak example and the ground-truth examples I have, and use it to decide which of these weak supervision data points I should incorporate alongside my strong examples.

For instance, I can look at how representative a weak example is and how close it is to the strong data, and intelligently mix the strong data with a set of representative weak examples. This can be treated as a submodular subset selection problem. Submodularity captures diminishing returns, a bit like friends: if you do not have any friends, the first friend is hugely valuable to you, but if you already have ten thousand friends on Facebook, the next one adds very little. Oftentimes these selection problems can be cast as submodular subset selection problems.

In terms of the specifics, the submodular function we use is a combination of informativeness and representativeness: I want to pick examples that are informative for my main task, and examples that cover the different types of training data I might have. These are ways of getting more control over the data mixing process.
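As a rough illustration of this kind of objective, here is a minimal greedy-selection sketch in Python; it is not taken from the talk or the paper. It assumes the weak and strong examples are available as L2-normalised embedding matrices, scores informativeness as similarity to the nearest strong example, scores representativeness with a facility-location coverage term, and the alpha weighting between the two is an arbitrary placeholder.

```python
import numpy as np

def greedy_weak_data_selection(weak_emb, strong_emb, budget, alpha=0.5):
    """Greedily pick `budget` weak examples to mix with the strong set.

    The marginal gain combines:
      - informativeness: similarity of a candidate to its closest strong example
      - representativeness: how much the candidate improves coverage of the
        remaining weak pool (diminishing returns as coverage saturates)
    Rows of weak_emb / strong_emb are assumed to be L2-normalised embeddings.
    """
    informativeness = (weak_emb @ strong_emb.T).max(axis=1)   # closest strong match
    sim_weak = weak_emb @ weak_emb.T                          # pairwise weak similarity
    coverage = np.zeros(len(weak_emb))                        # current pool coverage
    selected = []

    for _ in range(budget):
        # Facility-location style marginal gain for representativeness,
        # normalised by pool size so both terms live on a comparable scale.
        gain_repr = (np.maximum(sim_weak, coverage).sum(axis=1)
                     - coverage.sum()) / len(weak_emb)
        gain = alpha * informativeness + (1 - alpha) * gain_repr
        gain[selected] = -np.inf                              # never re-pick a point
        best = int(np.argmax(gain))
        selected.append(best)
        coverage = np.maximum(coverage, sim_weak[best])
    return selected
```

The diminishing-returns behaviour described above comes from the coverage term: once a region of the weak pool is well covered, further points from that region add almost no marginal gain.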

Beyond selecting which data to mix in, you may also want more control in the training process itself. One way to do that is differential weighted learning, which is used widely across industry and academia. If a data point comes from strong data, I trust it: I have high confidence in its label, so I let it modify my parameters fully. If the data is not that trustworthy, I diminish the changes it makes to the parameters. In effect, you weight each update by the confidence you have in that specific instance, with the notion that data coming from strong supervision is very rare but very, very trustworthy, so I let it update the parameters as it pleases.
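One straightforward way to realise this kind of differential weighting, sketched here rather than taken from the paper, is to scale each example's loss by a per-instance confidence before backpropagation; the strong_conf and weak_conf values are arbitrary placeholders.

```python
import torch
import torch.nn.functional as F

def confidence_weighted_loss(logits, labels, is_strong,
                             strong_conf=1.0, weak_conf=0.3):
    """Scale each example's loss by how much we trust its label.

    is_strong is a boolean tensor marking examples whose labels come from
    explicit (strong) supervision. Weak examples still contribute, but the
    gradient they induce is diminished by `weak_conf`.
    """
    per_example = F.cross_entropy(logits, labels, reduction="none")
    weights = torch.where(is_strong,
                          torch.full_like(per_example, strong_conf),
                          torch.full_like(per_example, weak_conf))
    return (weights * per_example).mean()
```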

There is a nice and well-known ICLR paper from Google on fidelity-weighted learning along these lines. Here I want to focus more on how we change the momentum of adaptive optimizers. Momentum is basically a fraction of the previous update carried forward: if new gradients point in the same direction as before, you take larger steps towards the minimum, which smooths out the variations in your parameter updates.

Adaptive gradient methods in the Adam style are a very active topic at recent conferences. The idea is: if an instance comes from strong supervision, I trust it more; if it comes from weak supervision, I trust it less. We know that Adam maintains exponential moving averages of the first and second moments of the gradients from previous instances. If the incoming instance is from weak supervision, I do not want it to leave too strong an imprint on that momentum, so we dampen its contribution.

How do you dampen it? When you compute the moments, you apply a label weight that captures how much trust you have in the instance. This gives you a momentum-dampened version of Adam: it is still Adam, but now you control how much an instance can move the momentum based on how much you trust its label. If it comes from weak supervision, I dampen it; if it comes from strong supervision data, I trust it more and let it do what it wants.
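The snippet below is a rough sketch of that idea rather than the exact update rule from the paper: a plain-Python Adam variant in which an instance's contribution to the running moment estimates is gated by a trust score, so weak-supervision instances leave a smaller imprint on the momentum.

```python
import numpy as np

class TrustDampenedAdam:
    """Adam-style update where an instance's contribution to the running
    moments is scaled by how much we trust its labels (trust in [0, 1]).

    trust = 1.0 recovers standard Adam; smaller trust values (for weak
    supervision) dampen how much the gradient moves the momentum. This
    gating is an illustrative assumption, not the rule from the paper.
    """

    def __init__(self, dim, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
        self.lr, self.beta1, self.beta2, self.eps = lr, beta1, beta2, eps
        self.m = np.zeros(dim)   # first-moment estimate
        self.v = np.zeros(dim)   # second-moment estimate
        self.t = 0

    def step(self, params, grad, trust=1.0):
        self.t += 1
        # Effective decay rates: with low trust, the old moments dominate
        # and the new (weak) gradient barely changes them.
        b1 = 1.0 - trust * (1.0 - self.beta1)
        b2 = 1.0 - trust * (1.0 - self.beta2)
        self.m = b1 * self.m + (1.0 - b1) * grad
        self.v = b2 * self.v + (1.0 - b2) * grad ** 2
        m_hat = self.m / (1.0 - self.beta1 ** self.t)
        v_hat = self.v / (1.0 - self.beta2 ** self.t)
        return params - self.lr * m_hat / (np.sqrt(v_hat) + self.eps)
```

A training loop would call `step` with `trust=1.0` for strong-supervision batches and a smaller value for weak-supervision batches.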

To summarize the setup: I have a large amount of weak supervision data, a small amount of strong supervision data, and a few data mixing and adaptive optimization strategies. We evaluate these in two case studies. In the first case study we look at the Spotify homepage and predict user satisfaction with the recommendations shown there. The strong supervision signal is explicit user feedback: when a recommendation shows up on your homepage, we ask you directly whether you are happy with it or not, so it tells us explicitly how the user feels. The weak supervision comes from implicit signals: how much time did you spend, how many songs did you play? These are weak proxies because they are not very strong at representing the user's actual happiness. So I have two forms of supervision, and I can apply the techniques we just talked about to train my model.

The second case study is video question answering: you have a video and you want to find the answer to a specific question. Again, you cannot have judges watch hundreds of thousands of videos, so you collect a small amount of ground-truth data from Amazon Mechanical Turk and rely on a much larger amount of weak supervision. We then compare training only on the weak data, only on the ground truth, and on combinations of the two.

What we see is that using only the weak supervision data is workable for video Q&A but starts to hurt for user satisfaction prediction. In general, adding the ground truth, which is the strong data set, is useful in both cases, but if you combine the two without any care you may do no better than just using the ground truth alone. If you are careful in how you select and subsample the weak supervision data you merge with the strong data, it helps improve performance, and the momentum-dampened versions of the adaptive optimization techniques give a further increase in performance.

A couple of other minor points. How much weak supervision data do you want, and how much can you use? You may not need to consider all of the billions of weak examples available; at some point you start seeing diminishing returns. In the paper we also briefly discuss how this depends on the domain and the problem setting: in some settings you might need large-scale weak data, while in others a smaller amount is enough. User satisfaction prediction and video question answering are two completely different tasks and domains, and we see that combining weak supervision with strong data helps in both, and that these methods of subset selection and adaptive optimization are useful across both tasks.

To wrap up, a few practical recommendations. If the gains you need are not that large, perhaps you do not need the momentum-dampening techniques; if you need to squeeze out more and more wins, you can swap in the dampened optimizer and get the additional gains. Subset selection is simple and quite helpful in increasing performance, and the momentum-dampened optimizer works well when you are careful about how much influence you allow your weak supervision data to have.

Let me pick up some of the questions. One question: what if all I have is a handful of strong supervision examples? That is exactly the setting we assume, limited ground truth. In that case we should rather look at whether we can generate more weak labels from other sources, for example from biased or simple classifiers, or by training a model to produce labels for cases where we have no weak signal at all.

Another question: does it make sense to limit the amount of weak data used relative to the strong data? Yes. We did find evidence in some of the experiments that too much weak data unnecessarily pollutes training with unrelated examples. It is a bit like early stopping: you do not want to keep learning aspects that are not useful to your model. So we do start to see a plateauing effect from the weak supervision data, and sometimes, if you trust it too much, you end up using unhelpful examples.

