MLconf Online 2020
November 6, 2020, Online
Generating Adversarial Examples & Defense Methods For Online Fraud Detection

About the talk

With the temporally evolving and diverse nature of fraud patterns in online payment systems, optimizing explicitly for recognition and defense against such attacks has assumed greater precedence. This talk introduces a cohort of domain-oriented adversarial attacks based on poisoning and evasion techniques. We further demonstrate the effectiveness of such cohorts in surpassing the attack capabilities of isolated attack frameworks. To proactively neutralize ensemble-oriented attacks, defensive procedures are demonstrated. Lastly, key challenges and applications in effective deployment of such models in large-scale fraud detection systems to identify temporally evolving attacks will be broadly addressed.

About speaker

Nitin Sharma
Senior Research Scientist at PayPal Risk Sciences

Nitin is a Senior Research Scientist at the AI research group in PayPal Risk Sciences, where he focuses on end-to-end design and development of AI algorithms, particularly deep learning, for large-scale real-time payments fraud detection. His research involves the next generation of fraud detection capabilities by designing novel fraud problem formulations, utilizing the exhaustive PayPal data assets so as to improve fraud detection accuracy while continuing to enhance the experience of good users. Prior to his current role, he built large-scale machine learning frameworks for stolen identity and stolen financial instruments fraud detection at PayPal, with several years of research & teaching experience in machine learning and mathematical optimization.


Basically, this presentation is going to be about generating ensemble-based adversarial examples. In the process, we will try to find out what is so peculiar about the fraud detection landscape, what makes it different and interesting from the point of view of the adversarial attack problem, and why. We will also ask what kinds of ideas fraudsters would potentially use, and whether we can simulate those ideas inside a lab environment, stay ahead of them in that sense, and simulate those

attacks. In this process we will learn how to generate realistic attacks that expose vulnerabilities in existing fraud models. We also provide a limited discussion of defense methods: what would you really do to prevent or circumvent those kinds of attacks, especially in the context of temporal deterioration, which happens to fraud detection models over time? You might see the model deteriorate, and if so, what kind of defense methods can be used in that case? So we can review

the agenda. I will provide a little bit of information about the problem background and motivation, quickly go through the adversarial learning core concepts and formulation, and then focus more specifically on the techniques that have been used for this particular problem. The methods used in the ensemble are gradient-based methods, optimization-based methods, and generative methods. We then use the ensemble-based framework as a seed and try to develop defense methods to circumvent attacks that

are pretty effectively carried out by the ensemble, and look at issues related to temporal deterioration. So, very quickly, let's look at the problem of fraud detection, which is also the reason why an ensemble is applied. One aspect is that it is represented by a heterogeneous, large-volume payment ecosystem. That means you have people coming onto the platform and trying to do a wide range of activities, right? They're trying to buy and sell things. They're trying

to just send money or conduct cross-border transactions. So you see a lot of heterogeneity in the type of transactions that people are doing, and it becomes harder to optimize in a context where you generate adversarial examples, in the sense that you are trying to catch as much fraud as possible. You're also trying to approve as many good users as possible, as quickly as possible. You want to balance robustness with performance. You want your model to be as robust as possible in terms of timeline.

How do you catch a certain percentage of fraud volume? You look at locally optimal solutions. So you have a model that is going to solve the problem all over the world, and then you have strategies in place to increase the model's resolution on some segments. So in this case there is also this model and strategy interplay that actually makes it harder for us. It's a balance problem. And lastly, there is domain-centric knowledge and strategy, as well as

very complex segmentation. So, the reason why this is heterogeneous, as I was mentioning, is who is transacting on PayPal systems: it could be a personal account, and it depends on who is actually interacting with the ecosystem. People are also using PayPal for transactions across different countries, where different financial regulations apply, so you have to look at optimization across different kinds of segments. That also makes adversarial example generation a challenging

problem. We are looking at close to 305 million active accounts as of 2019, and that has picked up and is much higher now, post-COVID, with most people moving from in-person exchange to online exchange. There are also 12.4 billion transactions, multiple currencies, multiple funding instruments, and many types of fraud patterns. If you look at it from the point of view of classification, you basically have a multimodal distribution over x, and this is heterogeneous because it is essentially a mixture of multiple probability

distributions, right now. Probably this weekend, but at the same time you also see that. The weather tagging is conducted in fraud detection. Two men also exposes a wide range of product demos on, what are some examples of the process would technically, steal somebody's credit card or bank account and then left and channel them into a fake PayPal account. And start tomorrow, whether this is really happening from a school and potential versus people actually stealing identity where you

might take or somebody's PayPal account, and now you also have access to the bank account numbers and information with each other in order to game the system. So the the distributions But also, in terms of the rock pythons that are generated by the system. So how big a problem is this really? In terms of obtaining credentials online credit card, email and social media accounts that are sold for a price? And they go according to recent prices and so for visa and MasterCard, you might see that they're going around $44 or $7 a month,

ranging across stolen instruments. And we have historical data related to accounts that ended up using stolen instruments, and on the patterns that actually emerged out of this market in instruments, namely credit cards or even accounts, meaning the email addresses that are used to create a PayPal account. So, good. Having presented sufficient context related to the domain, let's switch gears a little bit and go back to the adversarial learning context.

So just a quick idea here of what perturbations do: the perturbations are deliberate and careful, and they lead to an incorrect classification prediction with higher confidence. So you make a small change to the transaction that comes through the PayPal system, and you perturb it in such a way that a transaction that was previously approved is now declined, or vice versa, which is actually even more dangerous. So you might have a transaction that should have been declined, and now, through this systematic perturbation, the system is approving this transaction. So

the idea is to introduce perturbations into the input. You just add a vector with elements equal to the sign of the elements of the gradient of the cost function with respect to the input. In ordinary training you are basically backpropagating the loss in order to find the optimal directions for the weights, so that you minimize the cost. Here the problem is flipped, in the sense that now you keep the model unchanged, in the case of both black-box and white-box methods. You let the weights or the parameters of the model stay constant,

and now you try to optimize for, essentially, the direction in which you could shift your data. Let me look at what this means for production, right? Often what you see is that fraudsters, in order to avoid being detected, keep changing the status quo. They might constantly change the assets that they're using to log in to a PayPal account, such as the IP address. So they might be changing the specific IP address through which they are trying to log in.
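
As a minimal sketch of the gradient-sign perturbation described in this part of the talk, assume a toy logistic-regression fraud scorer. The weights, feature values, and step size `eps` below are hypothetical illustrations and do not come from the talk or from any production model. The model parameters stay fixed and only the input moves, in the direction of the sign of the loss gradient with respect to the input.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def score(x, w, b):
    """Fraud probability from a toy logistic-regression model."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def sign(v):
    return (v > 0) - (v < 0)

def fgsm_perturb(x, y, w, b, eps):
    """Return x + eps * sign(dL/dx) for binary cross-entropy loss L.

    The weights w and bias b are held constant; only the input changes.
    For logistic regression, dL/dx = (p - y) * w.
    """
    p = score(x, w, b)
    grad_x = [(p - y) * wi for wi in w]
    return [xi + eps * sign(g) for xi, g in zip(x, grad_x)]

# Hypothetical model and transaction features (assumed already scaled).
w = [1.5, -2.0, 0.5]
b = -0.25
x = [0.8, 0.1, 0.4]
y = 1.0    # true label: fraud
eps = 0.3  # maximum change allowed per feature

x_adv = fgsm_perturb(x, y, w, b, eps)
score_before = score(x, w, b)
score_after = score(x_adv, w, b)
print(round(score_before, 3), round(score_after, 3))
```

With a decision threshold of 0.5, the fraud score drops from above the threshold to below it, which is the dangerous case in which a transaction that would have been declined gets approved. This sketch ignores whether the perturbed features are realistic for the domain.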

Sometimes they change the browser, and in some cases the location, which might not be that far off from where they're currently doing this transaction, or they channel through different funding instruments. The ways in which they can change these are constrained, and I will say a little bit more about that. They might start off by performing small dollar value transactions that are successful, so the system starts to trust them. Now, the fraudster starts

to switch the other assets, say the IP or the device. And once enough trust is established, the fraudster tries to capitalize by doing a large dollar value transaction, and this ends up being fraudulent. So the key idea here is: can we generate adversarial examples to fool the production model into approving a transaction that was previously declined? And then, can you use domain knowledge to generate attacks that are realistic, so that you know these attacks can actually occur and are within the scope of

The other two questions that you want to answer that one else. Can you, can you develop a monitoring and defense system that can learn and make them more robust? And lastly, as I mentioned that by building trust with the system and then set it up because of the transactions that he did by doing the small transactions, which ended up being successful for them are using videos generator. So let's pretty quickly. Look at what makes this problem. So if you have a classic image or a speech Vector, you could apply

a perturbation on top of that, and then you could assess, either visually or audibly, whether this is a realistic instance. Here, let's stick with a geolocation that is US. Now, let's say you perform a perturbation scheme where you try to perturb your input vector to maximize the loss, and the change that ends up coming as a result of the perturbation is a change to the country code, from US to maybe another country code, which might represent, let's say, another country like Italy. So what happens is, you have changed the original

transaction in a way that makes the attack unrealistic, because it may not be possible to change the geolocation from one country to another within the time range in which the transactions happen. You cannot assess that visually or audibly, for that matter. And that's the reason why we have to go back to the domain and figure out how we can constrain the perturbation space so that it generates examples that are realistic. What can the fraudster change, right? And if the fraudster changes a certain asset, say changes the IP, now,

you can figure out what kind of IPs the fraudster might have logged in through over the lifetime of the account. Then you have manually engineered features that might use the IP. A key example of this would be the number of transactions that have emanated from this IP in the last 24 hours. So that could be a very simple manually engineered feature. Now, if I change the IP, it also impacts the value of the corresponding manually engineered feature, and that throws you into the realm of what-if questions.
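
The dependency between a raw attribute and a feature engineered from it can be sketched as follows. This is an illustration only: the `Txn` record, the `ip_count_24h` helper, the allowed-IP set, and all values are hypothetical and are not the feature pipeline described in the talk. The point is that a realistic perturbation of the IP must stay inside a plausible set of values and must recompute every feature derived from the IP, instead of changing the raw field alone.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Txn:
    ip: str
    ts_hours: float            # transaction time, in hours since some epoch
    ip_txn_count_24h: int = 0  # engineered feature derived from ip

def ip_count_24h(history, ip, now_hours):
    """Number of past transactions from `ip` in the 24 hours before `now_hours`."""
    return sum(1 for h_ip, h_ts in history
               if h_ip == ip and 0 <= now_hours - h_ts <= 24)

def perturb_ip(txn, new_ip, history, allowed_ips):
    """Swap the IP only within a plausible set and recompute the derived feature."""
    if new_ip not in allowed_ips:
        raise ValueError(f"{new_ip} is outside the realistic perturbation space")
    return replace(txn, ip=new_ip,
                   ip_txn_count_24h=ip_count_24h(history, new_ip, txn.ts_hours))

# Hypothetical history of (ip, timestamp in hours) pairs for one account.
history = [("192.0.2.1", 10.0), ("192.0.2.1", 30.0),
           ("192.0.2.1", 45.0), ("198.51.100.7", 48.0)]
allowed_ips = {ip for ip, _ in history}  # assumption: only IPs already seen on the account

now = 50.0
original = Txn("192.0.2.1", now, ip_count_24h(history, "192.0.2.1", now))
perturbed = perturb_ip(original, "198.51.100.7", history, allowed_ips)
print(original.ip_txn_count_24h, perturbed.ip_txn_count_24h)
```

Restricting candidates to IPs already seen on the account is one possible way to constrain the perturbation space; a real system would likely use richer domain rules.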

If the fraudster had used an IP that is potentially more hazardous, how would the manually engineered feature actually change? These are exactly the types of questions that make this very interesting from the point of view of adversarial example generation. The ensemble-based approach is basically divided into attack and defense methods, and the attacks focus specifically on evasion methods, with limited poisoning. So, the motivation for using an ensemble-based method, as I mentioned earlier, is the fact

that its Market position problem. Maybe you want to catch higher dollar value fraud, and that's what makes a problem, interesting, and sometimes across different customers who might not have much appetite for it. So you might have to adjust the risk sexual differently for him as compared to someone else who might be in the system for longer and is looking for that are concerns about catch rate. So you want to actually have a high castrate, but also you want to decline, your court uses in, in

a way that is as infrequent as possible. In our experience, different types of methods actually discover different perturbation zones, or perturbation regions, and the way we formulate this problem is to discover a maximum set cover of all of these perturbation zones, the ones that actually maximize the exploitation of vulnerability. Each method by itself might be limited to some extent, but the ensemble combines all the perturbation zones that are discovered by the different methods and then uses those to attack the model

with more accuracy as compared to each individual method. This also generates reliable and realistic perturbations, which raises the question of what the error rates are in generating these realistic perturbations. Some methods might be pretty quick to compute but less accurate in terms of generating realistic perturbations, which is why we use ensemble-based methods to discover these different perturbation regions and then bring them all together. We used a bunch of methods, but what I want to do here is just sample a few, so that it gives you sufficient intuition

in terms of why we are using the method that we are using. So he used a wide range of methods. Actually outlined them the design, but I'll actually cover maybe one or two classes of each so you would know exactly what we're doing is based on a gradient, which I already described a different computer and what kind of vision from your class. So you increase the lost and then allow them or have them to actually come up with the idea of the direction and magnitude. Find the direction in which the

function fairly quick and all you do. All you do. Here is University at the sign essentially offer of the gradient relative to the input and then you use that sign to effectively. We tried different variations of the tops of the First Methodist died because of a sport divisions. It real that can you import a patient's that you could use accessory? Not when you look at the division Attack Base methods based on a derivatives are based on DL by Dax the first one. Is there a quick way to really love someone? You can about it is

basically comes up with the population in one, shot in the consignment or actually does. And I'm so you don't get too but could potentially be premature but a vision you actually systematically look for the division as a sort of a search Pro. Sorry to interrupt, but we thought we'd give you a five-minute overtime or maybe a little bit more because you're talk actually ends up to buy a couple more minutes if you can drop it off and then we'll take care of any questions if if we have them. Sounds good.

And at least likely method in order to basically constrain a variable or or from the bank, or is another who created signed a method basement as compared to yesterday. The tour category of method is essentially the generator method. We don't typically be either the examples that could be, that you can apply to the today in Spencer Street. The last category of matter that we actually also use is called Universal, adversarial perturbations. So, here we are. Coming up with perturbations data

input, agnostic. Show up for this problem, applied essentially, the bank, fraud scenario dance. And then what you did was inadequate, funds in the account and bank accounts, or non-existent bank accounts and be adversarial, Ensemble generator method across different optimization criteria, for high-dollar. Optimization here is actually two or two of The Roc. 80% of your receipt and get the most promising for divisions in a region already have a very high pain,

but can we change doors and get the most? I get the Best Buy off. So, so this is quickly, a compelling individual methods verses that are seven different types of methods. And The Ensemble, the issue that you see is really the ratio of the model performance after motivation. And original model, performance has deteriorated application of the adversary learning method is actually higher as compared to the individual. So then the second part of the problem

is is an idea where the champion and alongside and then you track them. But I'm so what happened to Billy is the adversity, example of debate continues to track those changes. And then loans, the gaps as you're confusing but time as you are actually pressing forward in time, this adversarial generator that you could use in order to retain your mother. So the process is a method for a combination of and online, hard example. We're celebrating our second. You also allow of supposed to be send all the examples and then and then

the only back propagate be adversarial example. So the gradients now only, I just to basically better being able to listen to be selected back, propagation, teatime figures, and this is still a work in progress. We do see that? That is a consistent in, please. Yep, ocelots, like very quickly. So, in terms of conclusions are very quickly. We see that this cohort of examples can be applied to domain specific protection and to see these never generation of realistic or division, which is a problem and

acting aggressive defense mechanisms as we go along. Thank you so much. I'm presented with us. Any questions quickly. This is our last session before break. So you have nothing here. That anybody would like to ask him. The other thing is that after the break Chris, Ward will be your empty and he'll be introducing a few stalks and the remainder of the session, so there's Chris there. So I, once you guys have a break up to you, when you want everybody to come back. I mean the

schedule was 2:15. Maybe we should buy 45 minutes to 20. I suppose you just don't want to step on his toes too much. So and maybe you'll be around for a few more minutes at Newton during the break. If people have questions for you in the chat. Yep, that sounds good. And if there's anything that, you know, a little bit more in-depth, happy. Yes, it seems like that's a question from Gigi on. This is about yesterday and I can't even, I shed this dick. I also have a reference as section. So you could actually refer to the reference section, three has some Foundation of

papers that went into this, and some of the pointers could be useful for you. We also have a blog on which we publish articles, and there is also a link in the presentation. We do plan to post a write-up on this particular method in detail in the coming weeks as well.
