About the talk
Ethics & Society: AI and discrimination: How fair are bias detection tests?
The European Commission recently published its proposal for the Artificial Intelligence Act, the world’s first comprehensive framework to regulate AI. The new proposal has several provisions that require bias testing and monitoring. But is Europe ready for this task? This session analyses fairness metrics according to their compatibility with EU non-discrimination law, and whether the use of ‘bias preserving’ metrics requires legal justification when making decisions that have real impact on people in Europe.
Dr. Sandra Wachter - Associate Professor and Senior Research Fellow - Oxford Internet Institute
Brent Mittelstadt - Senior Research Fellow - Oxford Internet Institute
Sarah Drinkwater - Director - Omidyar Network (Moderator)
Brent Mittelstadt is a Senior Research Fellow in data ethics at the Oxford Internet Institute, University of Oxford, as well as a Turing Fellow and member of the Data Ethics Group at the Alan Turing Institute, and a member of the UK National Statistician’s Data Ethics Advisory Committee. He is a philosopher focusing on ethical auditing, interpretability, and governance of complex algorithmic systems. Brent also coordinates the Governance of Emerging Technologies (GET) research programme at the OII, which investigates ethical, legal, and technical aspects of AI, machine learning, and other emerging technologies.
It is my pleasure to introduce Sarah Drinkwater. Thank you so much. A big and exciting topic today: AI and discrimination, and how we operationalize it. I would like to welcome Professor Sandra Wachter. She is Associate Professor and Senior Research Fellow at the Oxford Internet Institute, where her research focuses on the ethical implications of big data, AI and robotics. Welcome, Sandra. Thank you. And Brent's work is focused on auditing and governance. A complicated question. I know that you both have slides.
I'd like to look at those to ground the conversation, so it's in your hands. Okay, fantastic. Thanks so much. Thanks for the introduction, thanks for having us, and it's nice to be back. I'm just here to run you through some slides and the basic idea behind one of our most recent works, which deals with AI and discrimination and exactly how fair these different bias tests are. I'll be handing back and forth with Sandra for different parts of the slide deck. But before I get ahead of myself, just to ground things a little bit: what we're talking about is two things. It's how
well does technical work around bias align with the law? And of course, what does the law look like? We can't talk about the law without also talking about the most recent proposal for an Artificial Intelligence Act from the European Commission. It is notable because it's Europe's first real attempt at a comprehensive framework to regulate AI (we have the GDPR, of course, but not for AI), and so it's going to be relevant not only to member states of the Union but internationally as well. Of course, it is a draft, so it's not clear exactly how it will pan out or what it will actually require in
practice. But at the very least, we can already see that there's a focus on issues of bias. Article 10 here, proposed Article 10, specifically calls for deployers to examine possible biases in their systems. It creates requirements, here and in other parts of the regulation, to examine these biases and to test for, mitigate or prevent them. So the question we really have is: do we have the necessary tools to actually meet that future regulatory requirement in practice? Now, Sandra and I, luckily for us, happened to write a paper before the proposal came out that addressed pretty much exactly that question. We were asking a very similar question: are the bias and fairness tests that we currently have in the technical community legally compatible with the requirements of European law, in particular non-discrimination law? That seems like the closest thing to look at when we are talking about fairness and bias. To give you the conclusion of our story, the conclusion of our paper, at the start: many of the ways that we currently measure fairness in technical terms are not compatible with the law. And in this slide deck, in this session, we're going to
explain why that is the case. So, to start off with a general distinction between types of biases, and this is far from being a clean distinction, it is very much an overlapping distinction: we think there are at least two overlapping types of biases. We have technical biases and societal biases. Technical biases are problems that arise from applying machine learning or AI itself and that introduce additional biases that are not directly represented in the data you're using, so the training data that you're using with your system. They reflect some sort of failure to
predict outcomes with the same accuracy across different groups, different protected groups. And so they lead to skewed, inaccurate or otherwise unequal outcomes for these different types of groups. Now, not all biases in machine learning and AI can be traced back to technical sources or design choices. That is where societal bias comes in, which we define as any type of systematic preference to make positive decisions for one group of people relative to another, or one class of objects relative to another. Compared to technical biases, societal biases are very difficult to fix because they're more a
matter of politics, perspectives, and shifts in prejudices and preconceptions that will quite often take decades to change. And I think it's important to recognize that unequal outcomes, when we are using AI, are not necessarily the result of inaccurate or incomplete data. Rather, we can get unequal outcomes that are actually an accurate reflection of the biased world that we live in, the biased world which AI is learning from and in which it is being used. Yes, so to bring this down to a couple of examples
to make it more concrete: here's an example of technical bias. As Brent just said, with technical bias the problem stems from the technology itself. Many of you will be fully aware that face recognition software is less accurate on the faces of people of color and women. Why? Because the data that is being used to train face recognition software is predominantly based on white male faces. So obviously an algorithm that has learned to see this as the normal face will be more accurate on white male faces than on anybody else's. So this is an example of technical bias: the problem stems
from the tech itself. And then we have societal bias, where the trouble stems from the human, not just the technology. So, for example, say you have a recruiting officer, and this recruiting officer has ableist assumptions and therefore is not giving jobs to people with disabilities. The decision patterns of this hiring officer will then be used to feed the algorithm, and the algorithm will learn that ableist behavior. Then in the future, people with disabilities who apply will be rejected. That is societal bias. As we said, a clean cut between the two is not fully possible; there is some overlap. There's a reason why the dataset used to train face recognition software was predominantly based on white faces to begin with. The people deciding on those design choices reflect the power structures in our society, and that is therefore societal bias. But the distinction is still quite helpful because it shows you the aspirations of people in the field, what they want to fix. There's one set of people that wants to fix the technology and another set of people that wants to fix society. So let's have a look at what the law wants to do
and let's look at EU non-discrimination law. That actually brings me back to a very, very personal story about when I first got introduced to the idea of what fairness actually means in the law. I must have been around six or seven years old when we read a very short story in school. The piece was called The Wise Judge. The Wise Judge centers around two siblings, a brother and a sister, fighting over a piece of cake. They cannot decide how to divide up that piece of cake, so they go to the wise judge and ask the judge
for advice, and the judge says: well, the brother gets to cut the cake first and the sister gets to choose first. I still remember that story, and realizing what an elegant way that is of making sure that everybody gets an equal piece of cake. If a discipline is able to solve such a complicated problem in such an elegant way, I really want to be part of it. To this day, I think most of us will agree that it is a very elegant way to think about that problem. However, what if I told you that the sister hasn't eaten in three weeks?
Would you still think this is a fair way of dividing up the cake? Many of us would say: probably not. This type of tension, this type of conundrum, actually goes back to what the law thinks fairness means, to the different types of fairness concepts that we have in the law. Roughly, we have two ideas of fairness: formal equality and substantive equality. Formal equality means that I'm trying to treat everybody equally. I'm closing my eyes to race, gender, sexual orientation.
Everybody gets the same size piece of cake, right? This is the idea of formal equality. Substantive equality, or de facto equality, is different in the sense that it does not want you to close your eyes to the differences between certain groups. It has to take into consideration that some groups are more hungry than others, and that therefore we might have to divide up the cake differently. So those are the two ideas of fairness in the scholarship and in the law, and Lyndon B. Johnson has actually summarized them in a wonderful
quote, probably better than anybody I've ever seen, and I think it's very inspiring. He said, in his Howard University address: you do not take a person who for years has been hobbled by chains and liberate them, bring them up to the starting line of a race and then say, "you are free to compete with all the others," and still justly believe that you have been completely fair. It is not enough just to open the gates of opportunity. All our citizens must have the ability to walk through those gates. This is the next and the more profound stage of the battle for civil rights. We seek
not just freedom but opportunity. We seek not just legal equity but human ability, not just equality as a right and a theory but equality as a fact and equality as a result. I think that sums it up very, very nicely. And this idea, these two concepts of equality and fairness, actually found its way into the heart of non-discrimination law. The law wants to prevent two types of discrimination: direct discrimination and indirect discrimination. Direct discrimination means I'm treating somebody less
favorably based on a protected attribute that they possess. I'm not giving you the job because of your skin color, your sexual orientation or your political beliefs. This is prohibited in most cases, and it aligns more with formal equality, because everybody ought to be treated the same. Indirect discrimination is different: if a seemingly neutral practice is applied to everybody equally, and it just so happens that it imposes a particular disadvantage on a protected group when compared to others in a similar situation, then this could give rise to prima facie indirect discrimination. Roughly, what that means is that you are applying something that does not look unfair when you first look at it. For example, somebody who is hiring people decides: I'm only going to hire people who are taller than two meters, so very tall. Height is not a protected attribute such as race or sex or gender or whatever it might be, but you will know that if I have a height requirement for my job, it will disproportionately affect women because, on average, we are shorter. And that
is the idea of indirect discrimination, right? You are acknowledging that some groups have it tougher than others. Actually, the idea of protection against indirect discrimination was created to show where inequalities exist, because the underlying assumption is that we are all equal, that we're all the same, that we all have the same abilities. So if the outcome is unequal across groups, then there must be something wrong with your system, not with the people. And therefore this idea was created to really bring about substantive
equality. It is there as a diagnostic tool that shows you where social engineering has to happen, that shows you where social struggles are still going on, and that shows you how to actively dismantle inequality in our society. So this is the idea of substantive equality. In fact, this idea of taking an active part in society, for both the private and the public sector, is something that the law dictates: it wants you to be an active player, not just a passive bystander, and to actively bring about social change.
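The height example from a moment ago can be turned into a small diagnostic check. The sketch below is only an illustration and was not shown in the talk: the applicant heights are invented, the 178 cm threshold is a hypothetical stand-in for the two-metre rule (chosen so that the toy data shows a difference), and `selection_rates` is a made-up helper name. It compares how often each group passes a rule that is applied to everyone equally.

```python
# Hypothetical applicants as (group, height in cm). The values are invented
# purely for illustration and do not come from the talk or the paper.
applicants = [
    ("women", 158), ("women", 165), ("women", 171), ("women", 176), ("women", 181),
    ("men", 168), ("men", 175), ("men", 179), ("men", 184), ("men", 190),
]

MIN_HEIGHT_CM = 178  # an apparently neutral rule applied to every applicant


def selection_rates(rows, min_height):
    """Return the share of each group that passes the height requirement."""
    passed, total = {}, {}
    for group, height in rows:
        total[group] = total.get(group, 0) + 1
        passed[group] = passed.get(group, 0) + (height >= min_height)
    return {group: passed[group] / total[group] for group in total}


rates = selection_rates(applicants, MIN_HEIGHT_CM)
gap = rates["men"] - rates["women"]  # difference in pass rates between groups
```

With these invented numbers, 60% of the men and 20% of the women pass, so a rule that never mentions sex still produces a gap of 40 percentage points. Whether such a gap can be justified remains a legal question for case-by-case judgment; the calculation only shows where to look.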
Leveling the playing field: substantive equality really means dismantling inequality, redistributing resources, and making sure that everybody gets the same access to social goods. It's not just about economic disadvantage; it's also about cultural and social rights. It's about social inclusion, solidarity and participation in our society. So you can see that the law really wants to fix society, but the question is: how does tech see fairness? Yeah, and that is a very distinct question, and it speaks to
almost a different set of communities that are working, maybe, with some understanding of what the law requires, potentially in different countries under different legal frameworks, which can lead to very different requirements in practice, if those requirements are being paid attention to at all. But to try to make things simpler, to try to find some sort of coherence between non-discrimination law and the technical work around fairness and bias in AI, we propose a new classification scheme for fairness metrics in machine learning and AI. By fairness metrics, all I mean is a statistical way of measuring fairness in practice. Now, our proposal in this classification system is a distinction between bias-preserving and bias-transforming metrics. If you're interested in the definitions, we go into them in great detail in the paper; I'm very much just giving you the headline view here. We reviewed popular ways of measuring fairness, popular fairness metrics based on the state of the art, and we created this classification system that basically reflects how different metrics deal with societal bias and
how they align with the aims of non-discrimination law. The whole system is based around the notion of conditional independence. I won't go into detail here, but it's in the paper. With bias-preserving metrics, what you're doing is this: the metric will seek to replicate error rates that are found in the training data, or found in the status quo, in the outputs of the model that is being trained. And so we can say a metric is bias preserving if it is always satisfied by a perfect classifier that exactly predicts target labels with zero error, replicating the bias present in the data. The basic idea here is:
you have a set of biases within the data, you view the status quo as neutral, and if you create a classifier that perfectly reflects those biases within your training data, you have created a fair classifier. Any metric that treats the status quo in that neutral way would be considered bias preserving. In contrast, metrics that don't take the status quo for granted, that don't treat the status quo as a neutral starting point for measuring fairness, would be considered bias transforming. So those are metrics that are not concerned with
replicating error rates. They have different names, but typically they're looking to match decision rates between groups. Again, the basic idea here is that bias-transforming metrics do not take the status quo as a neutral starting point, and not doing that is very important, at least in the eyes of the law, because we know the status quo is not neutral. I'll hand back to Sandra to talk about why it is not neutral. Since we have limited time here, I'm only going to talk about one example; there are more in the paper. To talk about this, I want to use the example of grades
and grading in general. I think most of us will agree, whenever you are being asked to show your grades, that there is a subjective element to grading in general. You can think of many, many ways of bias creeping in. But what about math grades? You could make the argument that there is a ground truth when it comes to math: two and two is four, and there is not much room for interpretation. So if we were to give out, let's say, fellowships or university admission places based on math grades, that would be a fair criterion to assess merit.
What you might not know is that there's interesting research from 2015 that shows that middle and high school teachers assess the mathematical skills of boys more favorably than those of girls, even though the girls have the same, if not higher, abilities than the boys in their class. They actually get worse grades than boys, even though they're just as smart as the boys. They get less mentorship, which is partly due to the fact that their grades are not as good as the boys', and they are less encouraged to take up STEM subjects later on in their career paths. That gender
bias travels with women into the job market. Interesting research shows that if you send two batches of identical résumés to open job advertisements, and one batch of CVs has female-sounding names and the other has male-sounding names and the rest is completely identical, both female and male assessors for those job postings mark women as less qualified than men, even though the CVs are completely identical. They are less likely to be invited to a job interview, and if they are actually offered a
job, their salary is much lower than that of their male counterparts, and they're very often overlooked for promotions. The same gender bias is reflected in reference letters, where women are described as hard-working team players while male colleagues are described as geniuses and trailblazers. You could say: well, this is just a matter of education about gender bias; you just need to re-educate the people making important decisions. But the problem is that the idea of gender roles is actually created very early on in
our lives. For example, there is research that shows that if you show children by the age of six pictures of boys cooking and sewing, they will misremember having seen a girl. So by the age of six, our children already have a pretty clear idea of what gender roles are supposed to look like in our society. But the problem is, AI does not know about that. AI doesn't know about the social story behind the data points, and I'm pretty sure most of us don't know about the social story behind the data points either. Now think about how often grades and
reference letters and salaries are being used as objective criteria to make decisions. Think about how often your grades open doors to universities, to jobs, to fellowships, and how often reference letters are used to give out housing or jobs or loans. Think about how often we take how often somebody has been promoted, or their salary, and equate it with a measure of merit and success. Think about how often salary is being used to decide if somebody should get insurance or housing, or what type of advertisements they see. At first glance, we think that we are dealing
with fair data. We think that we have some information about an equal ground truth, but the problem is that the status quo is anything but neutral. So that brings us back to the question that we're going to conclude with here, which is: how exactly can technical fairness work align with the law? Can this technical work actually support the substantive aims of the law, which is to address existing inequalities in society? Generally, we can see how the two different types of metrics align with the aims of the law.
So let's go back to the underlying assumptions of fairness metrics. Bias-preserving metrics are more akin to formal equality, the idea of not changing things or not correcting for existing inequalities, whereas bias-transforming metrics are more akin to substantive equality, the idea of changing things and actually trying to address inequalities in order to achieve parity and inclusion between groups. Keeping things as they are, treating the status quo as neutral and simply seeking not to make things worse with AI, is not good enough.
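The difference between the two families of metrics can be illustrated with a toy calculation. This is a hedged sketch rather than the method from the paper: the records are invented, and error-rate parity and decision-rate parity are used here as simple stand-ins for a bias-preserving and a bias-transforming metric respectively. A classifier that reproduces biased historical labels perfectly has identical (zero) error rates in both groups, yet its decision rates still differ.

```python
# Hypothetical historical records as (group, label), where label 1 is a
# positive decision in the status quo. Group "b" was favoured less often.
records = [("a", 1), ("a", 1), ("a", 1), ("a", 0),
           ("b", 1), ("b", 0), ("b", 0), ("b", 0)]

# A "perfect" classifier reproduces every historical label with zero error.
predictions = [label for _, label in records]


def rate(values):
    """Mean of a list of 0/1 values; 0.0 for an empty list."""
    return sum(values) / len(values) if values else 0.0


def error_rates(group):
    """False positive rate and false negative rate for one group."""
    pairs = [(y, p) for (g, y), p in zip(records, predictions) if g == group]
    fpr = rate([p for y, p in pairs if y == 0])
    fnr = rate([1 - p for y, p in pairs if y == 1])
    return fpr, fnr


def decision_rate(group):
    """Share of positive decisions the classifier gives to one group."""
    return rate([p for (g, _), p in zip(records, predictions) if g == group])


# Largest difference in error rates between groups (bias-preserving check).
error_rate_gap = max(abs(x - y) for x, y in zip(error_rates("a"), error_rates("b")))
# Difference in positive decision rates between groups (bias-transforming check).
decision_rate_gap = abs(decision_rate("a") - decision_rate("b"))
```

Here `error_rate_gap` is 0.0 while `decision_rate_gap` is 0.5: the error-rate check is satisfied by the perfect classifier and so carries the historical imbalance forward, whereas the decision-rate check flags it.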
Again, in the eyes of the law, bias-preserving metrics by design run the risk of freezing or locking in existing social injustices and discriminatory effects, which does not align with the core aim of EU non-discrimination law, which is to change society for the better and achieve substantive equality in practice. If we ignore the reasons behind existing inequalities, we run into a problem: you need to understand why decisions were historically made in a biased way in order to correct, going forward, for the inequalities they created. And so we're trying to call attention to this very,
very important decision that is being made routinely by developers and deployers of AI when they're working on fairness: in choosing how to measure fairness, you are essentially saying either that we are going to do something about the status quo and try to improve things, or that we're going to take the status quo for granted as a neutral starting point and just try not to make things any worse. So how widespread is this potential problem? We're saying bias-preserving metrics do not align well with EU non-discrimination law, but how common are they actually in practice? Again, we did a
review of popular fairness metrics for AI in the technical literature. I'm not going to go through these metrics or formulas; this is taken from the paper. But our headline finding is that, of 20 of the most popular technical fairness metrics in AI, 13 out of the 20, or roughly two-thirds of them, are bias preserving. That is very significant, because using bias-preserving metrics to make decisions that are driven by AI will create a legal problem for deployers of AI. It's also a somewhat unsurprising finding, given that so much of the technical work on fairness in AI comes from North America,
where you have very different anti-discrimination legal frameworks, which vary in terms of their aims across formal and substantive equality. Just to help out regulators, policymakers and deployers of systems, we also thought about how we can help them make the right choice around the metric they're going to use to measure fairness. Let's say you want to do the right thing: which metric should you actually choose? We tried to boil our message down to as simple a format as possible by literally just giving a checklist for bias preservation, where you can answer these questions and it will
direct you to using bias-preserving or bias-transforming metrics based on, essentially, the existence of historical social inequality in a given decision-making context or in a given use case. This is in the paper, and it will run you through the core of the idea very simply, in a way that will lead you to one or the other. And again, this is for using AI to make decisions. If we're talking about testing for bias, that's a different conversation, where both types of metrics have a role to play. The last thing I'll mention is that, in terms of
which metric to choose, before we ever wrote this paper on bias preservation, we did some work on the automation of fairness and, again, non-discrimination law: essentially, why, under EU non-discrimination law, you can't automate fairness in the way that might be imagined in a lot of the technical work on fairness in AI. There we proposed our own fairness metric, called conditional demographic disparity, that aligns with the aims of EU non-discrimination law and, in our language here, is a bias-transforming metric. Essentially, what we proposed in that paper is that
deployers should be publishing summary statistics that will provide a baseline of evidence for all the people involved with using, and affected by, AI in a given context. We proposed that because in practice, when non-discrimination cases go to trial, quite often there's a huge gap in access to statistical evidence of inequality, and also in the expertise to produce and interpret that evidence. The idea here is that with the summary statistics you're providing a neutral starting point for everyone involved, and you're helping the judiciary, companies and others that are
trying to make AI fair in practice to show easily that they align with the core aims of the law. Essentially, what the metric, what conditional demographic disparity, is doing is ensuring that cases of automated discrimination are assessed consistently, without saying what is normatively fair in any given case. That's always a decision that should be left to human judgment on a case-by-case basis, which is essentially how non-discrimination law is currently practiced. And that just reminds me to say that, even though we are talking about EU non-discrimination law,
the non-discrimination frameworks in the UK, even post-Brexit, are still very much based on the EU framework, so this is just as relevant for practice in the UK as it is in the EU at this point. This is the math of conditional demographic disparity. I'm not going to explain it; it's just to say that it's there in the paper for anybody that wants to use this in practice. The metric and our recommendations around it have been picked up in a number of different policy reports by bodies like the European Commission, the World Economic Forum and the Centre for Data Ethics and Innovation. And we've
also seen it implemented in practice, completely independently of us, by Amazon in their SageMaker Clarify tool that they just launched for Amazon Web Services, which is fantastic, because it means that our metric and a number of other metrics are available for customers of Amazon Web Services, but of course can also be picked up and used by anybody if they want to actually do bias testing according to European legal standards, which again will be very important for putting the Artificial Intelligence Act into practice. And also to say: all these papers are
freely and publicly available. We're hoping that there will be further implementation of it in the same vein. So we'll finish there. I hope that's a good baseline for the discussion, and I really look forward to the discussion and your questions. Thank you so much for entertaining us and for listening to this extended talk. Thank you so much. An incredibly rich and thorough presentation. [inaudible] We don't know yet. But
at the same time, you have to assume, given that this is the first proposed framework of its kind globally, that it will have a knock-on effect in other locations around the world. What do you see as the opportunities and the limitations? You've touched on some of the limitations and challenges of this piece around bias, but if you're looking at the proposed regulation as a whole, what do you see as the pros and cons? Maybe I can start with that. Yes, great. So again, I'm a huge fan of the fact that we are
going in this direction and that the regulators in the EU see the need to do something. There's still time to adjust, so whatever I'm going to say is obviously just based on the current state of affairs. Overall, I think it's fantastic. If there is one thing that I would like people to learn from the work that we have been doing, it is to understand how diverse bias and unfairness actually are. So the idea of having a one-size-fits-all solution
is something that is neither possible nor aspirational. The idea that you can just talk about one type of fairness and one type of injustice is something that doesn't really exist. The way discrimination operates, the symptoms of discrimination, are very different in Germany and the UK compared to India, the Global South in general, or the US. So the idea that you're going to have one particular test that does all the work for you is something that is absolutely not possible. So in the same way that, you know,
people live between the ones and zeros, there is a gray area that we have to embrace, and the law actually likes that. The law wants to be flexible, and that's not necessarily something that computer scientists embrace. I have been working with computer scientists for a very long time, and I would like us to learn from each other that the idea is not to find a super tool that does everything, but to actually learn to embrace diversity in that sense, and also to acknowledge that the struggles, prejudices and disadvantages of people are very diverse depending on the
context, so we probably have to have very different systems depending on where we deploy them. That, again, is one of the reasons why we named the paper "Why Fairness Cannot Be Automated", and that's a good thing. Trying to keep the human in that context alive is, I think, the most important thing for me. But yeah, I'm talking too much. No, I don't think I have much to add. I think you perfectly captured what I was going to say as well. And the [inaudible]
and the challenge of regulation. I think even looking at the EU, you know, Brent, you mentioned how different discrimination is in North America versus Europe, for example, but even within Europe there are incredibly different sets of challenges, norms and behaviors across the various countries. I think that's something really interesting to me: how we at once have these aspirational pieces of regulation that set the pace while, at the same time, thinking about how we operationalize this. In particular, I'm quite interested in, and I'd love
to hear a bit more about, the tool you showed at the end of that presentation. For you, and for all the practitioners, I should say: what do you think of as useful tools that they can be looking at now as ways of detecting bias in their systems? Particularly because so much of the responsible AI community has been dominated by the largest companies in the world, and I think there's incredible interest from startups and smaller tech companies that may not have the same resources or the benefit of academics on that
team. Sandra, do you want me to start with this one, or do you want to go first? I'll let you go first, because I won't stop if I start. So what I would say is that one of the nice things about the work in this area is that quite a bit of it has already been implemented in open-source toolkits, the idea being that you're essentially running the check yourself, and it depends on the model that you're working with and the type of AI you're working with. But yeah, it's there for self-testing and self-assessment. To be fair, I think a lot of what the new regulation is calling for is,
essentially, I hesitate to use the word self-regulation, but at the very least a lot of internal testing around things like bias, rather than a third party, say a third-party auditor or regulator, doing it for you. So we mentioned the Amazon toolkit. There are things like the AI Fairness 360 toolkit; I want to say it's IBM's toolkit, which I believe is open source as well. The Turing Institute, I know, put out a toolkit with Accenture, I want to say, although I'm not sure if that one's open source; I think it is. Essentially, there are lots of these
toolkits where it's supposed to be taken up by developments in the organization to do testing, but then, you know, you have things that are less, technical, like algorithmic impact assessment, documentation, standards of things. Were there, more resembling. I say a data protection impact assessment or privacy impact assessment, completely different set of people potentially doing that. Especially I think with a special to buy assisted, we have come up with,
this is not about affirmative action. This is not about, you know, giving a leg up to somebody because of their skin color or gender or sexual orientation. This is about acknowledging that we have very bad tools for measuring merit. This is about trying to uncover hidden talent that is currently not walking through the door because we are using very bad proxies. For example, the example I gave about grades, right: girls being just as talented as boys, but they get
worse grades, right? So how can you use this information to create an algorithm that accounts for that? You could say, well, a B for a girl means as much as an A for a boy, something like that. Grades are a very bad measure of competence and mathematical skills to begin with, right? So this is not about opening the doors and pushing somebody through who doesn't deserve it. It's acknowledging that we push people away from the door at the moment who don't deserve it. So if you take fairness seriously, it is a win-win-win
situation. You would actually get better people that are more competent than the ones that you currently have, and therefore make it a fair and equal playing field. I think that's pretty important to clarify. But if it's about uncovering hidden talent, I think for any business that has to be incredibly compelling, right? Like you said, I heard that story and it just breaks my heart, because it's a good message, even if you didn't get the best grades, and I look back and think my [inaudible] was terrible, you know. And
what I love about that framing, Sandra, is that I think sometimes there is a gotcha narrative around machine learning companies and bias. You know, the point that you both made in the presentation, that the status quo is not neutral, that we're not starting from some mythical neutral starting point and then introducing bias. When you think about AI and machine learning more broadly, outside of this piece around bias, there is an incredible need to build trust, and there's been such a dystopian narrative around the field, rightly so,
you know, for many of the right reasons. For companies working in the space, what do you think they should be doing, leaving aside bias tests, to broaden that conversation? I don't know, do you want to do it, or should I do this one? I've talked about this as well, where we're talking about trustworthiness, going over a lot of the things that we've done there. I think one of the key things, for me at least, though, and I think this applies both to the
company, developer, deployer side and the end-user, customer, patient side of things, is that you have companies that have a real desire to, say, make their AI more fair, or to be transparent, or to be accountable. They're crying out for somebody to tell them what those things actually mean in practice and how they can do it in a way that people will find trustworthy. And this is why the approach taken in this piece of work is not so much about prescribing what is right or fair in general, because
that's impossible. This follows the method of ethics in general, where, you know, you have a conversation and you get to the right answer for yourself. And I think the tools that we are using need to be designed and used in the same way. They need to be seen as a decision support system that facilitates conversation, to give everybody access to the same tests and the same evidence, so you can be having a conversation on an equal playing field, rather than having this massive information asymmetry in terms of status quo evidence, for example, which
actually marks a lot of those conversations that we have currently. So I at least see it, in terms of trustworthiness, as an opportunity to make existing decision-making procedures more trustworthy by leveling the playing field. And what I'm hearing from you kind of brings me back to what Sandra was saying: that sometimes there is this desire from some companies for there to be a silver bullet, when it's far more about the process. It's far more about having transparency around how decisions are
taken, which hopefully should help to build Chuck was in his overtime. So we we have a question on the top from Philip Jackson fascinating to look like you, please can you clarify the distinction between ever rights and decision, right? If you are measuring measuring errors in decisions than they had to be the same? Yes, or no. Do you want to do? Just that in the very, the very serve headline. Answer to that is when were thinking about decision rates? That is the proportion of people from different groups, that
receive certain decisions. So I should stay positive decision that negative decision. Where is when we were talking about error rates were considered were interested in the proportion of people in those groups that got the right decision versus wrong. Since the decision the back and they're probably run into a lot of fairness metrics is quite often. The promised on the idea that you're going to collect data about counterfactual worlds that don't exist, nor to know what the right decision was. But yeah, I'll hand over. Yeah, I think that was perfectly somewhat like one is really concerned with
the outcomes: how many people actually got the loans, and what's the proportion between certain groups? That's the decision rate. The error rate is really about whether it was a false positive or a false negative. So you're learning something: you're looking at past data, and you're looking at whether somebody that was admitted to university actually did well, or whether it was a false positive, and the same the other way round. You're trying to make sure that you're not making more mistakes than you used to, right? But this assumes that you have ground truth, that you know the decisions that you
made in the past were correct, which you don't have, for two reasons. You don't really know why people didn't get into university. There could be various reasons; it could be that they didn't have the grades, and we just talked about what grades mean in society, right? So you don't actually know if you made the right decision, and you don't know how well the people that you did not admit to university in the first place would have done, right? So you're pretending to know something about ground truth, when you actually have to be very, very fair and open and honest
about the limitations of what we actually know. So a much more reliable measure is to look at the outcomes, right? How many people actually get into university? And if it's not equal across groups, then there might be something wrong with the system. If you don't have any Black people in a certain course at university, then there might be something wrong with how selections are being made, and then we need to go back to the drawing board and think about how you select candidates, again thinking about very good proxies for merit that can, in a good way, predict how well somebody
will do in the future. And that is exactly what we're trying to do: come up with good criteria that actually measure merit. Thank you so much. I'm so sorry, I want to thank you for joining us today, and we're going to be handing back over to Alice.
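As an illustration of the distinction drawn in the audience question above, the following is a minimal Python sketch, not the speakers' own tool. It assumes hypothetical records of the form (group, decision, outcome), where `decision` is 1 for a positive decision and `outcome` is 1 if the positive decision would have been "correct". The function name `group_rates` and the toy data are invented for this example. As the speakers stress, the `outcome` column is exactly the ground truth that is often unavailable in practice, so the error rates here are computable only because the toy data pretends to know it.

```python
from collections import defaultdict


def group_rates(records):
    """Compute per-group decision rates and error rates.

    records: iterable of (group, decision, outcome) tuples, with decision
    and outcome each 0 or 1. `outcome` is an assumed ground-truth label.
    """
    counts = defaultdict(lambda: {"n": 0, "positive": 0, "fp": 0, "fn": 0,
                                  "truth_neg": 0, "truth_pos": 0})
    for group, decision, outcome in records:
        c = counts[group]
        c["n"] += 1
        c["positive"] += decision
        if outcome == 1:
            c["truth_pos"] += 1
            if decision == 0:
                c["fn"] += 1  # wrongly rejected
        else:
            c["truth_neg"] += 1
            if decision == 1:
                c["fp"] += 1  # wrongly accepted

    rates = {}
    for group, c in counts.items():
        rates[group] = {
            # Decision rate: share of the group receiving a positive decision.
            "decision_rate": c["positive"] / c["n"],
            # Error rates: only defined if ground truth exists for the group.
            "false_positive_rate": c["fp"] / c["truth_neg"] if c["truth_neg"] else None,
            "false_negative_rate": c["fn"] / c["truth_pos"] if c["truth_pos"] else None,
        }
    return rates


# Hypothetical toy data: both groups have identical error rates,
# yet different shares of each group receive a positive decision.
records = (
    [("A", 1, 1)] * 2 + [("A", 0, 1)] * 2 + [("A", 1, 0)] * 1 + [("A", 0, 0)] * 3
    + [("B", 1, 1)] * 1 + [("B", 0, 1)] * 1 + [("B", 1, 0)] * 1 + [("B", 0, 0)] * 3
)

for group, r in sorted(group_rates(records).items()):
    print(group, r)
```

In this toy data the two groups share the same false positive and false negative rates but have different decision rates, which is one way to see that the two kinds of metric are not interchangeable.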