About the talk
Ethics & Society: Improving the trustworthiness of AI systems in health and care
Curated with: NHSX
Numerous tools have been developed to audit and assess algorithms that could be used in health and care to improve the trustworthiness of AI systems. This session will look at whether these tools can provide technical fixes to the problems that arise from deploying an AI system, discuss which tools are particularly promising, and consider the benefits and limitations of adopting them more widely.
Brent Mittelstadt - Senior Research Fellow - Oxford Internet Institute
Lizzie Barclay - Medical Director - Aidence
Mavis Machirori - Senior Researcher - Justice and Equalities - Ada Lovelace Institute
Brhmie Balaram - Head of AI Research & Ethics - NHSX (Moderator)
Brent Mittelstadt is a Senior Research Fellow in data ethics at the Oxford Internet Institute, University of Oxford, as well as a Turing Fellow and member of the Data Ethics Group at the Alan Turing Institute, and a member of the UK National Statistician’s Data Ethics Advisory Committee. He is a philosopher focusing on ethical auditing, interpretability, and governance of complex algorithmic systems. Brent also coordinates the Governance of Emerging Technologies (GET) research programme at the OII, which investigates ethical, legal, and technical aspects of AI, machine learning, and other emerging technologies.
Medical Director in AI-driven healthtech for more than two years. Driven by the potential that technology has to improve patient outcomes and reduce health inequalities. I think clinicians have an important role to play in the healthtech (AI) sector to influence and nurture a responsible, trustworthy culture that emulates the core values of the healthcare profession. Medical background: UK-trained doctor with 10 years' experience of training and working in the NHS, including 2 years in Clinical Radiology. Previous clinical research/audit focus areas: chest radiology, early lung cancer diagnostics.
This session is on improving the trustworthiness of AI systems in health and care, moderated by Brhmie Balaram, Head of AI Research & Ethics at NHSX. Welcome, everyone — I'm looking forward to a great session with Brent Mittelstadt, Senior Research Fellow at the Oxford Internet Institute; Lizzie Barclay, Medical Director at Aidence, a company that specializes in AI-powered applications for oncology; and Mavis Machirori, Senior Researcher in Justice and Equalities at the Ada Lovelace Institute. Our discussion will focus on tools that have been developed to audit and assess algorithms, whether these can provide technical fixes for the problems that arise when deploying AI systems, and the benefits and limitations of adopting these tools more widely. Lizzie, can we start with you: why is trust such an issue for clinicians?

Thanks, Brhmie. Generally, the public have some understanding of the level of training that we've gone through as doctors, and of the professional accreditation process that is expected of doctors — standards that doctors hold themselves to and that the public respect.
And of course, that doctors have a high level of medical knowledge and keep up to date as medical knowledge increases and clinical guidelines change, so that they can provide the best possible care for patients. So when it comes to AI, we're asking a clinician to trust a technology to potentially make a clinical decision for them, or influence that decision, without a full explanation of how the algorithm came to it. That's a big ask, so it doesn't surprise me that there's still quite a bit of reluctance, or at least hesitancy, among clinicians about using AI — and unfortunately many clinicians have had previous negative experiences with it. There's trust in the quality of the AI itself — the algorithm and its performance, whether it meets expectations — and there's also the quality of the service that the NHS and other healthcare systems receive from the company providing the AI. People say to me: what happens if something goes wrong with the AI? Can I get hold of your company? How quickly will you respond, and how will the problem be resolved? So as well as trust in the AI itself, providers need to develop relationships with clinicians, so that clinicians are able to have faith in them and they are able to provide assurance.

First of all, I agree with what Lizzie has already mentioned about asking clinicians to trust the system. Before that, though, there's the question of what we even mean by AI — because if we don't have a shared definition, how can everyone expect clinicians to be able to explain what the system does to patients? Then there are issues around the quality of the data that is used to build these AI tools. We've heard a lot about insufficient data, about lack of diversity and about missing data. So when we think about that data: what has happened to it, and how old is it?
You can have an amazing tool built with the most amazing dataset, but if that data doesn't sit well with the way people experience healthcare as a whole system, people aren't going to trust the technology or the system itself. And we know there are very recent examples — for instance in the States, about two years ago, a study by a group led by Ziad Obermeyer looked at the way AI was used to classify risk for people in America; the algorithm didn't take historical inequalities into account and so re-created a system of inequality that already existed. So when we're thinking about the challenges of adopting AI, we should think not just about the data but about the social structures: what data is used, how is the diversity in the data accounted for, and does the resulting technology actually map onto what people expect?

Yeah, I think that's a really helpful point, and I will come on to it later
on, but it's worth noting that tools alone can't address things like how we account for diverse health needs, or some of the inequalities in the system that AI could potentially replicate. Let's turn to the tools that are currently being developed to improve health and care. Brent, can we start with you? It would be great to hear more about what's out there.

Yeah, absolutely — I'm very happy to talk about some of the things that exist that will hopefully improve the situation. There are lots of different types of tools and
interventions that in general are being developed with a sector-neutral approach — not specifically for health and care, but generically, so that they then need to be applied in the health and care space. I like to think of three categories of tools. First, what I would call documentation standards and impact assessments: things like datasheets for datasets, model cards for model reporting, dataset nutrition labels — basically some type of standardised documentation that should be filled out by the developers of AI systems, or by the people collecting and curating the training datasets, so that you know how the data was collected, what biases are within it, any potential legal or ethical considerations, these sorts of things. Similarly, you have algorithmic impact assessments, equality impact assessments and data protection impact assessments, which are meant to be applied by organizations developing or using AI systems, and there is a good alignment between these and the requirements of the forthcoming Artificial
Intelligence Act from the EU and its documentation and logging requirements. The second category would be fairness and bias testing, and there are many, many types of fairness and bias testing that I won't go over in detail. But we have at least thirty different ways of measuring fairness in statistical terms, and many different ways of measuring representativeness and bias in training data and in production data. If you're interested in the fairness metrics in particular, there's a session at 12:00 that I'll be involved in, where I'll go into much more detail than I can here. The third category, which I think is very promising, would be different methods for transparency and explainability: producing explanations of how AI systems work, or how they behave, at a global level or a local level, designed for end users, for customers, for expert users, for practitioners and clinicians — depending on who the system is meant to serve. A particular challenge in relation to health and care will be to make sure that when you're using those methods, they actually capture the existing biases and the gaps in representation that you see in patient data and in access to care, because it could be very, very easy to replicate those problems and not question the status quo. And that's really what these tools are meant to do in order to enhance the trustworthiness of both the system and the environment it's being used in.

Can I add to that? On impact assessments — it's a great area that is developing quite quickly. And I know
that the AI Lab at NHSX has been working on some of these tools. One of them is on chest imaging, with a research project around what fairness looks like and how we can think about assessments right from the beginning, and there are recommendations around justice and fairness coming from elsewhere too. Government has also invested a lot in AI, and we know that whatever we do needs to build on the existing guidance on algorithmic decision-making. The underlying theme in all of this is transparency in everything that you do — so the impact assessments and audit tools that we're using need to be openly available, and I think that's great work going on. That was just to add to it.

Thanks, Mavis. So I think
it would be great to ask Brent about some of the challenges and limitations of these tools and interventions.

Yeah, absolutely. I hinted at one of the key challenges for these tools, which is representativeness — again, just questioning the status quo of how healthcare is delivered, both in the UK and in different healthcare systems. We can also think of challenges that relate to, say, the technical robustness or accuracy of these different methods. If I'm giving you an explanation, for example, of how your case was decided or how a model works, I need to make sure that it's actually correct, and that it's delivered to you at a level that is comprehensible and useful — that you can actually understand it, based on your understanding of the system and of the methods. That's a difficult challenge when you are using, say, one tool to provide explanations, or to disclose information, designed for different sorts of audiences. So we really do need to recognise that it's not just one set of tools that will be implemented for every system; you may need different tools — say, counterfactual explanations for practitioners and locally specific explanations for patients — because those are what those audiences would actually need. Those are the general challenges. But I think there are also some specific gaps we see in regulation. There are plenty of gaps in regulation around AI, say in the
GDPR and the requirements that exist there for transparency around automated decision-making. Even the forthcoming regulation of the Artificial Intelligence Act, for example, being put out by the EU — it may be a big step forward in terms of getting trustworthy tools actually into practice and used in a good way, but there's a real lack of specific requirements in that regulatory framework, of specific standards around how bias and fairness should be tested for. In general, the standards will be set by developers or by industry, rather than by regulators or, say, end users. There's no specific requirement to give users explanations — so you as the patient, at least as far as the Artificial Intelligence Act is concerned, aren't guaranteed any level of information. And human oversight — that's a big thing being called for in that framework, but it's only called for for high-risk systems, and only where technically feasible. With really complex black-box systems, where we don't understand how they work, it may not be the case that the framework gets us any closer to a trustworthy system, because it's not actually constraining how the system is designed; it's just saying, if possible, please make sure human oversight is in place. So yes, lots of challenges — regulatory, technical, and around how we implement these tools in organizations — that will require a lot of work over the next few years.

That's a really helpful overview, Brent.

Yeah, and I think when it comes to regulation,
we're seeing it from both sides: I trained in the NHS, but I'm now seeing it from a different perspective. What I realize is that, as a company, we have so many forms to fill in — it can seem like a tick-box exercise. And even for clinicians, inside or outside the NHS, just because a product has passed its compliance checks doesn't mean they trust it. Compliance and tick-boxes don't capture actual trustworthiness — looking at bias and all these things, and the way we design the tools from the beginning — and it's about groups in society as well as just individuals. So my concern is that if more and more regulations and standards come out that are designed without the inclusion of industry — without industry having a seat at the table to help design these tools — then all these great tools will be out there, but without buy-in from industry to actually implement them. I mean, if it does become a legal requirement in regulation, of course it has to happen. But I think the best way we can do it is to collaborate: get all the relevant people involved, including industry, with a seat at the table, deciding what needs to happen, what's realistic, and how much time and money we have to spend on these things. How do you incentivise med-tech investors and the decision-makers at companies to actually use these tools
and include industry when we think about how to make adoption really meaningful?

Yeah, I think that's a really good point, Lizzie — regulation can't be entirely top-down; there needs to be some meeting in the middle in order to secure buy-in. Mavis, I wonder if you can tell us about the work you're doing in this area.

So, as I mentioned, this really is a great time to try to start to understand the issues that need to be dealt with in building trust. The Ada Lovelace Institute is looking at ways to support algorithmic inspection and audits of AI, and we're trying to scope some of this work. In Europe, some countries and cities have created AI registers — in France and in Helsinki, for example — and in the UK we are doing work with DataKind, trying to establish, as I said, what we even mean by these terms, because understanding what we're talking about is where trust-building starts. We also need to figure out who else needs to be at the table when we're talking about AI — beyond, yes, bringing in industry. We had a great workshop last week bringing together people from different disciplines, because the questions we each ask tend to be a little bit different, and that brings different expectations about AI and about the use of data into the conversation. And the final thing that we're doing is thinking about how AI and data-driven technologies affect health and social inequality. We're leading a project with the Health Foundation trying to understand how this impacts not just the NHS but society as well. It's about looking at this as a whole ecosystem, rather than as one thing: trustworthiness can't sit in only one particular area; it needs to run through the whole system.
Yeah, absolutely — I'll just say something about some of the work I've already done and then the next thing I'll be doing. I've done a fair amount of work around developing different accountability and trustworthiness tools that roughly fall into the categories I mentioned earlier. I helped develop counterfactual explanations, which is basically a user-friendly method of explaining the behaviour of black-box systems. It's been implemented by a number of companies — it's in Google Cloud now, for example, and HSBC have used it. It's the sort of thing where you really do have to look at how it's implemented in an organization, to make sure it's being used in a robust way, in an honest way, and in a way that's ultimately useful either for the users or for the practitioners getting these explanations. I've also done work in the space of fairness and bias testing: I developed a metric called conditional demographic disparity, which lines up with the requirements of non-discrimination law in both the UK and the EU, and that's been implemented by Amazon, so it's available to all customers of Amazon Web Services. Likewise — and this will come up a bit more in the 12:00 session — we developed the notion of bias preservation, which basically says that when we're thinking about things like fairness and bias in this space, then according to non-discrimination law, certain ways of thinking about and measuring fairness will be more or less acceptable, aligned more or less with the aims of the law — in the sense that really we should be questioning the status quo and using metrics that will actually capture
the biases that exist in society and try to improve on them, rather than just taking the status quo as a neutral starting point. So that's what I've done already. In the near future, I'll have a project starting with NHSX and with Aidence, looking at how we do trustworthiness auditing in the space of healthcare and scientific research. We're thinking of trustworthiness in that project not just as a characteristic of the system itself, but more as a feature of the environment and the people that are using it, and as something that needs to be maintained over the life cycle of a system, rather than just designed in before the system is actually used. And really, the question we'll be answering in that project is this: we have all these different accountability and trustworthiness tools — there are far too many checklists, impact assessments, fairness metrics and all these things that have been developed — but what we're really lacking is real-world evidence of their effectiveness. We don't know if they actually make a difference in practice. That's essentially what we'll be doing in the project: we'll be
looking at how effective these things are, and how they can be used in a way that is good for the customer, the patient, the end user, but also effective and bought into within an organization.

That's a really exciting project — looking forward to that one. Thanks, Brent. Lizzie, it would be great to hear about the other things you're trying to get involved in.

One thing we've been able to do with the NHS is gather evidence, through an evaluation of the actual clinical impact, and the impact on health economics, of our AI device — external funding has allowed us to do that. Hopefully that will help with trust, because for me we need evidence: evidence-based medicine. Ideally we'd be able to do that more formally, with randomised controlled trials. I've also been working on education. I'm not scared to ask data scientists questions, because I work with them every day, and I've built the ability to question the status quo, question things, find out why things work the way they do. What we're trying to do now is actually serve the patients and public who might be impacted by the software. So we started an "opening the black box" series to provide some educational information, for clinicians and for patients and the public, around the use of AI: the training and standards, what information we should be providing to the NHS, and what level of detail users need to understand about the AI they're using — without overwhelming them and making them think they need to take a data science degree or a degree in machine learning, but providing enough information to empower them and give them confidence in using the tool. They will know that AI isn't perfect, and that's why they need to recognise when to reject its results as opposed to accepting them. That's something I'm really trying to work on with our current users: getting feedback on what more information they need to give them more confidence in using AI.
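One common way products support that "recognising when to reject" is to surface the model's confidence and route low-confidence cases back for full clinician review. A hypothetical sketch — the threshold, case identifiers and labels below are all invented for illustration:

```python
# Sketch: route AI outputs below a confidence threshold back to the clinician.
# The threshold and case data are hypothetical, for illustration only.

REVIEW_THRESHOLD = 0.80  # assumed cut-off; in practice set per validated use case

def triage_prediction(case_id, label, confidence):
    """Present a prediction only when its confidence clears the threshold;
    otherwise flag the case for full clinician review."""
    if confidence >= REVIEW_THRESHOLD:
        action = "present_to_clinician"
    else:
        action = "flag_for_review"
    return {"case": case_id, "label": label, "confidence": confidence, "action": action}

print(triage_prediction("scan-001", "nodule_detected", 0.93)["action"])  # → present_to_clinician
print(triage_prediction("scan-002", "no_finding", 0.61)["action"])       # → flag_for_review
```

The key design point is that the clinician, not the system, remains the decision-maker for anything the model is unsure about.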
Yeah, I'm really excited to hear that. My final question today is for everyone: what more could be done? It would be great to hear from each of you.

We have to think about how AI is impacting core clinical relationships and core clinical interactions. The AI systems that people seem to be talking about the most right now in the healthcare space are trying to do one of two things. Either they're trying to automate some of the routine or labour-intensive tasks that doctors need to do — which is fantastic — or they are trying to somehow be embedded in the patient interaction, in the sense of almost a chatbot-type system or a triage system. And it's those ones in particular where I think we really need to pay attention to how that is changing the role of the doctor. You would be hard-pressed to find many doctors who would be against the idea of automating some of their more routine, labour-intensive, very repetitive tasks. But when it comes to actually interacting with the patient, that's really where a lot of medical ethics is concerned, and where I think we could start to see some harmful effects emerge from these systems. In the same way that if you call up a company's customer service and you get a chatbot or an automated system, you very quickly become frustrated and feel like you're not being taken seriously — I think we need to be very careful to make sure we don't get any effects anywhere near that in the space of healthcare. I'd like to see a lot more
attention being paid to the impact of AI on the doctor-patient relationship.

That's a fair point. Lizzie?

I think my main point would be around standardising user training. As a radiologist, if different companies send me algorithms for the different imaging and diseases that I'm looking for, and they each explain their AI to me in different ways and deliver the training in different ways, I can see myself getting confused about the intended use of each AI algorithm and its limitations. If user training can be standardised, clinicians can more quickly gain confidence in using different AI tools for their patients. So I think that would be my main thing — it's where I think standards could really help.

And finally, Mavis?

For me there are two main things. One is that the people whose data is used to train the AI should also be the people who benefit from
the AI. A lot of the time you might find that we collect data from certain groups of people, but when the systems are deployed, those people don't benefit in the same way. It's about thinking about which areas are most in need, and — something Brent said about impact, actually — about the impact on the people we are asking to make use of this data. The second thing is making sure that we recognise the interaction between the social and the technical. Whether we're deploying AI as a medical device or elsewhere, we need to be explicit about who has responsibility and accountability. And on the back of what's been happening with the conversations around data — what data is used to create the AI — we also need to know who is using and who is creating the system, and which partnerships the NHS is choosing to pursue. A lot of the lack of trustworthiness, or lack of trust, comes from this grey area around fully understanding how these partnerships are created — who gets access to the data, what it means for the NHS as a brand, and who the NHS is creating these partnerships with.

Yeah, I think that's a really important point about transparency, and we do have the new Centre for Improving Data Collaboration, which hopefully will be able to shed some light on that. We have an audience question — it might be a good one for you, Lizzie — about whether a new user could be a potential
adopter; the question is around a clinician with no expert knowledge of the specific disease.

Great question. I think it goes back to my earlier point: I really think it's the responsibility of the providers of the tool to provide good-quality explanations of the device and the algorithm — what data it was trained on, and how it's been tested pre-market and monitored post-market to check it's working the way that's expected within your local hospital trust. I'm not a data scientist, but I understand enough to help shape the information we provide to clinicians and other healthcare professionals who will be using, or will be responsible for using, that device. We need to find the right balance of information to give, so that you feel comfortable using it. So far, the people I engage with in the NHS tend to be early adopters who are keen on AI, so we need to make sure we make it just as accessible and understandable for people who have no previous interest. We really need to find that balance, and I do think we as the healthcare industry have a responsibility to do that, ideally in collaboration with the NHS.

I think that's a really great answer — and yes, definitely something I want to keep working on. Thank you all for sharing your insights, and I look forward to working with you all.
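As an illustration of the fairness testing discussed in the session, the conditional demographic disparity metric Brent mentions can be sketched in a few lines. This is a minimal sketch following the metric's published definition as I understand it; the records, group labels and site names below are invented for illustration:

```python
# Sketch of conditional demographic disparity (CDD). Demographic disparity
# (DD) for a group is the group's share of all rejected outcomes minus its
# share of all accepted outcomes; CDD averages DD over the strata of a
# conditioning attribute (here, "site"), weighted by stratum size.
# All records below are invented for illustration.

def demographic_disparity(records, group):
    rejected = [r for r in records if not r["accepted"]]
    accepted = [r for r in records if r["accepted"]]
    p_rejected = sum(r["group"] == group for r in rejected) / len(rejected)
    p_accepted = sum(r["group"] == group for r in accepted) / len(accepted)
    return p_rejected - p_accepted

def conditional_demographic_disparity(records, group, stratum_key):
    strata = {}
    for r in records:
        strata.setdefault(r[stratum_key], []).append(r)
    n = len(records)
    return sum(len(s) / n * demographic_disparity(s, group)
               for s in strata.values())

records = [
    {"group": "X", "accepted": False, "site": "clinic_1"},
    {"group": "X", "accepted": False, "site": "clinic_1"},
    {"group": "Y", "accepted": True,  "site": "clinic_1"},
    {"group": "Y", "accepted": False, "site": "clinic_1"},
    {"group": "X", "accepted": True,  "site": "clinic_2"},
    {"group": "Y", "accepted": False, "site": "clinic_2"},
]
print(round(conditional_demographic_disparity(records, "X", "site"), 3))  # → 0.111
```

Conditioning on a stratum (here, site) is what distinguishes CDD from a plain disparity check: it asks whether the gap persists once a legitimate explanatory factor is held fixed.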