Natural Language Processing for Healthcare: Current Trends... By Amir Tahmasebi, Director, CodaMetrix


About the speaker

Amir Tahmasebi
Director at CodaMetrix

Amir Tahmasebi is the Senior Director of Machine Learning and AI at CodaMetrix, Boston, MA. Prior to joining CodaMetrix, Dr. Tahmasebi was a Principal Research Engineer at the Disease Management Solutions business of Philips HealthTech. His research has focused on patient clinical context extraction and modeling through medical image analysis and Natural Language Processing. He received his PhD in Computer Science from the School of Computing, Queen's University, ON, Canada. He is the recipient of the IEEE Best PhD Thesis award and the Tanenbaum Post-doctoral Research Fellowship award. He is a committee member of MICCAI 2020 and has served as an industrial chair for the IPCAI conference since 2015. Dr. Tahmasebi has published and presented his work in a number of conferences and journals including NeurIPS, MICCAI, IPCAI, SPIE, JDI, IEEE TMI, and IEEE TBME. He has also been granted more than 12 patents.


About the talk

It is estimated that the healthcare industry will produce 2.3 trillion gigabytes of unstructured data in 2020 alone, and this increasing trend will continue at a rate of 48% annually [The NY Times]. With recent advancements in Natural Language Processing (NLP) and the introduction of Transformers, specifically the BERT language model, there are high hopes for NLP technologies to facilitate and expedite automatic extraction and structuring of healthcare data for decision support and better diagnosis and outcomes for the patient, while reducing risk and cost. Nevertheless, the utilization of off-the-shelf NLP on healthcare data has yet to prove its effectiveness and scalability. Recently, tremendous efforts have been dedicated by NLP research pioneers to adapting general-language NLP techniques to the healthcare-specific domain. This talk reviews some of the current challenges researchers face, reviews some of the most recent success stories, and finally draws a vision for the future.

Transcript

I thought I could start today's talk on the application of natural language processing in healthcare with a broad review of what's currently going on in the field and what, at least from my perspective, the future of NLP in health is going to look like. Without any further delay, I'd like to get started with this slide. If you're interested in AI in healthcare and NLP, I believe you're in the right session.

Because this conference probably has a broader audience than just NLP specialists, I'll start with a quick introduction: what NLP is, and how we teach machines to understand human language as well as generate human-like language. If you've been following the news, there has been tremendous improvement in machines' ability to understand human language. I believe many of us have devices at home that we use for voice recognition, such as Alexa or Google devices; improvements are happening in these on a daily basis, and that's thanks to the improvements behind the scenes.

I'll try to give some examples of how teaching a machine to analyze, understand, and generate human language can be used in different contexts: a dependency parsing example at the top, and understanding context and summarizing text as the two examples below. In terms of what's going on, there has been quite a lot of improvement in the NLP domain since about 2018, and that's thanks to improvements in language modeling, proposed initially by Google, that have been booming ever since.

If you look at the conferences, every year the number of publications in the NLP space is increasing, and there seems to be a kind of revolution going on in this domain compared to traditional, classical NLP. For 2020, I can summarize the ongoing trends into four categories. One would be on the data side: as we know, there's a lot of data being captured in the form of text, and being able to structure that data, summarize it, extract facts from it, and even run analytics on it is a key capability in different fields such as finance, healthcare, and other domains.

Next, I believe all of us can appreciate the improvements we've seen in language modeling and translation. I don't know how often you use Google Translate, or the new capabilities in Gmail such as autocomplete, where you type a couple of words and it suggests the rest of the sentence, or auto-correction; these features are gradually arriving in the applications we use daily, and it's all because of the improvements we're seeing in language modeling.

The next category I would file under customer service. On most of the websites I visit these days, a chatbot pops up on one side of the screen trying to offer some help. These are a form of virtual assistance, and we're seeing more and more adoption of these technologies in customer service; the biggest news here, I think, would be the acquisition of MetaMind by Salesforce, which shows how clearly they recognize the capabilities NLP brings to customer service.

And finally, in the domain of security and surveillance, in social media, and in advertising, there's a lot of NLP embedded; I'm referring, for instance, to Facebook or Twitter. There's a lot of text data being created, and being able to analyze this data for sentiment analysis, product reviews, and so on offers a lot of opportunities for NLP, and we see a lot of interesting outcomes from it. It also helps in content filtering: as all these posts come online, there is of course a need for a mechanism to filter out irrelevant content.

In terms of technology highlights, I tried to come up with what are, in my opinion, the top three of 2020. The first is BERT, Bidirectional Encoder Representations from Transformers, which was initially introduced by Google in 2018 but is still one of the hottest topics even in 2020, and I'm sure that will continue.

BERT came out in 2018, and different flavors of it arrived in 2019 and 2020: ALBERT, RoBERTa, DistilBERT, TinyBERT, which are efforts from the community to decrease the complexity of the BERT language model while maintaining its performance. In the middle column you'll see BERT models tuned specifically for special domains, such as BioBERT for biomedical context, ClinicalBERT, and SpanBERT, all models proposed by different research teams to provide a better tuning of the general language model for specific applications.

And finally, the last column summarizes some of the efforts in the space of transfer learning: taking the BERT language model and tuning it for specific languages. mBERT is a multilingual BERT language model, BETO is Spanish, AlBERTo is Italian, RuBERT is Russian, and CamemBERT is the French one; as you can see, people have had fun with the naming of these language models.
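Both kinds of specialization, by domain and by language, are consumed the same way in practice: you load a different checkpoint. Here is a minimal sketch of that, assuming the Hugging Face transformers library and a publicly released clinical BERT checkpoint; the checkpoint name and sample sentence are my own illustration, not something shown in the talk.

    # Minimal sketch: swapping a general BERT checkpoint for a clinically tuned one.
    # Requires the `transformers` and `torch` packages.
    from transformers import AutoTokenizer, AutoModel

    # Any domain- or language-specific checkpoint can be substituted here.
    checkpoint = "emilyalsentzer/Bio_ClinicalBERT"
    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModel.from_pretrained(checkpoint)

    note = "Left prostate biopsy: adenocarcinoma, Gleason score 3 + 4."
    inputs = tokenizer(note, return_tensors="pt")
    outputs = model(**inputs)

    # One contextual embedding per token, ready for a downstream task head.
    print(outputs.last_hidden_state.shape)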

The next highlight in technology: you've probably heard the news, if you've been following, about GPT-3, which is from OpenAI, a nonprofit organization focused on beating humans at their own game in terms of technology. What does GPT-3 do? Basically, it's a multitask NLP capability. As you can see in the different animations on the right side, it can serve different purposes: by providing a narrative, a very human-like narrative, you can get quite interesting outcomes in terms of code generation and in terms of constructing an appropriate answer to your question.

This is a very, very big improvement compared to the previous version, GPT-2, and if you look behind the scenes, it's thanks to the 175 billion parameters used to train this model, which cost about 12 million dollars according to the article. If you want to compare that to the human brain, the brain has about a hundred trillion synapses, so we're still pretty far from human-brain levels of intelligence, but we're on the right track, and even though it cost 12 million this time, the forecast is that this cost is just going to get lower and lower.

So that was the second one. For the third one I was debating, so I consulted my colleagues: I feel that graph neural networks are also going to be the next big thing, and you'll see a lot more of them in 2021. What does that mean? It's related to graph representations: a knowledge graph represents a collection of interconnected entities, and as we know, the entities around us are all somehow connected to each other, spatially, temporally, or in other ways.

Graph usage in NLP is not new; graphs have been an essential part of NLP applications such as translation, dependency parsing, and so on in the past, but more recently we've seen their adoption in the form of neural networks, graph neural networks, and as a result some cool models that take advantage of this interconnection between concepts to drive multitasking capabilities.

Having said that, I want to start switching now to healthcare. For NLP in the general domain, I tried to summarize the tremendous improvements across different applications; what about NLP in healthcare? Here I thought I'd start with the results of a survey, conducted by Gradient Flow, which shows the adoption of different NLP technologies provided by the big pioneers of NLP technology such as Google Cloud, AWS Comprehend, Azure from Microsoft, or IBM Watson.

They asked people from different industries, such as computers and electronics technology, shown in dark blue, healthcare in light blue, and financial services in green. I'd like you to pay attention to healthcare, the light blue: as you can see, the answer with the highest share in terms of adoption of these kinds of services within the organization corresponds to 'other' or 'none', meaning that none of the off-the-shelf technologies really come in handy for solving healthcare NLP problems.

This is for many reasons: it's not really as simple as a general language problem such as general-domain translation or Q&A. I've summarized these differences in a few items. I'd start with the size of the data we're dealing with: an individual EMR record is not large, but if you consider the number of records we're generating these days, we're talking about an enormous volume of text documents, and there are real requirements for being able to handle this amount of data in the healthcare space.

Another challenge along the way is the format of the data. If you're familiar with healthcare, in the imaging space, since the adoption of DICOM, there's quite a lot of standardization in the format of the data. On the text side, we're still struggling with standardization: sometimes the data comes in HL7 format, or FHIR, but the structure and the meaning are still debatable and there's still quite a lot of research going on.

So up to today, we can't really say we have an agreed, standard way of structuring healthcare data in terms of format. On the other side, what about the structure of the text itself? Unfortunately, there's typically no structure; this is free text. Some organizations in healthcare use templates, but mostly the narratives are unstructured, which is what makes this quite challenging.

On top of that, there's the language itself. For instance, let's look at radiologists: their notes are quite telegraphic, basically short descriptions of events, and this is because of the nature of the job. They have to go through a lot of cases; as a result, they have about 40 seconds to spend per case, and they have to describe whatever they observe, plus the action they recommend, in a short amount of time. So you end up with a lot of observations that are not very standard, a lot of telegraphic ways of mentioning things. That's yet another challenge to add to the list.

In terms of common tasks for applying NLP in the healthcare space, I also tried to divide them into four groups. I'd start with the most general one, which is taking a free-text input and structuring it for downstream tasks. This could be used for decision support for the radiologist or the physician, and so on, or it could be used for analytics, population health management, etc.

NLP can also be used for another common application, classification; an example of classification in the pharma space would be clinical trial matching. There are lots of descriptions of different trials, and being able to match the patient to the right trial is still a challenge, so pharma companies have started adopting NLP technologies to help them better classify patients. Moving to the next common task, that would be Q&A: doctors would love capabilities where, using some sort of natural language, they ask a question about the patient, the engine behind the scenes processes all the data about that patient and provides an answer, and if you ask where it figured this out from, it can show the evidence: from this document, I found this answer to your question.

GPT-3 is doing a great job for general language; I'm not so sure about healthcare, because healthcare has its own challenges and needs its own technology. And finally, I would also consider prediction-type tasks a common task: given all the events in the history of the patient, can you help me know what the next likely event is for this patient, a heart attack or another common problem, given all the symptoms that have been seen in the past?

What does a typical NLP pipeline look like for a health application such as the ones I described, taking free text and structuring it? Let's set aside the problem of OCR for the moment. For the sake of example, I'm taking a snippet from a pathology note, which describes the specimen location and some attributes of that specimen, in this case prostate biopsies. The location is the left prostate, and the Gleason score, which is a way of grading the cancer, is described as you can see.

What I want to get out of this is to structure this information: where is it that I'm getting the specimen from, and what is the grade or the type of the cancer? The first step typically is pre-processing, and many people underestimate the impact of this pre-processing on the downstream task. In this example, look at the spacing around the numbers: if you don't take care of it, it will prevent the whole entity from being captured together. Pre-processing, hopefully, will take care of that.

The next step would be sentence parsing, meaning breaking my document into segments that are separate from each other, at least in terms of sentence boundaries. Once I have that, the next task would be tokenization, meaning breaking down each sentence into individual words. And from there we move to the final downstream machine learning or AI task, which in this case would be structuring the note for me. If I don't do the first steps properly, I'm just going to make the machine's task much, much harder and force it to deal with a lot more noise.
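Before looking at specific tools, here is a minimal sketch of those three steps, pre-processing, sentence parsing, and tokenization, on a made-up pathology snippet; the spaCy model name and the fix-up rules are my own illustrative assumptions, not the speaker's pipeline.

    import re
    import spacy

    nlp = spacy.load("en_core_web_sm")  # general-purpose English model

    raw = "SPECIMEN: Left prostate biopsy. Gleason score:3+4=7."

    # Pre-processing: restore the spaces the note is missing around the score,
    # so the whole entity can be captured together downstream.
    clean = re.sub(r":(?=\S)", ": ", raw)             # "score:3+4" -> "score: 3+4"
    clean = re.sub(r"(\d)\+(\d)", r"\1 + \2", clean)  # "3+4"       -> "3 + 4"

    doc = nlp(clean)
    for sent in doc.sents:                            # sentence parsing
        print([token.text for token in sent])         # tokenization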

There are a lot of off-the-shelf tools out there for doing these kinds of common tasks, but I don't believe they're there yet, and as a result I want to show some examples. Let's take a note from MIMIC-III, which is a publicly available clinical database.

A common off-the-shelf tool used for sentence parsing is spaCy, and for general language it works great. Let's see how it performs on a clinical document. As you can see, the parser didn't do so well: for instance, on these lines of continuous underscores, basically every run of underscores has been detected as a separate sentence. Let's look at NLTK, another common tool: this whole header part has been segmented as one sentence.

And finally, in 2020, Stanza from Stanford NLP came out, which looks very promising; it even has healthcare-related tuning embedded in it, and from our first tests it provides impressive results. So it's very important to pay attention to these steps and choose the right tool, and if the right tool doesn't exist, that means we have to build it from scratch, and that takes innovation. I have a similar example for tokenization, a snippet of a pathology note, where you can see that with spaCy the accession number is broken down into pieces, which breaks its whole structure, so it's going to be harder to deal with down the line; the date has also been broken down in the wrong way.

Stanza does a much better job, again, as we can see, so its tuning is pretty good. But again, if I'd like to extract a Gleason score such as 3 + 4, Stanza's capability alone is not sufficient for that, so I need to add something on top of it.
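That "something on top" can be as simple as a rule layered on the tokenizer output. Here is a minimal, hypothetical regular-expression extractor for the Gleason score; the pattern and the note text are my own illustration, not code from the talk.

    import re

    note = "Left prostate biopsy: adenocarcinoma, Gleason score 3 + 4 = 7."

    # Capture the primary and secondary Gleason patterns regardless of spacing.
    pattern = re.compile(r"[Gg]leason\s+score\s*:?\s*(\d)\s*\+\s*(\d)")
    match = pattern.search(note)
    if match:
        primary, secondary = map(int, match.groups())
        print({"gleason_primary": primary,
               "gleason_secondary": secondary,
               "gleason_total": primary + secondary})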

I've also put up a couple of examples we've been working on, along with performance metrics, for named entity recognition. This is basically a snippet of a radiology note, and the goal is to extract concepts such as diagnosis, test, and treatment; as you can see here, they're being detected, and if you take pre-processing and sentence parsing properly into consideration, the downstream task has a much easier path to success. Another example would be a classification task such as anatomical labeling: for the same note, we try to assign an anatomical label to each sentence.
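As a sketch of the named entity recognition step just described, extracting problem, test, and treatment mentions, Stanza ships clinical models trained on MIMIC-style text; the package and processor names below reflect Stanza's published biomedical pipeline to the best of my knowledge, and the note text is made up.

    import stanza

    # Clinical tokenization plus i2b2-style NER (PROBLEM / TEST / TREATMENT).
    stanza.download("en", package="mimic", processors={"ner": "i2b2"})
    nlp = stanza.Pipeline("en", package="mimic", processors={"ner": "i2b2"})

    note = ("CT of the chest showed a 6 mm nodule in the right upper lobe. "
            "Recommend follow-up CT in 12 months.")
    doc = nlp(note)
    for ent in doc.entities:
        print(ent.type, "->", ent.text)   # e.g. TEST -> CT of the chest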

And finally, in terms of the prediction task, I chose computer-assisted coding: that means assigning procedure and diagnosis codes given the textual context about the patient. There are lots of players today in the development of NLP for healthcare, to name a few: IBM Watson Health, Azure, Comprehend Medical from Amazon, and so on. So it's very important to make the correct selection of tools for the application and make sure each one is doing the job expected of it.

A typical way of adopting these technologies is to start with a pre-trained model and bring your own application data, that is, annotated, labeled, or hospital-specific data; you fine-tune the pre-trained model, and now you have a fine-tuned model. Then you start putting it into the application: you bring in a new note, you realize the performance isn't there, and what happens is that you go back and start adding some rules for that situation, and those fixes never feed back into the fine-tuned model. Some would say there's no problem with that.

The way we should look at this is: that's okay, but we also have to take advantage of the information we're gathering to update the model. So sample your output, do some QA with experts in the domain, and use that knowledge to improve your rules as well as to update your labeled data; then do the fine-tuning again and obtain a better model. This is referred to as active learning, or continuous learning.
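Here is a schematic sketch of that feedback loop in code form; every callable is a placeholder for whatever fine-tuning, deployment, and expert-review tooling an organization actually uses, so treat it as an outline rather than an implementation.

    from typing import Callable, List, Tuple

    def active_learning_loop(
        fine_tune: Callable,          # (model, labeled_notes) -> model
        deploy_and_sample: Callable,  # model -> low-confidence predictions
        expert_review: Callable,      # predictions -> corrected labels
        model,
        labeled_notes: List[Tuple[str, str]],
        rounds: int = 3,
    ):
        model = fine_tune(model, labeled_notes)
        for _ in range(rounds):
            uncertain = deploy_and_sample(model)         # sample production output for QA
            corrections = expert_review(uncertain)       # clinicians/coders fix the labels
            labeled_notes = labeled_notes + corrections  # grow the training set
            model = fine_tune(model, labeled_notes)      # re-tune on the updated data
        return model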

What is the future going to look like for healthcare NLP? From my perspective, scaling and generalization are the biggest things to address: a lot of the models we're using and dealing with these days lack scaling and generalization, and domain adaptation is really a no-brainer for healthcare. At the same time, we should watch the complexity and the footprint of our models and work on reducing that complexity, so it's affordable for all users, not just the ones that can afford to train a giant model with hundreds of billions of parameters.

Another interesting future direction, for healthcare specifically, is federated learning: you start with a universal model and then fine-tune it locally within different organizations, which is also interesting because you avoid transferring sensitive data outside of the organization for training purposes. And what I'd see further in the future (that was the immediate future) is this: sometimes we create problems and then try to come up with solutions for them. The radiologist dictates whatever they see on an image, which is an example of dictation, and then we run NLP to build the ability to structure that data. Why do we need to go through that? What is the point?

The point is to create useful information for the physician for the next time the patient comes in, so that longitudinally we can track the data. What we could do instead is capture the observation immediately as structured data, rather than creating a narrative and then struggling to build NLP that works on that narrative. This is the way I see the future should look: going immediately from observations and findings to actions and structured data, avoiding these intermediate problems that we struggle to solve.

But I'd like to finish with a word of caution, specifically about building AI and NLP capabilities in the healthcare space. The way we look at these is typically to measure performance in terms of true positives, false positives, and false negatives, and that's fine for applications like detecting cats and dogs, because if you make a mistake and call a dog a cat, it's not the end of the world; but in healthcare, if you produce a false negative or a false positive, you're talking about lives.
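As a tiny, made-up numerical illustration of why this matters: with the hypothetical confusion-matrix counts below, overall accuracy looks reassuring while recall (sensitivity) exposes the missed cases.

    # Hypothetical screening-triage counts, chosen only to illustrate the point.
    tp, fp, fn, tn = 90, 30, 10, 870   # 10 patients with real findings were missed

    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)            # sensitivity: share of true cases caught

    print(f"accuracy  = {accuracy:.2%}")   # 96%, looks reassuring
    print(f"precision = {precision:.2%}")  # 75%
    print(f"recall    = {recall:.2%}")     # 90%: one in ten real cases missed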

If there's a screening program and, because of NLP, we didn't detect the symptoms correctly, and that person doesn't go into the screening program and their cancer develops, that's what matters. I'd like everyone to pay attention to the true meaning of false negatives and false positives if you're working in this space. So with that, I'd like to close my talk, and I'd be happy to take questions.
