My main interests revolve around the application of AI/machine learning techniques to complex business problems. I'm a strong believer in the use of conversational interfaces to help optimize business processes, and in chatbots and intelligent assistants in particular. At National Bank of Canada, I am currently leading the operationalization of AI projects, and I am the Solutions Architect for BNC's conversational AI platform.
I am a computer scientist (Ph.D.) working on machine learning, information retrieval, and natural language processing (AI). I enjoy using this knowledge to design algorithms and to develop and improve complex systems such as search engines or data analysis tools. I have deep experience of digital transformation with public companies in the banking and media sectors, with a proven record of introducing high-end technology into real-life products. I have been involved in the management and coordination of R&D-oriented development teams of various sizes, in the context of publicly funded academic research, company-academic partnerships, and tech companies from start-ups to public ones. This does not stop me from continuing to code!
About the talk
Testing is a crucial enabler for the success of chatbots and virtual assistants. Doing it manually requires enormous time and effort.
As DevOps and furthermore AIOps grow in importance, automated testing will remain critical to ensure that bots actually do what their designers intend. Unlike traditional software where the application follows a predefined flow, a chatbot runs without any restrictions. Talking to a bot has no barriers.
Combined with unpredictable user behavior, this makes it extremely difficult to verify the correctness of conversational AI. Training data and test sets are infinitely large. In fact, quantity plays a major role in quality assurance for bots, but it makes manual testing impossible.
The main questions to answer are "Why are bots failing?", "What and how should you test?" and of course "How to automate?".
We will showcase the setup of a test automation pipeline for a Rasa based chatbot to continuously check conversation flows and NLP performance. And we will take it even further by adding full End-to-End testing from API over Web & Mobile to Voice.
Presented by Dominique Boucher, Chief Solutions Architect – AI Factory, National Bank of Canada, and Eric Charton, Senior AI Director, National Bank of Canada, at the 2021 Rasa Summit (https://rasa.com/summit/).
Dominique Boucher is Chief Solutions Architect in the AI Factory at National Bank of Canada where he is responsible for the development of NBC's dialogue system platform from the technical side. His main interests revolve around the application of AI to complex business problems, and in the use of conversational interfaces in particular. Prior to that, he was the CTO of Nu Echo where he led the Omnichannel Innovations Lab. He has been in the speech recognition and conversational AI industry for more than 20 years. He holds a PhD from the University of Montreal.
Eric Charton holds a Master's degree in machine learning applied to voice recognition and a Ph.D. in machine learning applied to information extraction and natural language generation. He worked as a scientist and research project coordinator in academic contexts in Europe (University of Avignon) and North America (CRIM, École Polytechnique de Montréal) before becoming head of search engine research and development at Yellow Pages Canada. Since March 2018, he has been Senior AI Director at National Bank of Canada.
#conversationalAI #opensource #aichatbot #devops
- Learn more about Rasa: https://rasa.com
- Rasa documentation: http://rasa.com/docs
- Join the Rasa Community: https://forum.rasa.com
- Twitter: https://twitter.com/Rasa_HQ
- Facebook: https://www.facebook.com/RasaHQ
- Linkedin: https://www.linkedin.com/company/rasa
Explain to you all. We are at National Bank of Canada business, line of the bank. Next time. So, National Bank is an old institution that goes back two centuries and has grown through acquisitions. It is now the sixth bank of Canada. This is a big company working in all aspects of the banking business, like retail banking and financial markets, and it is very important to mention that, because what is interesting in what we have done is that we implement dialog technology.
For many, many years, as chatbots facing customers but also inside the bank for our internal activities. I will go back over the history of what we have done since 2018. Now, we have a long history with, I would like to say that when we began to work on the platform at the bank, we had many occasions to discuss what you are doing. And I have to mention that because they were very, very supportive on every occasion, they have to come meet us in New York on the gym of radar weather incredibly
growing over time. Thanks to all of them. Reserved that technology for because what we want to put in front is what we do for the customer, and the technology we use comes behind. We we do it. I will give you some insight about that. To begin with the timeline: when we had to build a group at the bank, around April 2018, we made a review of what was in progress at the bank in terms of dialogue systems. And there were nine different technologies and nine different standard
dialogue systems, and most problematic for us was that there was no performance evaluation. And then reconsider Unless you want another option, not just speak into it, if you really want to create value with a bot, you need an approach in terms of evaluation. We made a proof of concept using Rasa about the about the previous, and we showed what was possible with this technology. And we decided to continue this and make some experiments in other lines of business. At the bank, we
follow you by to remove condoms from my house, to watch some information about legal aspect with made another guy. Always designed with the same technology as to where to buy some equipments experiments. That led to some findings and the experiments findings usage of Technology, we walked we had some complications available and in February 2020, during the COVID pandemic, we had the opportunity to put our chatbot live, and it was just the beginning of
the strategy. In November of last year, given the success we had with the chatbot we deployed for retail, we had an executive presentation and decided to have a new strategy, bank-wide, that will lead us to publish six more dialogue engines this year in Vero stock of the month. Next door in the banking industry. In Canada. We have a lot of activities we have, and it's the time in Canada. There is many North America in Canada banking environment. We are alone
with. We have paper distribution in, the ramp am become friends, we have a very highly qualified team of diagnostician. Is there is the team of Dominique 9 and we have a platform, and we have also developed an in-house analytics solution for the full life cycle of the bot and for maintaining the highest level of robustness. We will speak about that later. Get online and an x-ray. See p.m. facing 9275 croissant meaning more than seven question. And then you say the golden Square? Next.
Truck that do the solution project in Cheaper by two teams, everything gravitating around Rasa. When one of the business lines of the organization, for example, comes to us, we make for them a dialogue engine, complete from the beginning to the end. We have two teams to support this. One team is in charge when there is a request to deploy a bot in the bank to do, the business is going on. The other question, requested, always the structure of the of the traffic in
charge of technological innovation, with a very, very deep connection with Dominique, it's not to sign and it's very important: we both have a scientific profile. So we work very closely. And there is the AI Factory team, that was because we we find some disputed to industrialize all the technology that we want to build. Their role is to integrate the solution, deploy it, implement new technologies, and nourish us to find solutions. The components are available now is inside
a framework; all this is ready to be deployed on the crude architectural channels. We have the reporting system sign. Winston's cat on new dialogue engine. We have all the analytics tools that collect the logs. And there we have the capacity to analyze the logs and build some dashboards to report to the business line, but also we can connect the logs with the profile of a user in some contexts to see what's happening inside the bank. So we use all of that to understand what's happening in the in the air. Erica Sturm else, interesting with us
but our showed me previous searches stations and remove some more clothes on the course and pass. And we work on the next step. We work on technology to increase the capacity to automate the model updating. What we want is to be able to collect new formulations and push them into the models without any human intervention center next to you. We implement new channels. We walked our framework kids now and we Walker song. Rubbish. Next technology. Only 12th, we have a life cycle.
With the proof of concept, we test the impact on the customer, and then we go in production, a week to bleach. And the regular everyday during the co visit was 3. 72-hour else. We have the tools to know what's happening inside the dialogue engine. What is the satisfaction of the gel? What is missing? And we can improve the model, then go again. During the COVID crisis we went from 100 question to 324 and 12 and 3 months so we are ready with boots for the
model, and then we follow what the job doing well and we are able to go there very fast. Christmas. Thank you. So I'm going to talk more here about the IT side of the project. I have been at the bank for about two years, and my primary goal was to industrialize all the chatbots that we plan to deploy at the bank. So there are really a number of truths that need to be established. First, building virtual assistants, especially in large enterprises, is an IT project. We have the AI component
at first, but deploying it in a secure and robust environment is really an IT project in itself, which encompasses a large number of aspects. First, you need to focus, of course, on the infrastructure, because what we want to have is, as I said, a robust infrastructure and a set of components. Because with these new technologies, especially in banks, which are usually a bit more conservative, we want to shine at the same
time, we want to have something very stable, so it doesn't appear to be just a new shiny toy for AI scientists, but something that helps deflect calls, for example, for our contact centres. At the center of this whole goal is what we call our operational excellence. And of course, DevOps is part of the things that we need to focus on. A few years ago the bank engaged in DevOps, and we have been taking that very seriously as part of our digital transformation programme
and Rasa is a very nice fit for this, the DevOps culture being based essentially on open-source technologies, with source code and, you know, things that you put into version control. Because really, all the assets that you have in your chatbot should actually be version controlled: code, actions, policies, and also all your multimedia resources. We have images, carousels, videos, things like that. And of course, all the training data, the
configurations, the thresholds that are required to perform your NLU performance testing. This goes for the infrastructure as well. So everything should be version controlled. And this, again, fits very nicely with the Rasa mindset. The nice thing about this approach is that it has very nice side effects. For example, having full version control over everything, you can easily audit the changes that were made
to the chatbot, and this is very important in the banking context because reputation is, I'd say, our primary value. We definitely do not want to have a negative impact on our reputation. So you need to audit everything that will be put in production. And most importantly, especially since this is an NLU-based kind of product, we need to run experiments all the time to improve the performance and see what works
well and what works best for the next iteration of the chatbot. So having everything under source control enables those kinds of experiments very easily, which is usually not the case with other kinds of platforms that are in the industry right now. And once we have a good, solid foundation for our first project, we can easily develop some templates for the creation of new projects. So we know the recipe and we apply it to all our future projects. Since we are in the
complete DevOps mindset, we of course have our own CI/CD pipeline. This is a very simplified version of our pipeline, but at a very high level, what we do is this: we have a scanning step that checks all our coding standards, we check for vulnerabilities, we calculate some code quality metrics that we need to attain, and of course we try to detect some bugs or code smells. Then we build the Docker images that we will put in production. Here, we assume that the trained model is already available somewhere in a storage bucket of some sort,
but the training could also be included in the CI/CD pipeline. Before deploying everything to production, we run a number of tests. So we have unit tests, of course, for the actions, and especially the policies and stuff like that. But we also incorporate into the CI/CD the NLU performance testing. So we have some files that, you know, contain the thresholds that need to be met in order not to degrade the performance. So if the performance is below a
certain threshold, then we just stop the pipeline and we don't deploy to the staging environment, for example. So if that happens, well, we have to go back to the drawing board and see what made the pipeline fail and the accuracy drop. And so we never put something in production that does not meet our performance requirements. And finally, we have end-to-end testing. It's a very small library that we developed just to test very basic dialogue flows in a
unit testing framework. So it's integrated with pytest. It really tests all the interactions and, you know, the channel as well, because sometimes we modify a channel and we want to make sure we don't have regressions there. Of course, if we are thinking about DevOps, well, we have to consider the ops side of it also. So of course we need observability at all levels, we need Onestringer chainsaw. We are aware of
what's happening in production, and we have to react quickly to everything that happens there. Just to finish, I will quickly mention a few challenges, the things we need to look at in our projects. Of course, security is probably our biggest challenge: making sure everything is secure. We are a bank, so again, reputation is our biggest asset. The integration of the web chat component is usually done by other teams at the bank. So usually,
this means we have to work with those teams. They have very different, you know, schedules, and we need to plan those integrations well, especially when we make upgrades to those web chat components. That's usually the hard part. Developing a chatbot itself is, I would not say easy, but it's the easiest part; it's integrating it into other web apps or applications that is the most time-consuming part. And finally, we have to resist what we called the White Sox. We are in a very nascent industry and technology. So we have to
resist the urge of putting nice user interfaces on top of those technologies, because they usually change at a very fast rate. And so we would have to maintain those UIs as well, and it's very time-consuming. So we really need to stay at the code level, and that's what Rasa allows us to do very effectively. Thank you very much.
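The threshold gate described in the talk (stop the pipeline when NLU performance falls below a configured minimum, so nothing under the bar reaches staging or production) can be illustrated with a minimal Python sketch. The metric names, the threshold values, and the `failed_metrics` and `gate` helpers are assumptions made for illustration; the talk does not show the bank's actual threshold files or pipeline code.

```python
# Hypothetical minimum scores; the talk does not give the actual values.
THRESHOLDS = {"intent_f1": 0.85, "entity_f1": 0.80}


def failed_metrics(metrics, thresholds):
    """Return the names of metrics that are missing or below their minimum."""
    return sorted(
        name
        for name, minimum in thresholds.items()
        if metrics.get(name, 0.0) < minimum
    )


def gate(metrics, thresholds=THRESHOLDS):
    """Return an exit code: 0 lets the pipeline continue, 1 stops it."""
    failures = failed_metrics(metrics, thresholds)
    for name in failures:
        # Report each failing metric so the cause of the stop is visible in CI logs.
        print(f"{name}: {metrics.get(name, 0.0):.3f} is below {thresholds[name]:.3f}")
    return 1 if failures else 0
```

A CI step could pass the value returned by `gate` to `sys.exit`, so that a non-zero exit code stops the pipeline before the deployment to staging. In a real setup the metrics would come from the NLU evaluation step rather than being written by hand.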