With twenty years of experience in the telecommunications/internet industry, I have a reputation for "doing the impossible." Such notoriety has come through consistently offering solutions to major telecommunications/internet problems, via my vast experience in detecting and resolving technical dilemmas, using cutting-edge resources and equipment. I also have a unique ability to blend new technologies, like Natural Language Processing, Machine Learning, and WebRTC, in such a way as to create a synergy that provides forward thrust and momentum in today's highly competitive telecommunications/internet environment. Being an "A type" personality, I am a self-starter who works well alone with little or no supervision. I do, however, enjoy collegial camaraderie and being part of a team. I am capable, dependable, flexible, and can easily assume any role necessary to the life cycle of a project.
About the talk
Human-to-human electronic communication has moved from text (email) to voice (VoIP) to augmented video (Zoom/Skype). Similarly, the medium for human-to-machine conversation has moved from text (chatbots) to voice, with voice-enabled chatbots in wide use today. The next step in this evolution is a video-enabled conversational experience. Each medium change brings its own technical challenges. Creating a good voice experience involves more than just hooking up a chatbot to a text-to-speech and speech-to-text service. Vocinity has developed a platform for voice-enabled chatbots that has been in production for almost 2 years. We're updating our platform to support a multimedia experience where the bot communicates via video, voice and text messages and images. Using Rasa to provide the conversational logic for the immersive multimedia bot enables us to meet the challenges in voice/video communication. Rasa’s power and flexibility enabled us to extend it to support voice and video.
Presented by Nathan Stratton, CTO of Vocinity, at the 2021 Rasa Summit: https://rasa.com/summit/
- Learn more about Rasa: [https://rasa.com](https://rasa.com)
- Rasa documentation: [http://rasa.com/docs](http://rasa.com/docs)
- Join the Rasa Community: [https://forum.rasa.com](https://forum.rasa.com)
- Twitter: [https://twitter.com/Rasa_HQ](https://twitter.com/Rasa_HQ)
- Facebook: [https://www.facebook.com/RasaHQ](https://www.facebook.com/RasaHQ)
- LinkedIn: [https://www.linkedin.com/company/rasa](https://www.linkedin.com/company/rasa)
#conversationalAI #cicd #aichatbot
Why voice and video are increasingly important for customer experience. I've been working with Rasa, now at Vocinity for three years and previously at BroadSoft for two years. So I've been working with Rasa really almost since its inception, and it's been a great ride so far over the last 5 years.
Customers want it now. Customers are looking for immediate answers; they're looking for cool demos, interactive content, and personalized journeys. They are looking for that through a host of mediums, but one of the common threads is that they want instant gratification. What marketers want is higher conversion rates. They're looking for more engaging experiences and improved ROI in all of their campaigns, both print and digital, and they're looking for personalized journeys and customer loyalty.
The reality is very different. Budgets are shrinking. Personnel costs are soaring. Omni-channel experiences are out of sync. Without video, existing chat experiences require typing, especially on smartphones, and this can be a problem. It may be a problem while driving, and contactless is the new norm: we have a number of customers approaching us and saying, we don't want an interface that is touch-based anymore. And so this is a real problem for some of the existing solutions.
A picture is worth a thousand words. Voice is faster than typing. If I tried to convey this presentation in a chat, it would take a lot longer than 20 minutes allows. So, you know, you get more natural conversations with voice than when simply typing.
Voice bots also have challenges, though. We started as a voice-only platform, and we really quickly realized that real-time interactions require engagement with customers. If they're staring at a totem in a grocery store, for example, they need to be able to see something to engage them in that conversation. Users quickly lose focus. There are a lot of distractions today, and the average person remembers just a fraction of what is said to them. This presentation will be available as a recording later because, you know, you might want to go back and hear something I said. And that is just the reality of voice: it is strictly harder to maintain context and filter out nonsense in a voice-only scenario.
Today the world is moving to video. If you look at things like TikTok, Instagram, and YouTube, they've all demonstrated the value and importance of video. If you look at social media networks like LinkedIn and Facebook, they are all moving to streaming options instead of just audio or just static text.
The power of video is undeniable. Here are a bunch of statistics on video that were pulled from a number of different industry reports. Some of the ones that I really want to share: audiences would rather watch a live video from a brand than read a blog, and 82% prefer live video from a brand over social posts. These are some things that are changing and are becoming much more, you know, ubiquitous among users today.
I think of just my own children. They hold their phones like this, in front of their faces, instead of holding them up to their ear, because they're in a video chat with their friends rather than just an audio chat.
Real-time video interactions are hard. This could be stated a couple of times. They require broad expertise: you have to understand things like video broadcasting and WebRTC, and WebRTC only recently became a standard across all platforms and browsers. I like to think of iOS as an example. It wasn't until iOS 14.3, which just came out a couple of months ago, that you could use Google Chrome in a WebRTC call on an iOS device. That now is ubiquitous. Now that we have that with Chrome and Safari on iOS and Chrome on Android, we're finally becoming standardized on the platforms that we use to provide a video experience.
Digital humans are still evolving. Things like frames per second, you know, are lower today and going up. They tend to require things like GPUs in the back end to be able to render them at decent frame rates and in a way that looks realistic. One of the things that we like to look at with digital humans on the video side is what we call the uncanny valley. As you get closer to representing a live human being on a screen, it tends to start creeping you out, until you get over that valley and get into something that looks realistic enough to fake somebody out.
So that's very important in that migration. There is also a need to create and manage sites with individual user content for the engagement experiences.
What is a video engagement platform? We have a multilingual video avatar, and that engagement platform interfaces first to a multimedia gateway. The gateway is the thing that supports WebRTC, SIP, broadcast services, gRPC, all of the real-time communications. A microsite proxy server allows us to push content to a user in a unique way for them to be able to see on the screen or on their site. Then there is the voice core platform. This is where we rely on things like Rasa on the back end. Like I said, we've been using Rasa since our inception, for about 3 years now, and we also have to have things like ASR, TTS, and things like Agent Builder to create platforms and build the actual bots that we use.
Agent Builder, as I touched on, is a platform that allows us to build real-time voice and video agents. It's built from scratch internally. We looked at some of the commercial and open source solutions that are out there, but because of the video requirements and even some of the voice requirements, we chose to build our own rather than use something that's already out there. It supports rich overlays of video, images, and other multimedia content in the experience, and it supports multiple types of actions: things like transfers, pausing, API calls in and out, SMS, and that sort of thing.
Omni-channel deployments are very important today. You know, you can't simply say we're going to provide this on the website, or even on a kiosk or digital signage. Something like you see behind me is a representation of a totem that we have in retail locations. Or QR codes: we find that QR codes are something that is coming back. As people are looking for a contactless way to engage, QR codes are a great way to do that, and a mobile device with a mobile-first strategy allows you to take a picture of a QR code and then from that picture immediately pull up a WebRTC link. That WebRTC link allows you to start engaging with a voice and video bot through that technology.
This is a high-level process flow. The user experience can be anything from totems and QR codes to even legacy telephone calls over the PSTN. Through QR codes, totems, and web browsers you have the ability to engage microsites for each customer, and that type of rich media interaction as well. All of that is fronted by a multimedia gateway, which handles the interaction between the voice and video streams and the textual representation of what is said, or what needs to be said, on the back end. The Vocinity core, again, is where Rasa lives along with some of our other technologies. We leverage a lot of open source as well as our own proprietary technologies in the back end.
Interactive rich media experience. This is an example of one of our avatars on a mobile device. We can share links via SMS or text, and we also provide escalation. One of the things that's important for a lot of our customers is what happens if you get stuck in a video experience with a chatbot. One of the things that we provide is video queuing that allows customers to have their own video agents in a call center, and if users get stuck, the call center can then help them out. They can just say, "agent, please," or even the virtual agent can determine that they're not able to be helped and hand them off to a live video agent in a call center.
Core processor architecture. Let me first say, I apologize for the slide. It was inserted this morning and I did not review it. There is no such thing as ARS that I'm aware of. It should say ASR, for automatic speech recognition. So I apologize for the errors in that slide today. One of the things that we do is TTS caching, for all of the TTS providers that we have as well as the avatar voices, which are human-recorded voices used instead of a computer-generated voice. The reason we implement TTS caching is because of the delays that we get from our third-party providers, and it becomes very, very important to be able to get that audio to a user as quickly as possible. By pre-rendering all of the variants and caching them, we're able to greatly lower the latency between when a user says something and when a response can be played back to that individual user.
So, why a video engagement platform? It simply allows a digital twin of your best and brightest, and agents get smarter over time. We're seeing a 25 to 200% reduction in cost versus employees or contractors, and, you know, that is certainly significant today. They are always available, always ready, and never get sick. You know, pretty much everybody on this call knows just how availability with chatbots, as well as voice bots and video bots, at any time of the day really is able to meet customer expectations much more than human agents can, and diminish COVID liability.
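The TTS caching idea mentioned in the talk (pre-render the known response variants, then serve cached audio instead of calling a slow provider again) might look roughly like the following. This is only a minimal sketch under stated assumptions, not Vocinity's implementation: `TTSCache`, `prerender`, and `fake_provider` are hypothetical names, the cache is a plain in-memory dictionary, and the provider is a stand-in function rather than a real TTS API.

```python
import hashlib
from typing import Callable, Dict, Iterable


class TTSCache:
    """In-memory cache of synthesized audio, keyed by voice and text.

    Hypothetical sketch: a production system would likely persist audio
    and call a real TTS provider instead of the injected callable.
    """

    def __init__(self, synthesize: Callable[[str, str], bytes]) -> None:
        self._synthesize = synthesize      # slow call to a TTS provider
        self._store: Dict[str, bytes] = {}
        self.provider_calls = 0            # counts cache misses

    @staticmethod
    def _key(voice: str, text: str) -> str:
        # Collapse whitespace so trivially different strings share one entry.
        normalized = " ".join(text.split())
        return hashlib.sha256(f"{voice}\n{normalized}".encode("utf-8")).hexdigest()

    def get_audio(self, voice: str, text: str) -> bytes:
        """Return cached audio, synthesizing only on a cache miss."""
        key = self._key(voice, text)
        if key not in self._store:
            self.provider_calls += 1
            self._store[key] = self._synthesize(voice, text)
        return self._store[key]

    def prerender(self, voice: str, variants: Iterable[str]) -> None:
        """Warm the cache with every known response variant ahead of time."""
        for text in variants:
            self.get_audio(voice, text)


def fake_provider(voice: str, text: str) -> bytes:
    """Stand-in for a third-party TTS call (assumption, not a real API)."""
    return f"{voice}:{text}".encode("utf-8")
```

In this sketch a bot's fixed responses would be passed to `prerender` at deploy time, so that at conversation time `get_audio` only reaches the provider for text that was never seen before, such as responses containing user-specific values.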
We are starting to see this from a number of our customers. We have customers today who've approached us and said, we have limits on how many humans we can have in a store, and we need a solution that allows us to still sell these products and answer these questions, but without humans in the store. How can you help us? And so we've been able to work, in retail in particular, on a solution that allows them to do this without having extra people in a store, which limits the number of customers that they can have.
Industry and application opportunities. From, you know, some of the applications that we're doing today in retail and healthcare, but also things like surveys, follow-up, support, onboarding, and enrollment, there's just a number of different areas that a virtual video avatar agent can serve that really can make a difference for a business.
Technology approaches and benefits. Again, voice bots are faster than typing and texting; they're hands-free, eyes-free, a better, more natural experience. And today we are at human-level recognition models. I remember when we started three years ago, ASR technology was getting good, but it wasn't at human level. We're now at human level with most speech today, which is really an amazing thing, and the voices that are played back through text-to-speech are also much more lifelike than they were even just 18 months ago.
So that was really quick; I tried to go through this as quickly as I could, but thank you for your time. And if there are any questions, I'll be happy to take them now.
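The escalation behavior described in the talk, where a user can ask for an agent or the bot itself decides it cannot help and hands off to a live video agent, could be sketched as a small policy object. Everything here is an assumption made for illustration: the `EscalationPolicy` name, the trigger phrases, and the two-consecutive-failures threshold do not come from the talk, and the `understood` flag stands in for whatever confidence signal the dialogue engine provides.

```python
from dataclasses import dataclass

# Hypothetical trigger phrases; a real bot would more likely use an intent.
AGENT_PHRASES = ("agent", "human", "representative")


@dataclass
class EscalationPolicy:
    """Decide when to hand a conversation to a live agent (sketch only)."""

    max_fallbacks: int = 2   # assumed threshold of consecutive failures
    fallbacks: int = 0       # consecutive turns the bot did not understand

    def should_escalate(self, user_text: str, understood: bool) -> bool:
        # Explicit request: the user asks for a person.
        lowered = user_text.lower()
        if any(phrase in lowered for phrase in AGENT_PHRASES):
            return True
        # Implicit: the bot keeps failing to understand the user.
        if understood:
            self.fallbacks = 0
            return False
        self.fallbacks += 1
        return self.fallbacks >= self.max_fallbacks
```

The simple substring match is deliberately naive; it is enough to show the two paths to escalation, explicit request and repeated failure, that the talk describes.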