Duration 25:48
Building the Future of Voice | Rasa Summit 2021

Emily Lonetto
Head of Growth at Voiceflow
Rasa Summit 2021
February 10, 2021, Online, USA

About speaker

Emily Lonetto
Head of Growth at Voiceflow

I'm a data and design-driven problem solver that's fallen in love with SaaS. I take companies looking to scale, and discover new channels and opportunities to explore. I'm a serial experimenter that adds value by building and optimizing for user growth and retention.


About the talk

With every new interface comes new considerations, constraints, and opportunities – voice and conversation design are no different. Expedited in many ways by the increase of consumer platform adoption, COVID, and burgeoning business use-cases – consumer expectations and business opportunities are rising faster than ever. In this talk, I will break down the foundations of designing and developing for the new world of voice. I will compare and identify the constraints and contexts in which businesses/professionals are thriving and what challenges are naturally tied with channels built on conversational AI.

We'll dive into the unique challenges, constraints, and opportunities tied to the conversation design and voice ecosystem – and hope to outline the possibilities of what voice can unlock in the future.

Presented by Voiceflow Head of Growth Emily Lonetto at the 2021 Rasa Summit (https://rasa.com/summit/).

- Learn more about Rasa: https://rasa.com

- Rasa documentation: http://rasa.com/docs

- Join the Rasa Community: https://forum.rasa.com

- Twitter: https://twitter.com/Rasa_HQ

- Facebook: https://www.facebook.com/RasaHQ

- LinkedIn: https://www.linkedin.com/company/rasa

#conversationalai #aichatbots #nlp


So, thanks everyone. As mentioned, I'm Emily, but that's not what you want to hear about today. You want to hear about what is coming up in voice and this new interface. So, as you know, this is me; there's my social handle at the end, and if you have any questions, I know things are running a little behind, so feel free to tweet at me. I'd be more than happy to chat with you there. If you haven't heard of Voiceflow yet, we are an easy way to design, develop, and launch voice and conversation experiences anywhere, whether it's Alexa, Google, or any other

conversational assistant. And today we're here to share a bunch of the insights that we have from working with a bunch of people here, and some of the bets that we have placed for the future. But before we get into this, let's make sure we're on the same page. Conversation design as a whole is this new interface that's been around for decades, but it's really coming to the forefront of what a lot of us in this conversation today are thinking about and, more importantly, coming to fruition in two big ways, both chat and voice, and even in the intermix

of those, other platforms, and multi-modality. But today I really want to focus in on voice specifically. Voice is changing the way that we design, develop, and interact with technology. It has really come to fruition in the consumer market, from Siri to Alexa to Google Assistant, Cortana, and so many more. But it's also changed the way that we're thinking about things. Voice assistant interactions, and the things that they're now allowing users to do or interact with, are challenging our norms for companionship, communication, and contextual

And it's changing the way that we interact, or the way that we expect experiences to work: from running our home, to running our lives, to discovering new things, and even the way that we teach kids, communicate, or get more information sent to us. But with every new tech comes new responsibilities and, ergo, also challenges. With voice, we're definitely in that stage of battling those, for a bunch of different reasons. Not just the fact

that it's new, but also the biases and the new things that come with that interface, because we're not just building for today's use cases. In fact, we're reimagining what our new experiences need to look like, from "What does that car experience look like?" to, perhaps, "What does our return look like into the real world, where touchscreens or elevators are interfaces that no longer fit within our social norms?", or reimagining older logistics, from telephony experiences to drive-through experiences, logistics, and more. And the

second part of this is that we're also not always building for ourselves in this use case. We're also trying to think about users that may have been left behind with some of... oh gosh, I'm getting a couch delivered right now, so I'm going to move, one moment please. Yeah, of course, in the middle of my presentation, sorry. It's a very live experience. Hope you all enjoy the door of my house. Anyway, we are also building, in a lot of cases, for users that have been left behind by other consumer technologies. So you have the older generations of that

to even just the elderly and how they communicate, how they work through that. Voice is also really changing the accessibility side of things, allowing people to engage more with, or take advantage of, technology that they may not have been able to use in a GUI. And the big thing to note here is that, beyond all of these things, voice isn't that complicated. It's just something that we taught ourselves to forget, and a lot of that is due to interfaces like this: the keyboard, touch technology, and our ability to really

introduce the interface options of self-serve technology, where we as a generation want instant feedback. We want the ability to be on demand, to self-serve, or to choose opportunities where we as the user are in control and it's on our own time. And because of this we trained ourselves out of a lot of the more human approach to answering questions, because instead of asking, we search. The beauty of voice and conversation design is that it's really built on this concept of asking, and on creating that opportunity to answer

those questions. A quote that I think is really wonderful, which RAIN produced in one of their white papers a few years ago, is that voice is also providing a democratization movement where, unlike with other interfaces, there's much less of a barrier for us to go in, because we naturally learn how to speak and how to communicate. All we need is a microphone, a speaker, and an internet connection to really participate in this movement. So, for example, you put a laptop in a room and only the one that types can control the platform and its information,

whereas you put a voice assistant in a classroom and suddenly everyone has a remote. And that's really what leads us into the opportunity of voice today, where we're still, despite all the progress that we've made, in the desktop era, pre-Google, of voice. What I mean by that is it's dominated in a lot of cases by technologies that are plugged into a wall, situated in one part of your house, and there's a question mark in terms of discoverability. But because of that, there's still a lot of room to innovate and grow in that space. And while most of us,

I'm sure especially at this conference today, would love to do anything to expedite the growth in voice adoption, it's already growing faster than any interface before it. That's clear when you put it side by side with the adoption of smartphones, TVs, and the internet, and it's clear in this case that consumer trends are also going to push the boundaries of what businesses will have to keep up with, and the challenges that we'll have in setting new expectations for the platforms. And while we've seen the spike in user adoption in voice, a lot of that has just come from the

speaker market. Why this is important for us as developers and designers today is because, like with every interface, understanding the natural biases that come in is also something that we need to handle as a challenge. We as users are biased by the interfaces of our past, and with us being users and also the designers for the space, it's incredibly important to get out of that perspective or think through what else is possible. Because if something looks like a speaker, it's very natural that the first things we try to do with it are speaker-related, which is why about fifty or

sixty percent of new users on these devices mainly use them for music, or for asking transactional questions, one-liners, or entertainment. But beyond all this, and I really want you to keep this in the back of your mind when you're thinking about it, voice is still the most human of interfaces. What I mean by this is not that the voices themselves sound human, because we're far away from that just yet. What I mean is the way that we interact with them: the fact that we learn more, or they learn more, by asking and by repeating, in the same way that our kids

do, that we do, in the way that we share information with each other. And over the first few years we've seen a lot of that, not just in how much smarter the devices have gotten, or in the expectations and how we as users have learned to ask more, but also in the way that the platforms themselves have changed. It's not just this binary or linear experience; it's now become a little bit warmer, a little bit more challenging, and for some also visual. So we're seeing this expansion and intermixing of both speakers and now screens,

and even more importantly, now going into that third phase of multimodality: creating a balance in between, where do speakers play and where do screens play, and where different platforms, or context passing in between those devices, become part of that experience design. And it's the brands and the people working in this space who are thinking beyond just the individual interaction, and thinking more about context or a journey across different platforms, that are really making an impact right now, allowing us, especially as users, designers, and

developers to extend far beyond the confines of just "Alexa, what's the weather?" Because users will learn by asking, and while we can prompt them and while we can make suggestions, it's this action that we need to really build back into the habits of new users again. The newer generations are more familiar with this and are growing up with it the same way that we did with keyboards and typing and taking the classes in school, and we as designers and developers need to learn to predict and answer those questions ahead of time, building or looking for better CMS

modeling, with things like Jargon, or better NLU and NLP, like Rasa, to better understand and process information. And this new functionality, design context, and multi-modal capabilities are driving innovation forward in the space, allowing us as designers, developers, teams, or even just hobbyists to now think about not just choosing a platform and sticking with it, or choosing one design and running it through linearly, but instead thinking about them as moving pieces: what is the best mix of things with that platform, whether it's the actual

understanding, the back end, or the front end, to create that best conversation experience? Just like we have our own preferred tech stack with GUI, for conversation design it looks a lot like the design tool that you choose, to the NLU or NLP that you plug in and power it with, to the platform that you launch on and use it for. And I like to think about this in a way that goes back to that concept of it being a very human platform: how do you handle those things? How do you choose the right mix? It's just the same way that you think about your friends, where each one

of them has their own base information, their own background, their own advantages. It's the same when you think about what platform you choose to build on or launch on, what plugins you choose in order to enhance it, or perhaps even what customer experience will better handle what you're trying to accomplish. Because much like you may not ask, in this example, Jessica, who's the best at trivia, for health answers, you might want to choose an NLU or NLP that plugs into your designs and is specialized in the industry that you're looking for, or a platform that's

better suited for the natural bias of what you're trying to solve. Because each natural advantage in each NLU or NLP can enhance or upgrade parts of your experience, and building today will require this contextual thinking: for example, balancing when is the right time to design for voice just as much as why. So, for example, you get into a fight with your significant other. Which medium do you choose? While calling, texting, or talking may all be valid, they are mediums that have different natural advantages or disadvantages.

So the same goes for text conversation design versus voice: choosing a medium, and when, will also impact its success. In the same way, when conveying a story or conveying a thought, those different mediums have natural advantages and disadvantages. With speech you may have access to more emphasis or more context in emotion. In the same way, with voice you have SSML to be able to really add that emphasis. With type, you may be limited to decorations, or maybe different pauses and different ways that you can communicate that. But it

might be faster. So we choose voice in a lot of cases because it's a fast input. It's a fast way of asking for what we want rather than looking for it, but it's yet to be perfect. What I mean by that is that even though voice is such a great, fast input for things, it's not always the best output, and it's also good to know where you can play into multi-modality even with inputting. For instance, passwords probably shouldn't be said aloud, nor things that are case sensitive, or, let's say, things that need to be turned into text-to-speech. There are different things that

you might want, but there are also other ways where they can intermix. Take a look at Netflix, for example. Most of us have spent a lot of time here, I think, during quarantine. When we look at things here, where can voice play in? It's inherently visual. But when you go into the search functionality on our smart TV, we've all seen this kind of death box, trying to type in something that is much faster to say, or that you might even already see on the screen. So here's an example of where voice could really enhance that experience, and you're

seeing more and more of that as people adopt more smart TV technology, where you may know exactly what you want but you just don't know where to find it. So that input is going to be a much faster way of getting to that answer, then selecting it and moving on to a visual. And that leads into building for the future: what kind of context or situations should we be considering? What should we maybe be challenging? In the future, we really see it as voice as a fast input with an endless amount of dynamic outputs. So that's

the invocation starting that event; it could be somewhere in the middle to enhance it; it could be logistically checking in on things, really playing into whatever platforms the user has in their everyday life, much like how, if you're in the Apple ecosystem, you potentially check your messages on your computer in the same way you check them on your phone, and you want those things to be. So let's imagine some things that we do on an everyday basis now and how they could be enhanced. For example, you order an Uber via voice and you get updated when it's close and, perhaps,

peek at your phone when it's closer to find where it is, or follow along on a map. But let's take a look and think about how we do it right now. You have to pull out your phone; you have to find the app on your endless screen of apps; you go into it and type the address; often your location is wrong, or somewhere close to being wrong; and then you wait. Whereas if you think about, let's say, where you want to go, or perhaps you're even just going back to a saved location, it's really easy to say, "Order me an Uber to work." And especially given the context that many of

these smart devices are still plugged into our home, it's a great avenue to already have that information pulled: it understands the user wants to go to another saved location, or perhaps you input that on your phone, and then what they really just need to know is when to go outside. And what I am even more excited about is macros. For instance, let's say you're post-quarantine and you're ready to get back out there. You want to go on some dates, you want to go out, you want to see a concert, you want to do dinner with a fun reservation somewhere, and get there. Let's

talk about this case. Let's say you say, "I want to book," or "Get me two tickets to see The Weeknd." Maybe in this case it can ask, "OK, is this a date?" You could say yes, and it asks, "Would you like me to pre-schedule an Uber?" or "Would you like me to suggest restaurants around that area?" All of these things are taking context: it knows the location of the concert, it knows the time of it based on that ticket, and it knows the rough radius around the address or area that you would like to stay within for OpenTable. So even creating these

macros in the same way that we have shortcuts, or things that are mapped on our keyboard. And then there's the way that we handle dynamic slots and variables. Look at the simplest type of order that all of us are familiar with: ordering pizza. You order, as a user, a large pepperoni pizza. We know from that sentence that the size (large), the type (pepperoni), and the item itself are complete, but we know that the address and time are still needed. So, in the future, you'll be able to figure that out, especially now with a lot of the design

platforms, like Voiceflow or the other ones that are in the space, like Jovo, et cetera. There are ways to order and set up this dynamic automation so that your system knows when to ask for missing variables to complete that intent. Being able to do that is also incredibly important. And you also need to consider, beyond just the when and the why, the where. That's what's really interesting about this as well: unlike with the computer, where even if it's a laptop and you are somewhere in a cafe

versus at home, you still know what's contained in this device that you have there, in this situation with conversation design it could be happening in the car, it could be happening outside in your AirPods, or it could be in your living room. Understanding where that user is can also help to complete missing pieces. You're seeing that not only as the initial invocation creating that at the time, so ordering a pizza in this situation, but also as an update, where, for instance, this is a screenshot of me actually ordering Domino's, because I just moved, clearly,

and it's giving me an example of Alexa as a way to keep me updated. Then you also think about, if you are a multi-product platform, which one of your products or which one of your experiences is better for the medium. Uber is a good example of one that didn't do nearly as well as Domino's, which had explosive growth. But why? Perhaps it would have been easier to launch Uber Eats. When you look at Uber and the user behavior there, that's because back then we were outside and going places. And when you look at the vast majority of

the people that ordered Ubers, it was to go home, not from home. So the time when they would actually interact with its skill wasn't a large amount of time, versus Uber Eats, where the vast majority of people who ordered were ordering it to a location that was often their home rather than elsewhere. So that speaker matched the context of where they were expecting that experience. And lastly, you need to consider what external situations can impact your conversation design. In the same way that I mentioned that there are many contexts and many other things to consider with voice, you can consider

the time, the sound radius, the setting, and how many users are interacting with this, because it changes from "it's nighttime, so you don't want it to be loud" to "it's in the office or the kitchen," where you might want to offer productivity suggestions versus recipes, or maybe it's an outdoor experience that's immersive, like Marsbot, which suggests things as you're walking around in your AirPods. So when we think about the future, I think it's also important that we learn from the past. Learning about what happened with the other interfaces will also help us better explore the one that

we have today. The way that we think about this is that a platform shift happens once every few years, and we see that with design tools going from traditionally desktop, to cloud, to collaborative cloud. Interface shifts, however, happen once in a career or lifetime for some. We saw that when paper went onto the web, and the web at one point, in the early stages, looked basically like brochures, to then adjusting and figuring out that there are things you can do on here that you could never do with paper, like instant messaging. And then now

moving on to mobile, which at one point looked like just a shrunken-down version of the internet, to apps that are solely available and solely possible due to mobile applications. That's really what's exciting about this, and that's what I like to call the road to interface innovation, where in most cases, when these new interfaces come, you first see a flood of early adopters play around. Then we start to figure out: what can I do? How can I popularize this? What's exciting about this new thing? Then comes playing around with basic utility: what are one-line

commands, simple actions, simple things that we can accomplish? Then entertainment, which is where the vast majority of adoption comes in, when people are like, "This is fun, let me see what's possible." But there's really so much more in that utility, and as we've seen with mobile, or as we've seen with the internet in the past, it is so much more than entertainment. And while entertainment is a great way to drive user adoption and consumerism, you can see it from the example here, which was the most popular thing on my phone when it first launched, to today, where we've extended beyond entertainment and are now looking at

custom utility: how can users create custom interactions and custom things, or customize their devices themselves? It's the same way that you look at your MacBook adoption, perhaps. When you first got your first one, you were probably like, "This is great, the Apple Genius has told me to use Pages," versus now, where you probably get one and immediately go in and load all of your upgrades and all of your changes into it. And where it gets really exciting is that last stage highlighted here, which is platform-specific and beyond: things that are now possible solely

because of that platform, or things that are built because that platform can do them better. We've seen that really come to fruition over the last years, especially on a lot of our mobile devices. So when you are thinking about this and learning from the past, stretch beyond to what could potentially be platform-specific, and also think through what could potentially be there if it were a seamless multi-modal experience. In the same way that all your Apple devices or your Android devices speak to each other, how can this be another input for that, or how can it intermix with that

experience for users? Because change like this doesn't just happen, and while a lot of us are still in the Mechanical Turk phase of things, or Wizard of Oz-ing and testing like that, there's still lots getting done to make that easier, and it's our job to really challenge that. The beauty of it being still early, or still in the stage where there's a lot of innovation happening, is that we're rich with opportunities to explore. You can start by thinking about the user behaviors that you have today. How can you reimagine them? You can think about how they will work in a

world where voice is that interface, or where it intermixes. What are some of the repetitive tasks that you could think of differently? And then also think about, even more recently, what are some situations that won't be the same as we re-enter the world post-COVID? And how can you streamline your work to define your standards, or potentially even work together, because that's also a huge part of how we innovate and grow more. So while there are still many questions that need answering, what you build today will help shape the progress of tomorrow,

and there's lots of sharing and learning to be done. So what will you build? And how can I help? Thanks, everyone, for joining and for seeing a tour of my couch covers.
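
The pizza-ordering example from the talk (size, type, and item are filled by "a large pepperoni pizza", while address and time are still needed) can be sketched as a small slot-filling loop. This is only a minimal illustration under stated assumptions: the slot names, the prompt wording, and the helper functions `missing_slots` and `next_prompt` are hypothetical, and the code does not represent how Voiceflow, Jovo, or Rasa implement this.

```python
# Hypothetical slot schema for the pizza order described in the talk.
REQUIRED_SLOTS = ("size", "topping", "item", "address", "time")

# Hypothetical follow-up questions, one per slot.
PROMPTS = {
    "size": "What size would you like?",
    "topping": "What type of pizza would you like?",
    "item": "What would you like to order?",
    "address": "Where should we deliver it?",
    "time": "When would you like it delivered?",
}


def missing_slots(filled):
    """Return the required slots that have no value yet, in schema order."""
    return [slot for slot in REQUIRED_SLOTS if not filled.get(slot)]


def next_prompt(filled):
    """Return the question for the first missing slot, or None when complete."""
    missing = missing_slots(filled)
    return PROMPTS[missing[0]] if missing else None


# "A large pepperoni pizza" fills three of the five slots.
order = {"size": "large", "topping": "pepperoni", "item": "pizza"}
print(missing_slots(order))  # the slots the system still has to ask about
print(next_prompt(order))    # the next question to put to the user
```

The same structure extends to the context point made in the talk: a value such as the address could be pre-filled from where the device is known to be, so the system only asks for what it cannot infer.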
