Brandon is a Ruby Architect at Square working on the Frameworks team, defining standards for Ruby across the company. He's an artist turned programmer who had a crazy idea to teach programming with cartoon lemurs and whimsy.
About the talk
RailsConf 2019 - The Action Cable Symphony - An Illustrated Musical Adventure by Brandon Weaver
Do you want to know what ActionCable is and how it works, but don't want to build another chat application to learn it? Well buckle up, because we've got a treat for you. You're going to learn with lemurs and classical music.
The ActionCable Symphony is an illustrated and musical talk that will explore how websockets work by using classical music. We'll be using select audience member phones to play it. Learn about ActionCable, websockets, latency concerns, client interfaces, JWT authentication, and more in this once-in-a-lifetime experience.
You haven't lived until you've experienced lemurs playing a symphony orchestra on your phone using Rails.
Our story today starts with Red the lemur and his master, Scarlet. "Why haven't we started yet? The orchestra is all here, and she is ready to conduct, isn't she?" "Why yes, Red, I believe she is." "Then why aren't we playing?" "Because we have a new conductor joining us today." "Where? I don't see him." "They're out there." "Out there? There's some very interesting looking lemurs. Do you think they can play instruments too?" "I'm sure they'll do magnificently." So, everyone, we're about ready to start here. If you haven't already, let's go
ahead and join in at symphony.dev. You see, symphony with a Y was taken, but I kind of like this better. I'm surprised I got that one, honestly, but I can't complain; I actually like it better. We'll go ahead and get you all there. Give that a second. All right, I should turn on the volume on here, otherwise that's not going to do anything, is it? Okay, then let's go ahead and have our fun real quick. Okay. And then we're going to hope this works. I have some reasonable amount of confidence because I've cheated. I'm not going to say how, but I have cheated.
Oh, no, no, no, come on. Conference Wi-Fi is fun. We will endeavor, though. So, a little bit about what this talk is while this is going on: I submitted this crazy idea to RailsConf because I had this dream of conducting a symphony orchestra. This isn't exactly what I had in mind, but it does work. So we're going to see if we can manage to make that happen. Now, it looks like we're about halfway loaded. Come on. Which is why we have contingencies today. Worst case, I'll just switch over to localhost and run it from there.
I don't want to, but we will see. Come on. You see, this is why you're very cautious about doing things through Wi-Fi, which is why we're going to switch over to localhost. Well, I have 109 of you who are ready. I'm not sure about the rest of them. Come on, you can do it. What's latency, anyway? Wow, how do you do that? Brian's right? Never mind, I'm using Android. Before I explain a few other things, we're going to go ahead and cheat and go to the local version of this.
Oh, yes, you can see exactly what's going on over there. You can also see it's stuck loading. So unfortunately, I'm going to have to do this with smaller audiences later. But to give you the experience anyway, we're going to go ahead and, let's see here, run this. You see a hundred twenty-nine ready. I'm not sure about those last few, whether they actually work. I saw a few more populate there. We'll just go ahead and run this off of here.
Because it's still working off websockets, and it does technically count. Okay, that buffers for like 5 seconds. It does work locally, I promise that much. This is why we have contingency plans, ladies and gentlemen. Okay, then we're going to try this again. Of course, because my phone locked, now I can't play it. Oh, you're no fun. Okay. Well, that's not going to work, because my phone unlocked and now it doesn't know about the ready state. That's the problem with websockets: if your phone happens to lock, bad things happen. We tried, we failed, we
endeavor, we move on. Welcome to the Action Cable Symphony. So who am I, up in front of you in a Beethoven-looking wig and a tuxedo? First of all, it sounded fun, so I did it. Second of all, I used to be an artist and a musician, and I ended up becoming a programmer through a series of very unfortunate accidents. It all started with someone saying, you should try web development. Web development is a lot of fun. It's going to be great. Okay, so I got into it, and
then someone said the horrific words I should have said no to, which were: how about a back end? And that's how I became a developer, an operations person, and several other things. Currently I'm doing that over at Square, where I'm in charge of the architecture across the company, setting standards, defining things, and a lot of things I'm still learning about, quite honestly. But that's enough about that. So what exactly was that that you just witnessed?
It was a symphony played with an entire audience. Off my computer. Okay, I tried; you'll have to forgive me for that. I will show you all later. I promise it works. It's using Rails and Action Cable and various crutches and hacks to make it work, but it does work. More specifically, we were doing Beethoven's 6th, the Pastoral Symphony. The funny thing is, this song's actual name is "Awakening of cheerful feelings on arrival in the countryside", and I thought that was a really beautiful
thing. That's what we're going to do today: awaken feelings in this brand new world of websockets, of client latency, and various other things. So the more pertinent question here is how. Something like this, a symphony on smartphones, is a really, really hard task. So let's start with a bit of an overview. What exactly are we going to be covering here today? We're going to start by looking at what it takes on the server to make something like this work. How do you actually get clients to mostly behave themselves? And
how do we secure this thing, which is definitely a daunting task? Latency, my personal favorite, as we've just seen, and a finale to finish it up. So now for the first major component: we're going to look at the Rails server. We know it's using Action Cable, but how does that work to make music? To me, Action Cable was just for chat applications. What we start with is a thing called a MIDI file. For those not familiar, a MIDI file is kind of an old-school input/output file format that allows you to play music with a bunch of voices, soundfonts,
and everything else, which I'm not going to force you to download on your phone, because it's, like, megabytes in size. Now, what we do is convert this to JSON, into something we can actually parse, because binary files are not very friendly to the front end, so we try to work with something else here. This gives us the ability to get the tracks from the MIDI and treat them as their own separate entities. The problem is there's a lot more complexity there, like time signatures, control changes, voices, and other things that are very conveniently not a problem in either Beethoven's 6th or
Beethoven's 9th Symphony. I wonder how that happened. So overall it looks a little bit like this: a flow of data from the conductor all the way down to the individual lemurs that have to be playing that work. If we take a look at our conductor here, it brings us to the first interesting part of Action Cable, which is that there's no real hard requirement to use only one channel. We start with the conductor channel, which allows us to have an administrative interface for sending or receiving high-level commands like stop, play, go buffer music. And, just like a symphony
conductor, the conductor channel is in charge of the entire show. So what we do is break those MIDIs into separate tracks. We make a channel for each one of the instruments; that way you can listen to only the part that makes sense for you. So in this case we might have a channel for French horns, for violas, for violins, and each note of the track, as we broadcast it over the channel, is going to go to the associated MIDI channel and eventually to the instrument on your phone. Now players only listen for the parts they need, so you don't get the entire symphony. You just get that one
part. But then we have our last piece, the players themselves. When you see clients connecting, you don't know how many you're going to get. It looks like right now about 46 to a hundred. There could be any number: maybe 10, a hundred, or more; have mercy on my bill, please. And since each of these players is a distinct person, we can keep communications going directly back and forth to the clients. But let's take a look at those players. Whenever you first connect, you're not sure who you are yet, because you don't know what song you're playing. You
don't know what's loading, and that's the job of the conductor: to tell people what instrument they should be playing and to assign players to each one of the instruments. But they can only do that once they know, again, what song we're actually playing. This assignment uses a super advanced Ruby algorithm to determine the ideal placement for each one of the players, one that took weeks and weeks and weeks to perfect and honestly is the prize piece of code I've written in my career. Very delicately calibrated. But that brings up a very good point about good enough. So anyways,
once that command processes, we now have instruments on phones, which is what some of you have seen from the lemurs that started popping up on your phones. Now, once we know the assignment of the track, the players can start collecting a buffer of notes from the relevant MIDI channels before they start playing, so we can offset that a little bit. I guess the problem is is sitting exactly where I clean the air, but we'll figure it out. So our conductor can send them commands like play or potentially stop, and get back information
like what you saw on the dashboard over there. We probably don't want to start a song until every one of the lemurs is ready to go. So if we take a look back at that dashboard over there, we have the various parts and assignments, saying that there are roughly, say, 70 of you connected, a hundred and fifty of you that have assignments, and 105 of you ready. There's also latency information and exactly what instrument you are out there; I believe there are roughly 10 of each or something like that. So you can keep track of exactly what everyone is doing here.
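The flow described so far (one channel per instrument track, players assigned to instruments, and a ready check before playing) can be sketched in plain Ruby without Rails. This is only an illustration under assumptions: the JSON shape, the method names, and the round-robin assignment are hypothetical, since the talk shows neither the real data format nor the real assignment algorithm.

```ruby
require "json"

# Hypothetical shape for the MIDI-converted-to-JSON data; the talk does not
# show the real format, so these keys are assumptions.
MIDI_JSON = <<~JSON
  {
    "tracks": [
      { "instrument": "violin",      "notes": [{ "name": "C5", "time": 0.0, "duration": 0.5 }] },
      { "instrument": "viola",       "notes": [{ "name": "E4", "time": 0.0, "duration": 0.5 }] },
      { "instrument": "french_horn", "notes": [{ "name": "G3", "time": 0.5, "duration": 1.0 }] }
    ]
  }
JSON

# One logical channel per instrument: instrument name => that track's notes.
def tracks_by_instrument(parsed)
  parsed.fetch("tracks")
        .map { |track| [track.fetch("instrument"), track.fetch("notes")] }
        .to_h
end

# Round-robin assignment: an assumed stand-in for the talk's unshown algorithm.
def assign_instruments(player_ids, instruments)
  player_ids.each_with_index
            .map { |player_id, index| [player_id, instruments[index % instruments.size]] }
            .to_h
end

# The conductor should only start once every assigned player reports ready.
def everyone_ready?(assignments, ready_ids)
  assignments.keys.all? { |player_id| ready_ids.include?(player_id) }
end

channels    = tracks_by_instrument(JSON.parse(MIDI_JSON))
assignments = assign_instruments(%w[p1 p2 p3 p4], channels.keys)

puts "Channels: #{channels.keys.join(', ')}"
assignments.each { |player, instrument| puts "#{player} plays #{instrument}" }
puts "Ready to play? #{everyone_ready?(assignments, %w[p1 p2])}"
```

In a Rails application, each key of `channels` would presumably map to its own Action Cable stream, so a player subscribes only to the instrument it was assigned.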
Now, that all isn't to say we're not still using REST here. There are some endpoints for which it makes a lot more sense to use REST than others: things like what are my available songs, how do I log in, and other endpoints which run much the same. The thing about websockets is they don't replace REST endpoints. They just augment them; they give you the ability to do something more on top. So that raises a very good question, which is: does it scale? And I prepared an extra special little demonstration just for that.
Still got to do this. Okay. Well, refresh that then. Come on, you. Okay. Well, everything has now decided it just doesn't want to work today. That was supposed to be a very funny joke where it actually plays a musical scale, because I have horrible humor. But yes, very funny, very funny. In all honesty, there are legitimate concerns about scalability and how this scales over a certain number of clients, and there's a lot of research being done on this, especially by some of the folks sitting here in this audience working on AnyCable. I'd probably not be a great source of information to ask on this at the
moment, as this talk is mostly me telling Heroku to make my problems go away with auto-scaling. So that's glossing over a lot of information here, with some of it saved until later, like security and latency, which are their own fun little sections. So next up we have our clients, or everything that is going on on your phones. Now, originally I'd started with jQuery, prototyping this behavior, and managed to get a working proof of concept. The problem was, by this point I'd basically implemented a kind of patchy, hacked version of React by making component-like things, and
with something like Angular or Ember, well, it didn't quite make sense to do that. And I kind of wanted to learn React, and this was a very convenient excuse to do that. The first step, as you've seen, is connecting your phone to the interface. After that you get instruments assigned, and a friendly lemur wanders in to help you play music. But most of the actions taken come directly from listening to the conductor. The clients are listening for messages on the player channel that originate from commands sent to the conductor from the administrator, which is me in
this case, and things like assignment, playing music, and even keeping clocks in sync. Which is all well and good, but how in the world is that thing actually playing music? Well, it's using a magical little tool called Tone.js, and synthesizers. What happens is it listens to an instrument track and gets a series of notes that it wants to play. Now, some of these notes are artificially modified on phones, because I found out the hard way that phones don't have a bass register. They can't play cello or bass or anything else. So there was one time I was doing
this and lost the entire bottom section, just sitting there trying to figure it out, until I was like, what if we plug headphones in one? Found out, sure enough, it's playing just like normal. So phone speakers cannot handle bass music, a good thing to know if you want to try something like this. But the nice thing about Tone.js is that it has a feature to schedule a timeline of all the notes and events that happen in the actual sequence. The problem is it's only good at keeping time locally. Let's say they never thought that it'd be a good idea to scale this across multiple distributed phones, and they were very worried
whenever I asked them: hey, how is it possible to do this? They were like, are you sure? Kind of. I don't know. I'm hoping it works. But that's for another section of the talk. So what we do artificially here is add an offset to whatever time it's supposed to start at. But we'll get to that more on the latency front. That brings us to our next issue, which is what happens if a particularly mischievous lemur decides that they want to interfere with a connection or do bad things to it. Security is always an issue, and even with websockets it's still very much present. In the case of this administrative dashboard here
we're using Devise, and it would be rather bad if someone else could start playing music and orchestrating things without the proper authority to do so. Now, I'm not really using typical sessions here. We're using something else entirely, which works a little bit better with things like front ends. Those are called JWTs, or JSON Web Tokens, to take care of sessions. As with everything in technology, there are pros and cons here, and like every conference speaker, I'm going to conveniently highlight the pros, gloss over the cons, and pretend that it's not an issue. So JWTs are
interesting in that they're self-contained. That means no need to store information on the server. The entire session is encoded in the token. That also means they're stateless in nature, so we don't need to keep track of the session; the token already has all those things. Now, if that sounds hideously insecure to you, good. You have a future in information security. We should talk later. Joking aside, though, there's that last item, which is the fact that tokens have to be signed. Stateless and self-contained. So take a look at a token. It looks like a really hot mess of
text. It's actually Base64 encoded, and if we color that a little bit, we'll see that there are actually three distinct sections there, always separated by a period. Or, a little bit more succinctly, A, B, and C. What these actually are is the header, the payload, and the signature. The header holds information like what algorithm was used to sign this token and what type of token it actually is. The payload, which is where a lot of security concerns come in, holds things like: who am I? What am I? What should I be allowed to do, and what
permissions do I have? We can basically send over the entirety of what the client should know how to do, and hopefully will do, as a result of this. But that brings us to the next section, which is the signature. It is basically taking the cryptographic hash of both the header and the payload and appending a secret, which is why it's actually secure: on the back end, you're the one signing it. So hopefully you don't have your Rails application secret leaked; otherwise you have bigger issues to worry about. So these three things are the entirety of what the user should really need to run an
application like this. But that still raises a good question: what if a mischievous lemur decides, we're going to tamper with this and have a little bit of fun? Let's say, for instance, we have the classic example that was used for strong parameters. This mischievous little lemur decides: I want to be an administrator. So I'm going to edit my token and tell it that I'm an administrator, re-encode it, and send it back to the server, and I'm going to have gotten permission to do whatever in the world I want to do. But remember, there is that signature there as the last
bit, which means as soon as it gets to the server, that's going to get detected real quick. If we take a look at our tampered token again and at the original checksum, those two don't match, so we know that token has been modified and we know that we need to revoke it. Which does bring us to one of the issues of JWT: because it's stateless, if you have to revoke a token or something like that, you have to introduce state to actually stop people from logging in, which means that the majority of the nice features you get kind of go away there. Or basically, we're saying: you shall not
authenticate. But as with everything, there are always holes. For JWT, it's the deny list of tokens: you still retain some of that state on the server and have a good deny list. But if that list goes down, it's open season. You're going to have a very bad day. Now, there's really no such thing as a silver bullet, and believing in one leads to a lot of frustration later on. But the nice thing about being stateless is that you don't have to verify sessions all the time. You just have to check a cryptographic token and whether or not that's there. We've had issues in the past
in operations where auth went down and all these things were brought down, entire services, because that's a single point of failure. Bye-bye, you're not logging into PlayStation. Now, that brings me to perhaps the most annoying, but certainly one of the most fascinating, parts of this talk, which is latency. As it turns out, getting everything to play on time is actually really hard. From the start, as we saw a little bit earlier, networks are by their very nature inconsistent, and clocks even more so. Every additional step that we add gives another hop, which means more time that things are going to take.
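Because every hop adds a variable delay, a client cannot simply read the server's time and trust it. One common way to cope is to time a round trip and assume the server answered at its midpoint, then average over several exchanges. The following is a generic sketch of that idea in Ruby, not the algorithm of any particular library; the `Sample`, `estimate_offset`, and `SyncedClock` names and the sample numbers are all hypothetical.

```ruby
# One timed request/response exchange. All times are in milliseconds.
#   sent_at:     client clock when the request left
#   server_time: server clock when it answered
#   received_at: client clock when the response arrived
Sample = Struct.new(:sent_at, :server_time, :received_at) do
  def round_trip
    received_at - sent_at
  end

  # Assumes the trip out and the trip back took equally long, so the server
  # stamped its time at the midpoint of the round trip as seen by the client.
  def offset
    server_time - (sent_at + round_trip / 2.0)
  end
end

# Average the offsets of the fastest exchanges; slow round trips are noisier.
def estimate_offset(samples, keep: 3)
  best = samples.min_by(keep) { |sample| sample.round_trip }
  best.sum(&:offset) / best.size
end

# A clock that shifts local time onto the server's timeline.
class SyncedClock
  def initialize(offset_ms)
    @offset_ms = offset_ms
  end

  def now_ms(local_ms = Process.clock_gettime(Process::CLOCK_REALTIME, :millisecond))
    local_ms + @offset_ms
  end
end

# Made-up measurements: three quick exchanges and one slow outlier.
SAMPLES = [
  Sample.new(1_000, 1_540, 1_080), # round trip 80,  offset 500
  Sample.new(2_000, 2_520, 2_040), # round trip 40,  offset 500
  Sample.new(3_000, 3_800, 3_400), # round trip 400, offset 600 (discarded)
  Sample.new(4_000, 4_530, 4_060)  # round trip 60,  offset 500
].freeze

clock = SyncedClock.new(estimate_offset(SAMPLES))
puts "Estimated offset: #{estimate_offset(SAMPLES)} ms"
puts "Server-aligned time: #{clock.now_ms} ms"
```

Discarding the slowest exchanges is a design choice: a long round trip is more likely to be asymmetric, which breaks the midpoint assumption and skews the estimate.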
So every phone has a very distinct time, and one would think that these things are consistent, or at the very least mostly consistent. For the most part, for everyday use, chances are you're not going to have a lot of trouble with being approximately a hundred milliseconds off. If you glance at your clock, you're not going to be late for a meeting. You're not going to be late for a job. You're not going to be late for much of anything if you have that type of resolution. So that's what we can accept there as being pretty much good enough. Except for music. If it were a second off, you'd hear it quite loudly, and it'd be very
unsettling. So how do we take all these disparate phones and give them an idea of what consistent time is? It's exactly the question that worried me whenever I proposed this talk in the first place, and a very hard problem to contend with. If we were to play the same note against each of those clocks we saw previously, we'd end up with something sounding like three phones at slightly distinct intervals, and a sound that's a lot more like a stutter than an actual solid note. So it's kind of complicated, this wibbly-wobbly, timey-wimey mess that takes a lot of sorting out. For things to be
consistent, we require a single source of truth as to what time it is, something we can trust to be a good source. Which is why I was very happy whenever I found out that someone had already done this hard work for me, in a thing called timesync.js. It has options for peer-to-peer and server syncing; in our case we're using the server for now. It takes care of things like client-to-server round trip time and deviation, and it is even nice enough to give us a callback when that offset happens to change, which is why your phones were all saying things like: I'm 20 milliseconds, to negative 800,000 milliseconds, off the
server. I'm whatever amount of time off the server. So we add a timestamp endpoint to our Rails server. It only responds with the time, plus some hacks, but basically what it's doing is asking the Rails server: what exactly is the timestamp? What that does is run computations on an average set of timestamps sent back and forth and say: okay, how far off am I from this actual clock? That gives me a lot more consistent interface to work with, and the ability to make a clock class that we can ask what the current time is with that offset taken into account. So
remembering what the previous slide was, we have notes that are overlapping, sure, but contain some inconsistencies. Once we use timesync, we can offset that and get at least a fairly decent approximation of what the current time is. It won't be exact, but I'm sure that it's close enough. It'll sound like a mostly cohesive orchestra. Now, there are always ways we could make this more exact, but that requires atomic clocks and a lot of research on things like the Google Spanner paper. So in the end we get phones that are in sync, but that brings up a good question about the servers themselves. Which is:
if we happen to have load balancing, if we happen to have auto-scaling, if we happen to have all these things, all the servers are going to have different timestamps, aren't they? It turns out there's a really interesting thing called NTP which takes care of this on the back end. In the case of most server providers, like Heroku, AWS, and everyone else, they're running this, and GMT, to ensure the clocks are kept mostly in sync, which is really an under-appreciated offering. But can you imagine what type of chaos there'd be if they were out of sync? Say a bidding site has a lot of offers on two different items
within a fraction of a second, and one server happens to be just a couple seconds ahead, causing that item to actually go to the person who was slower, but because their server was faster, they got the item. It would be chaos. That brings us to a very good question about real-time software: what exactly is real time? And no, I don't mean like humans versus lemurs, but real time out in the real world. In movies, we've noticed over the last decade a movement from 30 frames per second, which is about 33 milliseconds per frame, to 60 frames per second, which is about 16 milliseconds per frame,
which I know I found very unsettling whenever that happened. I was trying to figure out what in the world was wrong with movies, and add on top of that 3D, and it was a very nauseating experience. We can certainly detect this difference now, but as it turns out, our minds are very good at filling in the gaps. If you took the reverse, you'd notice it very clearly, but until you knew 60 frames per second was a standard, you probably wouldn't know the difference there. In games, it's all about chasing down lag time to get more precise experiences. Ideally we want under 10 milliseconds, but 20
milliseconds probably won't get anything bad happening. Sometimes those few milliseconds, though, make a big difference between winning and losing a game, so a lot of gamers are very understandably annoyed with this. It requires a very precise server to sort this all out and make sure everyone and everything on the screen is at least some approximation of real time, the more the better. In sports, real-time reactions are actually a lot slower. These are for top sprinters in the world, and they're still between about 150 and 200 milliseconds. To us that's instantaneous. I'd hate to be on the receiving end of a foot race
against one of those people. Not very fun. But by the time they register a starting signal going off, it takes a fraction of a second at least to register that it means go. For a symphony orchestra, you'd think that it would be a lot closer to exact time, but as it turns out, and perhaps kind of amusingly, it's really not. That inconsistency is by its very nature what gives orchestras the sound that they have. It gives them this nice little timbre, a colorful feel. That's why a live orchestra sounds the way it does and a MIDI sounds very robotic and exact. Those little inconsistencies in time turn into the
character of the song, of the orchestra, almost like a signature, which is why every orchestra sounds a little bit different and why you get a lot of different feelings from each variant of a single song. So why exactly would a conductor be waving around a giant white stick up on stage, then? It doesn't make much sense. Well, it turns out that people are a lot faster at reacting to visual stimulus. And if you happen to be in the audience with a cell phone playing in an orchestra, you're probably not going to be able to hear all the other people around you. It's the same with
an orchestra: you can only tell by looking at a central source that says this is what time it is, this is what beat it is, and that's how they know when to play on time. Otherwise it can be kind of a mess, and I know from playing in orchestras before that you have no idea what's going on and you hope it sounds good to the audience. Don't worry, it does most of the time. Humans aren't great at being exactly on time for anything, and oftentimes they really don't need to be, because real time probably isn't necessary. Which brings up a very good point: right when Action Cable was first
being considered, DHH said, and rightfully so, that he could make a polling application for chat, and the person on the other end would have absolutely no idea that it wasn't happening instantaneously, because of those little things like "person is typing" or those little dots that pop up in various areas. But the point is, you really don't need such a strict resolution window for real time. In this case, you probably only need good enough, because with each level of precision you gain, it ends up being diminishing returns. Granted, I could stand to be a little bit more precise in
this implementation here, but it does the job. Now, I did intend to do a finale here, which was Beethoven's 9th. But unfortunately, it seems like the music is not quite working, so I'm going to have to do this later for whoever would like to hear it. I do apologize. I will try and see if I can get that working later. But the reason I chose that song is because of joy, because Ruby brings a lot of us joy, which is why we're here today with all the experiences we've had, and to me that's a very beautiful thing. I can stand up here on stage in a tuxedo with a wig, with a baton, with coconut
shells, with Lord knows whatever else I have in this bag up here, and conduct a symphony orchestra on cell phones, because we enjoy that. We get joy out of that. We enjoy whimsy and beauty and all of these things, which is what really drives me in Ruby. So, to wrap up: if you want to find out more about the lemurs, where they are and where they are going next, feel free to follow me on any social network site, Twitter being the one that's inconsistent. There is a very fun story behind that which you can ask me about later on, but let's just say I'm not getting baweaver on Twitter.
And yes, people still do use IRC. It's actually really nice. And of course there's the new Mastodon instance, ruby.social. You might have noticed that I played a little game with stickers, the four-pack of lemurs. Those stickers are back. Tomorrow we're going to start having various Square engineers running around with stickers, or stickers hiding all over the place. If you want to find out where they're hiding, take a look at lemurs railsconf, and I'll try to post pictures of who exactly has which lemur. It's your job to find them
though, and good luck, because some of them are really introverted, which makes it very interesting. Now, if for some reason you do manage to find all the lemurs, there is a special prize at the Square booth, and there's a raffle associated with this. Oh, yes, we actually do have fun things to go with this, but I figured it's a lot more fun collecting stickers, because I like stickers. Do you like stickers? Big black stickers. So, as far as credits: this type of talk doesn't happen without a lot of people really helping me out. Some of them are sitting here in this room: people
who worked on Action Cable, people who worked on AnyCable, people who listened to feedback, people who listened to me say, hey, I want to play a symphony orchestra up on stage, and said that sounds awesome, or said that's completely crazy, what do you want? But it brings up a very good point, which is that the beauty of the Ruby community is that we get in on stuff together. We build together, we work together, we learn together, and that's the beautiful thing: I can reach out and ask any number of people for help on these things, ask questions, get feedback. And you can do that too,
just by reaching out to the person next to you, introducing yourself, saying hi. I mean, who knows, it may be someone you end up spending years down the road with, or even working together. That's the beauty of these conferences: you can meet so many different people, and that's another part of the joy of Ruby. But I've been prattling on for long enough now, so thank you for your time.