Instructor for beginner-level web application development. 20 years of professional software development. Speaker at technical conferences throughout the year. Author.
About the talk
RailsConf 2019 - Modern Cryptography for the Absolute Beginner by Jeffrey Cohen
Modern life depends on cryptography. Did you get cash from an ATM this week? Buy something online? Or pick up a prescription? A cryptographic algorithm was needed to make it happen!
Increasingly, developers need to become familiar with the essentials of encryption. But MD5, bcrypt, DES, AES, SSL, digital signatures, public keys - what are they for, and why do we care?
Armed with only a vanilla Rails application and beginner-level Ruby code, this talk will demonstrate the key ideas in modern cryptography. We will also take a peek ahead to quantum computing and its implications on cryptography.
Alright, welcome everyone. Thank you for making your way all the way to the far side of the hall over here. My name is Jeffrey Cohen. I know we're getting started a little bit late. I'll do my best to still end at about 11:30. As for questions, if I'm right up against the clock, I'll hang out here for as long as you guys want to take questions afterwards. That might be a little bit easier than trying to extend the time of the session, so we can all get to lunch.
Welcome to RailsConf. I think this is my fifth RailsConf. For those of you for whom this is your first time, welcome, and I hope that this conference is as meaningful for you as it's been for me over the years. I've been working with Rails since the beginning, about 2006. I consult on projects that come under some kind of regulation, so HIPAA, PCI. I also work with companies on building mentorship and apprenticeship programs. One of the common topics that has come up in both of those endeavors has been questions about cryptography, and it was also very new to me. I'm not a mathematics kind of person. I got into programming without a computer science degree and realized that I was also interested in learning the basics.
So this is going to be a beginner-level talk. I'm just going to tell you a story about my path, about how I started to unlock in my mind how the most common uses of cryptography crop up in everyday programming, especially in Ruby programming. So that's what this is, and if you feel five minutes in that it's not for you, then that's totally cool. There are a lot of other great sessions too, but hopefully this will be helpful. In this talk I'm going to end up talking about how public key cryptography works, but without the mathematics. For those of you interested in some of the gory details of the math, I'm happy to talk about that immediately afterwards. I'm going to try to keep the math somewhat light for the talk.
In 1586, Mary, Queen of Scots, was found to be plotting against Queen Elizabeth. She was sending enciphered messages to her co-conspirators. The messages were intercepted, and she did not live much longer. This is sort of the cheat sheet from the folks who were working out the cipher that was discovered.
So for a very long time, the idea of keeping things secret has been of paramount importance to governments and individuals, and for a long time it kind of stayed the same. You basically tried to make up some secret way to communicate and hoped that it wouldn't be figured out. Things pretty much stayed that way for thousands of years, until encryption became mechanized.
A lot of you probably recognize this machine. This is the Enigma machine, probably the most famous example of cryptographic machinery. It was eventually broken by a famous British mathematician who happened to invent the notion of a general-purpose computer along the way. You guys know who I'm talking about? You got it, right.
So nowadays we depend on cryptography for everything. Recently we all got these credit cards with the little chip in them. Like, wait a minute, how does that chip work? Is that actually better than the magnetic stripe that we've been using? The rules of cryptography and the advancements in computerizing cryptography are what's now enabling our modern society. It's hard to even start to think about what would happen if we didn't have these abilities.
So let me go back a little bit to the history of what we're really talking about here. Formally, we think about it in this way. By the way, for this chart and a few other slides, I want to credit a book by Simon Singh. If you're interested in this topic at all, there's a book called The Code Book, from 1999, that is fantastic and taught me a lot about what I'm going to be talking about today. Ciphers, which is what we use right now, letter-by-letter encoding, are only a very specific branch of cryptography, and that's the part of cryptography we'll look at, because that is what we use every day.
I also want to say up front that cryptography is not the same as security. You can be pretty good at encrypting and decrypting your data and yet not be secure. Security is a whole other umbrella topic. There are some other great talks at this conference about security. I'm just focusing on that very small piece of it, which is cryptography.
There are two primary use cases for cryptography: verification and secrecy. The verification one surprised me. I just thought it was for secrecy, but actually verification is important too, so let me start with that one. There are actually two sub-cases for verification: message tampering and authorship.
So let me start with message tampering. This is basically asking the question: how can we verify that a message was transmitted and didn't change along the way? Sometimes you hear the word tampering and think someone's intentionally messing with the data, but that's not the entire situation. Those of us who started programming back when there were dial-up modems had to worry about just the bits getting mixed up along the way as they came down the phone line.
So, just a quick history lesson here. Let me take the number one use of the internet today, which is apparently looking for cats. If I were to Google for the word cat, we all know that in ASCII we translate that into binary. I'm sure you could all come up with the ones and zeros that represent that word. So imagine the letters C, A, T are trying to come down the wire to you. How do you know that you are actually receiving the message that was intended? Maybe they meant to send the word fat or rat, and yet you got cat. How do you know that's correct?
An early answer was: hey, one way we could verify is something called parity. Here's an example of even parity. What you do is, in each byte, you make sure that there's an even number of ones in that byte. So in the first byte, for the letter C, there would normally be three ones. We use that leftmost bit, the most significant bit, and we flip that as needed so that the number of ones in that byte is always even. Now, you had to pre-agree with whoever was sending the data whether they were using even or odd parity. If they were using odd, then it would look kind of like that.
So this was one way to verify that the data you downloaded matched what was sent. But of course you can easily see this doesn't really handle all the cases. There are other letters where you would have an even number of ones in the byte, and you would think that it's correct even though it's not. So it wasn't perfect, but it was an attempt to think about how we can verify that the data coming in is the data that was intended, and that struggle remains today.
For me, in my story, my path to understanding the other ideas of cryptography actually started with the notion of check digits.
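The even-parity scheme described a moment ago can be sketched in a few lines of Ruby. This is only an illustration, assuming 7-bit ASCII characters with the most significant bit used as the parity bit; the method names `ones_in`, `with_even_parity` and `even_parity_ok?` are made up for this example.

```ruby
# Count the 1 bits in an integer.
def ones_in(byte)
  byte.to_s(2).count("1")
end

# Take a 7-bit ASCII character and set the leftmost (most significant)
# bit when needed, so the byte always holds an even number of 1 bits.
def with_even_parity(char)
  code = char.ord & 0x7F
  ones_in(code).odd? ? (code | 0x80) : code
end

# The receiver's check: an odd number of 1 bits means something changed.
def even_parity_ok?(byte)
  ones_in(byte).even?
end

"CAT".each_char do |char|
  puts format("%s -> %08b", char, with_even_parity(char))
end

sent = with_even_parity("C")
one_bit_flipped = sent ^ 0b0000_0001
two_bits_flipped = sent ^ 0b0000_0011

puts even_parity_ok?(sent)             # true
puts even_parity_ok?(one_bit_flipped)  # false: the error is caught
puts even_parity_ok?(two_bits_flipped) # true: the error slips through
```

The last line shows the weakness mentioned in the talk: flip two bits and the count of ones is even again, so the damaged byte looks fine.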
In the late 60s, cash registers looked like this. You would be buying your apples, and they would ring it up manually. Eventually the store owners said, you know, sometimes we are making mistakes when we enter the price. Also, there's no good way to keep track of the inventory just by entering the price. What if we enter a product code instead? So we'll enter product code 123 for apples. The problem is that it's easy to make a mistake and enter 124. So they said, I know what we'll do: we'll add another digit to somehow verify that the first three digits were correct.
That's kind of a weird idea. But you might imagine, okay, what if we just add up the first three digits? If the product code is 123, we'll make that 6, so the real product code would be 1236. So now if I type in 124, it's caught. But if I type in 321, it's not caught. So you actually have to think about doing more than just summing up the numbers, and you end up with today's crazy mathematical formula for UPC codes. There's a check digit at the very end.
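Both schemes can be sketched in Ruby. The first method is the naive "add up the digits" idea from the story; the second follows the standard UPC-A weighting (odd positions count three times). The method names and the sample 12-digit barcode are assumptions chosen for illustration, not taken from the slides.

```ruby
# The naive scheme: the check digit is the sum of the digits
# (kept to a single digit here by taking the remainder mod 10).
def naive_check_digit(code)
  code.chars.sum(&:to_i) % 10
end

# UPC-A: weight the 1st, 3rd, 5th... digits by 3 and the rest by 1,
# then pick the digit that brings the total up to a multiple of 10.
def upc_check_digit(first_eleven)
  weighted = first_eleven.chars.each_with_index.map do |digit, index|
    index.even? ? digit.to_i * 3 : digit.to_i
  end
  (10 - weighted.sum % 10) % 10
end

# A 12-digit barcode is valid when its last digit matches the
# check digit computed from the first eleven.
def valid_upc?(barcode)
  barcode.match?(/\A\d{12}\z/) &&
    upc_check_digit(barcode[0, 11]) == barcode[11].to_i
end

puts naive_check_digit("123")  # 6, so the full code is 1236
puts naive_check_digit("124")  # 7, the typo is caught
puts naive_check_digit("321")  # 6, the reordered digits are not caught

puts upc_check_digit("03600029145")  # 2
puts valid_upc?("036000291452")      # true
puts valid_upc?("036000219452")      # false: two neighbouring digits swapped
```

The weighting is what lets the UPC formula notice most swaps of neighbouring digits, which the plain sum cannot do.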
I've listed the algorithm there, but basically you go through and just do mathematics. You take a remainder, you might even have to manipulate the remainder, and that's your check digit. So during the conference you can pick up any Coke can or anything that's got a barcode on it, and it will probably follow this algorithm.
In fact, there are a lot of words now that actually just mean check digit. This was my aha moment. When we're talking about hashes or digests or fingerprints, they're just checksums against some body of content. You can take some body of content, whether it's a barcode or an entire novel, and create a check digit that's effectively unique to that particular content.
One that Rails developers are probably familiar with is bcrypt. So here's the password that I use on all of my banking sites, just for an example, and here's what you get from bcrypt. I always wondered, what is it doing? How in the world did it come up with that? But the main thing to know is that reversing that is impossible. That's why we use a one-way hash. If I showed you that barcode and gave you the check digit and said, now tell me the other 11 digits in the barcode, you'd be like, oh man, that's not fair. It's the same thing with bcrypt. If I give you that and ask what the password was, you'd say, I don't know, that's impossible. And that's the whole idea of using a hash. For security, where you just need to check that some incoming content is correct, but you don't need to know what that content was, that's where a one-way hash is helpful.
Let's talk about symmetric encryption. Suppose I have a simple message I want to send to you. I think of some super elaborate scheme in which I encode it so that my friends can't decode it. Like, I'm just going to advance all the letters by one. This is the so-called Caesar cipher. It's been around for a while, as you can tell, a very long time. That rule that we used to do the encryption, advance by one, the fancy word for that is a key. So in this case I just have a key that happens to be advance by one, and the good thing in this case is that it's reversible, so that whoever I'm sending my message to can figure out what I was originally intending.
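The contrast between the two ideas above, a one-way digest and a reversible shift cipher, can be sketched with Ruby's standard library. SHA-256 stands in for "some hashing algorithm" here (bcrypt is a separate gem and is not used); `fingerprint`, `caesar_shift` and `caesar_unshift` are hypothetical names for this example.

```ruby
require "digest"

# One-way: the digest has the same length for a word or a whole novel,
# and there is no method that turns it back into the input.
def fingerprint(content)
  Digest::SHA256.hexdigest(content)
end

# Reversible: shift every letter forward by `key` places, wrapping
# around at the end of the alphabet. Other characters pass through.
def caesar_shift(text, key)
  text.gsub(/[A-Za-z]/) do |letter|
    base = letter.match?(/[a-z]/) ? "a".ord : "A".ord
    ((letter.ord - base + key) % 26 + base).chr
  end
end

# Undo the shift by moving the same number of places backwards.
def caesar_unshift(text, key)
  caesar_shift(text, -key)
end

puts fingerprint("cat").length            # 64 hex characters
puts fingerprint("cat" * 100_000).length  # still 64

secret = caesar_shift("Meet me at noon", 1)
puts secret                     # Nffu nf bu oppo
puts caesar_unshift(secret, 1)  # Meet me at noon
```

Anyone who knows the key (advance by one) can reverse the cipher, while knowing the digest algorithm gives no way back to the original content.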
You may have heard of some algorithms such as DES or Triple DES, or AES 128 or 256, or Blowfish. Look, there's a whole gamut. Wikipedia is your friend if you really want to know all about symmetric encryption. Often we need to use symmetric encryption, because you need to be able to decrypt things.
Of course, do not use symmetric encryption for passwords. You might think, well, I need to know what the password is so that when they log in I can verify that what they typed in is correct. That's actually a bad idea, because if you're ever accused of letting your passwords be stolen, you can say, no, no, it's okay, I've done this with symmetric encryption. And they would at least say, oh, who can decrypt it? And you would say, don't worry, only me. And you're in big trouble. You don't want to be in that position. Use a one-way hash for passwords. But for cases where you do want to be able to decrypt, which is very often, you want to use symmetric encryption.
But there's a problem, which Mary, Queen of Scots, faced: how do I transmit the key? How do I let the receiver know what my algorithm is? Otherwise, they won't know how to decrypt it. This was the case for literally thousands of years, until the early 70s, when some mathematical breakthroughs really did the impossible, and our modern society is now based on what we call public-key cryptography.
The idea here is that we're going to use two keys. Each key transforms data. But the special thing is that we call them a pair, because, sort of mathematically and magically, they can reverse the effect of each other. So another aha moment for me was that with these two keys, one key is arbitrarily selected to be the public key, and you simply keep the other one as a private key. I always thought there was some magical thing. You run ssh-keygen like GitHub tells you to, and okay, it gave me a private key and a public key. I guessed there was some real reason behind that. It's pretty much an arbitrary distinction. I'll show you in a minute.
So when you have two keys and I want to encrypt something, I pick a key, and it will transform the message into something almost impossible to decipher. In fact, I know that the only way to decipher it is with the other key of that pair. If I don't have the other key of that pair, I know that this is going to be unbreakable, until quantum computing, which we'll save for the end of the talk. Once I've encrypted it, I can decrypt it by simply selecting the other key. Both of them transform data. They just so happen to exactly undo the effects of each other, but it doesn't matter which one you start with. It doesn't matter which one you call public and which private. That's an arbitrary decision.
So here's an example. Two friends: one's out hiking, the other's driving, and they want to meet up for lunch. So Mr. A wants to send a message to Mr. B using that brand new cool smartwatch. While he's hiking, he can just type in a couple of things and somehow everything gets encrypted and sent over: "Hey, I'll meet you at noon for lunch. Your friend, Mr. A." So here's how we would use those two keys to do this securely. The first step is the one that I always missed, and I did not understand what was going on until I figured this stuff out.
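Before walking through the exchange, the claim that the two keys of a pair undo each other can be tried out with Ruby's bundled OpenSSL bindings. This is a sketch under stated assumptions: a 2048-bit RSA key is chosen only to keep the example quick, `generate_pair` is a hypothetical helper, and the "start with the private key" direction is shown through the library's `sign` and `verify` calls, which hash the message before applying the key.

```ruby
require "openssl"

# In Ruby's OpenSSL bindings the generated object holds both halves of
# the pair; `public_key` returns a copy with only the public half.
def generate_pair(bits = 2048)
  private_key = OpenSSL::PKey::RSA.new(bits)
  [private_key, private_key.public_key]
end

private_key, public_key = generate_pair
message = "Meet you at noon for lunch"

# Direction 1: transform with the public key, undo with the private key.
scrambled = public_key.public_encrypt(message)
puts private_key.private_decrypt(scrambled)  # Meet you at noon for lunch

# Direction 2: start with the private key instead, check with the public key.
digest = OpenSSL::Digest.new("SHA256")
signature = private_key.sign(digest, message)
puts public_key.verify(digest, signature, message)                      # true
puts public_key.verify(digest, signature, "Meet you at one for lunch")  # false
```

The public half on its own cannot undo what it scrambled; only the matching private half can.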
If Mr. A wants to send a message to Mr. B secretly, Mr. A uses Mr. B's public key. I always thought we had to use Mr. A's keys or something, that the keys I've generated are the ones I should use to encrypt. Not true. I use Mr. B's public key. So Mr. B is going to receive some encrypted message. Who can decrypt that? Hopefully by now it's pretty obvious that Mr. B can decrypt the message with Mr. B's private key, right? The keys in Mr. B's pair are mathematically linked, and they exactly undo the effect of each other. So if I want Mr. B to be able to decrypt, I need to use Mr. B's public key to encrypt, and that's why one is public and one is private.
By the way, I've also gotten questions sometimes like, how do I guard the public key? You don't. It's public for a reason. It's okay. Everybody has it. That's all right. So in this case, Mr. A needs that public key, and that's how we can securely send messages without pre-deciding on an algorithm, without having to transmit some key beforehand.
The next use case is authenticity. A long time ago, the way that we knew that the orders to the army at the front came from the king was that there was physically a wax seal that would go on the scroll of paper. But this wasn't just decorative. The design of that seal was unique. The King of England, say, or the Queen of England, had a unique seal that could be recognized. Reproducing it was nearly impossible, and it was only in one place, which is on the ring, which in theory was on the hand of the queen or the king. So if you received something that you could verify had not been opened and had that particular design, you knew it had to come from that hand. That's how we would verify that something was authentic.
It's one thing to receive a message: "Hey, meet me at noon for lunch. Signed, Mr. A." But maybe it's not really Mr. A. Maybe it's Mr. Z. So knowing not only what the message says, but also verifying that it came from who it says it came from, is super important. If I go to amazon.com to buy some books, but I'm not really at amazon.com, I'm giving my money to someone else. How do I know? How do I know when I go to google.com that it's really Google? There are all sorts of DNS and networking tricks (I'm not a networking expert) that I know could be played on me, that could present a page that looks like one of those sites, even with that domain name. How do we know? Most of our friends and family may not realize you can click that lock in the browser and you'll get what they call a certificate, and that's supposed to prove the authenticity of the publisher of that site. You can click through and actually see the cryptographic hash that's in there. The way that works is almost exactly what I'm about to describe here.
The way that Mr. B can verify that Mr. A was actually the author goes something like this. After receiving that same message from Mr. A... sorry, just before Mr. A sends that message out, Mr. A is also going to calculate a digest on that content, using something like MD5 or one of the SHA family of digests, SHA-1 or SHA-256, like we use for Git. Whatever you want to use is pretty good. In the end, you calculate that digest. Instead of a single digit like we had with the UPC code, it'll be maybe 16 characters or 32 characters. It's a fixed length regardless of how large your original content was. So you calculate your digest using one of these hashing algorithms, and you then encrypt that digest with your private key.
This is the equivalent of the wax seal. You're taking something that's private, that ring that's on your finger that nobody else has. We're taking the digest, that is, the check digit, the thing that we use to know that the original content has not been tampered with, and we're going to encrypt just the digest with the private key. Everything else has been encrypted with Mr. B's public key, but the digest we encrypt differently. Sometimes this is called a digital fingerprint, or sometimes it's just called part of the certificate. It's really just a hash that's been encrypted with the private key.
So now Mr. B receives the message as well as the digest that's been encrypted. The message part we already talked about: that can be decrypted with Mr. B's private key. But how do you verify that the wax seal that's coming to you is from the person that you think it is? Again, it's actually pretty straightforward. Mr. B can decrypt that using Mr. A's public key, because Mr. A used their private key to encrypt the digest. Why would we do that? Because the public key is available to everyone. Everyone can decrypt that wax seal, so everyone can verify who it came from.
Then, having decrypted that hash, Mr. B independently calculates the hash of the content that was received. So you decrypt the content, you calculate your own hash, and you compare the hash that you came up with against the hash that was transmitted, signed and sealed by the original sender. If they match, then an amazing thing has happened: you have received a secret message and you have verified who it came from, all in a way that no one else would ever be able to decipher or break.
So for both authenticity and secrecy, it turns out that public key cryptography solves, amazingly, all of the cryptography problems that we had for really thousands of years. We are living in a weird time right now where we can electronically perform this kind of computation. Before public key cryptography, if you needed to move money between banks, for example to send money from Chicago to New York, they did that over electronic lines, but the banks in Chicago and New York each had to know what the encryption and decryption scheme was. So there had to have previously been some secret conveyance of that to the banks before that could work. As you can tell, this doesn't really scale. Having to mail out or deliver the secret codes to all the banks all over the world, and change them on a rotating basis, is almost impossible. It was really not until the seventies, when the RSA algorithm and the whole idea of public key cryptography took hold, that everything was allowed to explode in terms of computerization and letting computation take over cryptography.
So far, so good. I'm almost about to wrap up, so we're doing okay on time. Now, just a minute for those of you who might have actually been paying attention. This is the line I found on the help page from GitHub, under "here's how you should generate your SSH keys." It basically says: I'm going to generate two keys using the RSA algorithm with a bit length of 4096 bits. You've heard this term, bit length. How many bits are you supposed to use when you're generating these things? This number has changed, and the recommendation has changed from time to time. It's been 512, 1024, 4096. Who knows if it'll ever stop. What is this really referring to? Why does this number matter?
The way I think of it is that when you're encrypting something, let's say the text of a book or an email message, what first happens is that we take that message and think of it in binary form, just like we saw earlier. Then you take, say, 10 or 12 bits at a time and scramble them, then move on to the next section of 10 or 12 bits and scramble those. This is a very common way of encrypting data, called a block cipher, because you take blocks of data at a time.
RSA doesn't work like that. RSA has to take your entire message as one block and encrypt it. So if I were to specify a bit length of 16, I would only be able to encrypt a two-byte message. Not so useful. Instead, 4096 gives you a lot more characters to play with, but it's still pretty limited. So for practical use, if I wanted to take a really big book and encrypt it and send it to someone, I couldn't use, for example, my private key or my public key to encrypt that.
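The size limit is easy to observe with Ruby's OpenSSL bindings. A minimal sketch, assuming a 2048-bit key (256 bytes) and the library's default PKCS#1 v1.5 padding, which uses up 11 of those bytes; the padding overhead is an implementation detail not covered in the talk, and `fits_in_rsa?` is a hypothetical helper name.

```ruby
require "openssl"

# Try to RSA-encrypt a message directly and report whether it fit.
def fits_in_rsa?(key, message)
  key.public_encrypt(message)
  true
rescue OpenSSL::OpenSSLError
  false
end

key = OpenSSL::PKey::RSA.new(2048) # 2048 bits = 256 bytes

puts fits_in_rsa?(key, "x" * 245)    # true
puts fits_in_rsa?(key, "x" * 246)    # false: one byte too many
puts fits_in_rsa?(key, "x" * 50_000) # false: a book chapter has no chance
```

Doubling the key size only doubles the room, which is why RSA on its own is a poor fit for large content.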
It's just way bigger than what 4096 bits would give me to work with. So this doesn't really work. You can actually only encrypt messages that are short.
The key length here, by the way, is actually not the length of your RSA key. If you open up the public and private key files that you get when you run ssh-keygen (they are just text files, by the way, you can open them up and they are readable), you'll find that the length of those files is not 4096 bits. And I was like, what in the world is going on? It turns out that what they call the key length is actually the length of a mathematical number called the modulus, which gets generated and plugged into this gigantic mathematical formula. That's really what this length of 4096 is. Because of the math involved (there are exponents and modular arithmetic), increasing the bit length by only a little bit actually gives you an amazing amount of power.
So anyway, the thing with RSA is that it's not actually encrypting all your content itself. Asymmetric algorithms like RSA are also much slower than symmetric ones.
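Because RSA cannot take a large message in one go, one common arrangement is to let RSA protect only a short random key while a symmetric cipher handles the bulk of the data. A minimal sketch, assuming AES-256-GCM as the symmetric cipher and OAEP padding for the RSA step (both are choices made for this example, not something specified in the talk); `hybrid_encrypt` and `hybrid_decrypt` are hypothetical helper names.

```ruby
require "openssl"

# Encrypt any amount of data for the owner of `public_key`.
def hybrid_encrypt(public_key, plaintext)
  cipher = OpenSSL::Cipher.new("aes-256-gcm")
  cipher.encrypt
  session_key = cipher.random_key # fresh random symmetric key
  iv = cipher.random_iv
  cipher.auth_data = ""
  ciphertext = cipher.update(plaintext) + cipher.final

  {
    # Only the short symmetric key goes through RSA.
    wrapped_key: public_key.public_encrypt(
      session_key, OpenSSL::PKey::RSA::PKCS1_OAEP_PADDING
    ),
    iv: iv,
    auth_tag: cipher.auth_tag,
    ciphertext: ciphertext
  }
end

# Unwrap the symmetric key with the private key, then decrypt the data.
def hybrid_decrypt(private_key, envelope)
  session_key = private_key.private_decrypt(
    envelope[:wrapped_key], OpenSSL::PKey::RSA::PKCS1_OAEP_PADDING
  )
  cipher = OpenSSL::Cipher.new("aes-256-gcm")
  cipher.decrypt
  cipher.key = session_key
  cipher.iv = envelope[:iv]
  cipher.auth_tag = envelope[:auth_tag]
  cipher.auth_data = ""
  cipher.update(envelope[:ciphertext]) + cipher.final
end

recipient = OpenSSL::PKey::RSA.new(2048)
book = "It was the best of times, it was the worst of times. " * 10_000
envelope = hybrid_encrypt(recipient.public_key, book)

puts envelope[:wrapped_key].bytesize             # 256
puts hybrid_decrypt(recipient, envelope) == book # true
```

A new symmetric key is generated on every call, so two messages to the same recipient are never encrypted under the same key.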
So what do we do? Do we go back to using symmetric encryption for large things and use RSA only for short things? Yes, actually, that's what we do. In modern cryptography, we use both. What happens is that we use public key cryptography to encrypt just a very short number or key. What usually happens is that we generate some random key and use RSA to encrypt that, and use that as the basis for all subsequent symmetric cryptography.
This is how SSL or TLS works. When you first connect to amazon.com, the key exchange between your computer and Amazon's computer involves a randomly generated symmetric key. But in order to transmit that symmetric key without someone being able to sit in the middle, eavesdrop on it, and then decrypt everything, that symmetric key is transmitted by public-key cryptography. Once we've securely transmitted the symmetric key, that symmetric key can be used for the duration of that transmission. So your whole session on Amazon is going to be encrypted differently than your next session with Amazon, and that's what we want. We want to keep rotating the cryptographic key.
So currently we use public key cryptography, and there's a whole set of standards. There are a lot of different algorithms. RSA isn't the only one, though it is still the most popular. But we're already seeing adoption of some completely different algorithms. You may have heard of elliptic curve. This gets into mathematics that is over my head, but some elliptic curve algorithms have now been approved as part of the whole public key standard. This is just an attempt to keep ahead of the bad people, right? This whole thing is just an arms race.
And there's a looming issue coming with all of this, called quantum computing. Microsoft has a great podcast from Microsoft Research regarding security, and they've been talking about quantum cryptography and quantum computing. You may have heard that quantum computers are going to revolutionize all this, compared with the classical computers that we currently use, by taking advantage of quantum physics. The problem that this poses for public key cryptography is that all of the public-private key business hinges on the idea in mathematics that if you have a super large number, it's very hard to figure out the factors of that number. If I gave you a number like 12, you can figure out that the prime factors are two and three. If I gave you a number like a million and nine, it would take you some time. There's no known way that's much better than brute force for a classical computer to figure out what the prime factors of a very large number are. But under quantum computing, that task might be solved, and suddenly public key cryptography could be broken very quickly.
So it turns out that there are some implications coming within the next 10 years, maybe 20 years, for what this means for the state of cryptography. Some experts are saying that actually there's no need to worry. One of the downsides of quantum computing is that symmetric encryption actually remains difficult to break, of all things. So we may have to adjust the way that we do the encryption, but all is not lost. Others are saying that fears about quantum computing are unfounded for any time in the foreseeable future, and that by the time we do have one, we will also have figured out what kinds of things quantum computers cannot do very well, and that will be a good hint as to how we should encrypt and decrypt secret things.
And that's all I know about cryptography. I only have a couple of minutes left. So if you do have any questions, I'm going to hang out here. Any question at all, I would love to talk about it. Or, if you want to remain more anonymous, if you want to ask me over Twitter or DM me over Twitter, that's totally fine. Or, to be super private, you can always email me. I'm an adjunct at the University of Chicago and would love to talk about cryptography with you anytime. Thank you very much.