About the talk
For modern data center switches, the ability to react to current network conditions with minimum latency and maximum flexibility is important for managing increasingly dynamic networks. The traditional approach to implementing this type of behavior is through a control plane that is orders of magnitude slower than the speed at which typical data center congestion events occur. More recent alternatives like programmable switches can remember statistics about passing traffic and adjust behavior accordingly, but unfortunately, their capabilities severely limit what can be done.
In this paper, we present Mantis, a framework for implementing fine-grained reactive behavior on today's programmable switches with the help of a specialized reactive control plane architecture. Mantis is thus a combination of a language for specifying dynamic components of packet processing and an optimized, general, and safe control loop for implementing them. Mantis provides a simple-to-reason-about set of abstractions for users, and the Mantis control plane can react to changes in the network in tens of μs.
Hi, today I'll be talking about Mantis, a framework for implementing expressive and fine-grained reactive behaviors on top of today's programmable switches. This is joint work with John Sonchack and Vincent Liu. In networks, a common task is to react to current network conditions. Examples of this type of behavior include detecting failures and then rerouting around them, identifying malicious flows and then filtering based on security policies, and recognizing load imbalance and adjusting the traffic distribution. All of these share the same
structure: measuring current network conditions and then updating the network devices' behaviors to match them. In data centers, these reactions need to be fast, often at the sub-RTT level. Typically, there are two ways to implement this type of reactive behavior. The traditional one is through the control plane, using conventional controllers. These approaches are flexible but typically slow, orders of magnitude too slow to capture transient network events. The other general approach is to
directly integrate reactions into the data plane hardware. These are fast but limited by current hardware capabilities. Now, you might think programmable switches don't fit into this category because they are reconfigurable. Yet, unfortunately, they still have a well-known set of limitations. For example, these constraints include limited operations and branching in actions and the inability to manipulate match-action table entries from the data plane, and so on. Over the years, previous work has proposed several workarounds, but all of
these workarounds have drawbacks. For example, one can leverage recirculation to achieve Turing completeness, but that can significantly reduce the usable throughput of the switch. In the end, these workarounds are non-trivial to design and also consume extensive switch resources, even when they are possible. In this work, we explore the question: can we enable reactions that both capture microsecond-level events and provide the most flexibility? Our approach is to push the reaction loop as close to the switch ASIC as possible.
Meanwhile, we co-design the data plane program for fine-grained malleability and ease of use. Thus, we present Mantis. At the center of Mantis is a reaction loop, where the Mantis control plane polls measurements and updates portions of the data plane at a granularity of tens of microseconds. The reaction logic of the control plane can be arbitrary C code. Specifying this behavior is the P4R language, a simple extension to P4 that enables users to specify which portions of the P4 program should be malleable and
to define the reaction logic. The Mantis compiler then transforms the P4R code into P4 and C code, ensuring runtime reconfigurability and serializability of the reactions. The two core abstractions underlying Mantis are malleable entities and reactions. For the rest of the presentation, I will walk through each component of Mantis, taking a top-down approach. First, I will introduce how to express reactive behaviors with P4R using a concrete example. Let's start with a simple P4 code snippet. Here is an action and a
table that, based on the destination address, assigns a priority level. What if we want to reconfigure this priority based on queue depths? First, we declare a malleable entity, for example, a malleable value that takes a 16-bit value and gets initialized to 1. Next, we replace the constant value with a reference to the defined malleable value. The defined reaction function specifies the data plane metrics to poll, the arbitrary C code to compute control logic on a general-purpose CPU, and the reconfiguration
of the malleable entities with a simple syntax. The above is just a simple example. It turns out that, besides malleable values, one can also dynamically reconfigure other P4 objects such as fields and tables. Besides registers as reaction arguments, one can also specify other data plane objects, such as header or metadata fields and malleable fields, and use them as if they were C variables or arrays. Now, I'm going to talk about how to translate P4R into something that runs on today's switches. One of the primary goals of Mantis's
translation is to create a P4 program that is dynamically reconfigurable, that is, without interrupting the pipeline or losing state. Again, let's look at the previous example. To accomplish this, Mantis first replaces the malleable value with a concrete P4 object, a P4 metadata field. Then it instantiates a P4 init table at the beginning of each packet processing pipeline, with actions to set its value. Configuring it this way means that by changing a single table entry, specifically that of the init table,
Mantis can reconfigure all uses of a malleable value in the entire pipeline. The transformation of other malleable entities requires further steps, with more details in the paper. Now that we've seen how to translate a simple P4R program with a single operation, I'm going to talk about more complex operations and the concurrency issues that result. To see why concurrency potentially causes issues, consider a reaction function that polls the source and destination addresses of a packet. A user might reasonably expect the source and
destination addresses in a single iteration of a reaction function to come from the same packet, P1. However, it can happen that the first poll gets the value of the source from packet P1, while the second poll gets the value of the destination from another packet, P2. Such non-isolated measurements break the semantics of the reaction logic. Similar problems arise when updating multiple entities in the same reaction. To address the issue, Mantis provides per-pipeline, per-reaction serializable isolation between measurements, updates, and
packet processing. Note that the isolation I'm talking about here is the isolation of ACID, which is to say that all three types of transactions appear to execute in some sequential order, though in fact they are concurrent. This makes it easier to reason about the reactive behaviors. The approach to ensuring isolation is in the extended video and the paper. Let's talk about the Mantis control plane. The Mantis control plane runs on the switch's onboard CPUs, which are already used for tasks such as routing
and configuration. However, these interactions are traditionally assumed to be on a slow path. Instead, the Mantis control plane has a reaction-centric design for repeated reads and writes to the switch ASIC. It splits operations into a prologue phase and a dialogue phase. In the prologue, we precompute as much as we can, and the dialogue implements the reaction. This design, along with aggressive optimizations, allows Mantis to execute reactions at granularities on the same order of magnitude as the PCIe latency of the underlying system. We implemented a Mantis prototype
on a Tofino switch. When evaluating Mantis, we found that it achieves fast reaction times of tens of microseconds. Besides, Mantis incurs low data plane and CPU overhead and supports a wide range of applications. More details are in the extended video and the paper. To summarize, Mantis introduces fine-grained reactions to network statistics as first-class citizens. Mantis provides a simple-to-reason-about set of reaction abstractions with a P4R interface and enables a wide range of reactive behaviors, without
penalizing line-rate packet processing speeds. Thank you so much for your attention.

Thank you for a really cool system and a great talk. I just wanted to point out one thing that I neglected to mention when introducing this paper, as well as the first paper in this session: it earned all three badges, so the artifact is available, functional, and its results were reproduced by others, which is great. I also love that the organization is [inaudible]. So, one question from the Q&A already: what are the trade-offs of using Mantis,
in particular with respect to pipeline resources? How much extra does it cost to use the system?

OK, thank you for the question. I guess I will answer the second question first. It's true that we have a P4 init table at the beginning of the pipeline. However, it doesn't necessarily add an extra stage, because the compiler analyzes the dependencies. And only the following
tables that depend on that configuration will be allocated to the later stages. But it is true that there is an extra table cost for that. As for the other trade-offs, probably one cost of Mantis is an extra busy core on the CPU. However, that can be traded off, which is shown in our extended video and the paper as well. Another one is the memory cost. For example, the point is that probably the most expensive one
is to use the malleable field. As we mention in the paper, it still scales linearly with respect to the number of usages. Yeah, thank you.

Great. So another question, just long-term about the language: I see you have some language extensions to P4. Are you planning to contribute these to the P4 spec, or do you view P4R as its own dialect that will live on independently of the language?

So the question is, are we planning to extend
P4? We haven't actually taken that step, to be honest. But we believe that this is a nice extension when the user or programmer wants to implement reactive behavior.

So, another question, from a participant at Alibaba: interesting work, and they are curious about scalability, not from a pipeline or performance perspective, but in terms of how much code you would write without Mantis and how much you write with Mantis, using your compiler to generate the P4 code.
We have a use case table in the paper that kind of benchmarks the lines of code written in P4R and the lines of code generated in C and P4. In some way, the amount of increase depends on the application. It could be about 2x, for example.

Thank you for your presentation, a very interesting talk. Let me ask the first question. You say you use malleable values as the parameters to react to the network.
Could you give us some hints about who is the one that can change the malleable values in your system?

Oh, so the question is how the malleable entities are reconfigured? The short answer is that the Mantis agent takes care of it. Basically, the user specifies the malleable entities in the data plane, in the P4R code, and also a reaction function. That is then handed to the Mantis agent, which executes it, including the
reconfiguration of the malleable entities.

I see. So is it possible to have multiple values working together to support more flexible configuration?

Yes, definitely. That's actually one benefit of using Mantis: the user doesn't need to think about the synchronized communication between the control and data planes, because that is taken care of by the Mantis compiler, which generates the serializable P4 and C code.

About the malleable values: have you thought about the case where different applications want to modify the
value at the same time? There may be some complications between different applications wanting to modify the value simultaneously.

So you're saying there are multiple applications that, maybe at the same time, will conflict with each other? That comes up when you have multiple operations at the same time, and what Mantis provides for that is serializable isolation. It gives the user the expectation that they execute in some sequential order, so that makes it easier to reason about reactive behavior.
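
The guarantee described in this answer can be illustrated with a small Python model. This is a minimal sketch under stated assumptions, not Mantis's implementation: the talk defers the actual isolation mechanism to the paper, so the sketch substitutes a generic double-buffering scheme with a version bit. `MalleablePipeline`, `stage_update`, `commit`, and the `priority`/`port` values are all hypothetical names chosen for illustration.

```python
class MalleablePipeline:
    """Toy model (an assumption, not Mantis code): malleable values live in
    two slots, and a single version bit selects the slot packets read."""

    def __init__(self, **initial):
        self._slots = [dict(initial), dict(initial)]
        self._active = 0  # the "version bit"

    def process_packet(self, packet):
        # Init step: snapshot every malleable value once, at the start of
        # the pipeline. Later stages would read only this snapshot.
        meta = dict(self._slots[self._active])
        return {**packet, **meta}

    def stage_update(self, name, value):
        # Write into the inactive slot, which packets never read.
        self._slots[1 - self._active][name] = value

    def commit(self):
        # One write flips the version bit and publishes all staged values.
        self._active = 1 - self._active
        # Re-sync the now-inactive slot so the next update starts from
        # the current values.
        self._slots[1 - self._active] = dict(self._slots[self._active])


pipe = MalleablePipeline(priority=1, port=4)
before = pipe.process_packet({"dst": "10.0.0.1"})
pipe.stage_update("priority", 7)
during = pipe.process_packet({"dst": "10.0.0.2"})  # arrives mid-update
pipe.stage_update("port", 9)
pipe.commit()
after = pipe.process_packet({"dst": "10.0.0.3"})
print(before, during, after, sep="\n")
```

In this model, the packet that arrives between the two staged writes still sees the old `priority` and the old `port` together, and the packet after `commit` sees both new values, so every packet behaves as if the update ran entirely before or entirely after it.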
Oh, sure. Actually, in the paper we have several example use cases, such as failover and security applications, with more details in the paper.

Someone is asking: what is the CPU usage in your experiments?

Currently, if you optimize for latency, the CPU cost is one busy core. However, that can be traded off; in that case you achieve about 15 microseconds for about 20% utilization.
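
The latency-for-CPU trade mentioned in this answer follows from simple arithmetic, sketched here in Python. It assumes a simplified model that is not from the talk: each reaction iteration costs a fixed `work_us` of CPU time, and the agent idles for `pause_us` between iterations. The function name and all numbers are hypothetical and are not Mantis measurements.

```python
def paced_loop(work_us, pause_us):
    """Return (reaction period in microseconds, fraction of one core used)
    for a loop that works for work_us and then idles for pause_us."""
    if work_us <= 0 or pause_us < 0:
        raise ValueError("work_us must be positive and pause_us non-negative")
    period_us = work_us + pause_us
    return period_us, work_us / period_us


# Busy loop: shortest reaction period, but a whole core is consumed.
busy_period, busy_util = paced_loop(work_us=10.0, pause_us=0.0)
# Paced loop: a longer reaction period in exchange for lower CPU use.
paced_period, paced_util = paced_loop(work_us=10.0, pause_us=40.0)

print(f"busy:  one reaction every {busy_period:.0f} us, {busy_util:.0%} of a core")
print(f"paced: one reaction every {paced_period:.0f} us, {paced_util:.0%} of a core")
```

Under this model, CPU use falls in proportion to the idle time added, which is the shape of the trade-off the speaker describes.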