About the talk
Cloud services are deployed in datacenters connected through high-bandwidth Wide Area Networks (WANs). We find that WAN traffic negatively impacts the performance of datacenter traffic, increasing tail latency by 2.5x, despite its small bandwidth demand. This behavior is caused by the long round-trip time (RTT) for WAN traffic, combined with limited buffering in datacenter switches. The long WAN RTT forces datacenter traffic to take the full burden of reacting to congestion. Furthermore, datacenter traffic changes on a faster time-scale than the WAN RTT, making it difficult for WAN congestion control to estimate available bandwidth accurately.
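The mismatch between a long WAN RTT and limited switch buffering can be made concrete with a bandwidth-delay product (BDP) calculation. The sketch below is illustrative only: the link rate, both RTTs, and the switch buffer size are assumed values chosen for the arithmetic, not measurements from the talk.

```python
# Illustrative BDP arithmetic. All constants are ASSUMED values, not from the talk.

def bdp_bytes(link_rate_bps: int, rtt_us: int) -> int:
    """Bytes in flight needed to fill a link for one RTT (rate x delay)."""
    # bits/s * microseconds / (8 bits per byte * 1e6 us per s)
    return link_rate_bps * rtt_us // 8_000_000

LINK_RATE_BPS = 100_000_000_000   # assumed 100 Gb/s link
WAN_RTT_US = 20_000               # assumed 20 ms WAN round-trip time
DC_RTT_US = 50                    # assumed 50 us datacenter round-trip time
SWITCH_BUFFER_BYTES = 32_000_000  # assumed 32 MB of shared switch buffer

wan_bdp = bdp_bytes(LINK_RATE_BPS, WAN_RTT_US)  # 250,000,000 bytes
dc_bdp = bdp_bytes(LINK_RATE_BPS, DC_RTT_US)    # 625,000 bytes

print(f"WAN BDP: {wan_bdp} bytes")
print(f"Datacenter BDP: {dc_bdp} bytes")
print(f"WAN BDP fits in switch buffer: {wan_bdp <= SWITCH_BUFFER_BYTES}")
```

Under these assumed numbers, one WAN RTT of in-flight data is several times larger than the whole switch buffer, while one datacenter RTT of data fits easily. That is the sense in which a switch cannot absorb WAN traffic that reacts to congestion only after a full WAN RTT.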
We present Annulus, a congestion control scheme that relies on two control loops to address these challenges. One control loop leverages existing congestion control algorithms for bottlenecks where there is only one type of traffic (i.e., WAN or datacenter). The other loop handles bottlenecks shared between WAN and datacenter traffic near the traffic source, using direct feedback from the bottleneck. We implement Annulus on a testbed and in simulation. Compared to baselines using BBR for WAN congestion control and DCTCP or DCQCN for datacenter congestion control, Annulus increases bottleneck utilization by 10% and lowers datacenter flow completion time by 1.3-3.5x.
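The two-control-loop idea can be sketched as a sender that tracks two rates and sends at the smaller one. This is a minimal sketch under stated assumptions, not the Annulus implementation: the class name `TwoLoopSender`, the multiplicative-decrease rule, the gain and recovery constants, and the choice to combine the loops with `min` are all hypothetical, and the end-to-end loop (for example BBR or DCTCP) is represented only by a rate supplied from outside.

```python
from dataclasses import dataclass


@dataclass
class TwoLoopSender:
    """Hypothetical per-flow sender state with two rate-control loops."""
    line_rate_gbps: float
    e2e_rate_gbps: float            # set by an existing end-to-end algorithm
    near_source_rate_gbps: float    # set by direct feedback from a nearby switch
    decrease_gain: float = 0.5      # assumed gain for multiplicative decrease
    recovery_step_gbps: float = 1.0 # assumed additive recovery per quiet interval

    def on_e2e_update(self, rate_gbps: float) -> None:
        # The existing algorithm reports a new rate; clamp it to the line rate.
        self.e2e_rate_gbps = min(max(rate_gbps, 0.0), self.line_rate_gbps)

    def on_direct_feedback(self, severity: float) -> None:
        # severity in [0, 1] stands for the extent of congestion reported
        # by the bottleneck switch for this flow (assumed encoding).
        severity = min(max(severity, 0.0), 1.0)
        self.near_source_rate_gbps *= 1.0 - self.decrease_gain * severity

    def on_quiet_interval(self) -> None:
        # No feedback arrived for a while: recover additively up to line rate.
        self.near_source_rate_gbps = min(
            self.near_source_rate_gbps + self.recovery_step_gbps,
            self.line_rate_gbps,
        )

    def sending_rate_gbps(self) -> float:
        # Assumed combination rule: obey whichever loop is more restrictive.
        return min(self.e2e_rate_gbps, self.near_source_rate_gbps)


sender = TwoLoopSender(line_rate_gbps=100.0, e2e_rate_gbps=80.0,
                       near_source_rate_gbps=100.0)
rate_before = sender.sending_rate_gbps()      # limited by the end-to-end loop
sender.on_direct_feedback(severity=1.0)       # near-source switch reports congestion
rate_after_feedback = sender.sending_rate_gbps()
sender.on_quiet_interval()
rate_after_recovery = sender.sending_rate_gbps()
```

The point of the sketch is the timing difference: `on_direct_feedback` can fire within a datacenter-scale delay because the message comes straight from a switch near the source, whereas `on_e2e_update` changes only as fast as the end-to-end algorithm's own feedback loop allows.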
Ahmed Saeed is a postdoctoral associate at MIT working with Prof. Mohammad Alizadeh. He completed his PhD in August 2019 at Georgia Tech, where he was advised by Prof. Mostafa Ammar and Prof. Ellen Zegura. During his PhD, he interned several times at Google, where he collaborated with Nandita Dukkipati and Amin Vahdat. He received his bachelor's degree in Computer and Systems Engineering from Alexandria University in 2010.
Thank you for the introduction. This talk is about our work on datacenter and WAN traffic. This talk overlaps with a more detailed talk available online, so I will cover just enough to get you familiar with the work. Then I'll answer the three most frequently asked questions about Annulus. Let me start by providing some background. Cloud service providers are investing in high-capacity WANs to connect their datacenters, [unclear] how rapidly the network has been growing. The growth
in infrastructure is motivated by the exponential growth in traffic demand. This figure shows one of the results reported by Google in a paper a couple of years ago. [unclear] the same rack or even the same machine, as well as high-bandwidth [unclear]. The traffic type is determined by the sender, using a flag that is based on the destination of the flow.
WAN traffic uses WAN congestion control. For the past decade, we've been developing WAN and datacenter congestion control algorithms separately. [unclear] and deployment of new WAN congestion control. Recent proposals for WAN congestion control include [unclear]. In this talk, we are concerned with the mixture between
WAN and datacenter traffic, where each type of traffic is using its own congestion control algorithm. [unclear] Now, let's look at data from one such cluster that was collected over a period of two days. The figure shows the WAN traffic exiting the cluster, along with the median and the 99th percentile latency of all datacenter traffic in the cluster. It shows that as WAN traffic demand increases, the tail latency of datacenter traffic increases. We also examine the drop rate
at ToR switches, which is the first point of oversubscription in the network. [unclear] we see that the drop rate trend also follows WAN demand. This means that WAN demand can significantly impact and degrade the performance of datacenter traffic. Our main finding is that the reaction of WAN traffic to congestion is too slow compared to [unclear], resulting in performance degradation for both WAN and datacenter traffic. You can refer to the longer video
for the explanation of the implications of this. Now, I'll introduce our scheme, Annulus, [unclear]. Annulus answers two questions. The first question is: how do you efficiently manage bottlenecks where datacenter and WAN traffic compete? The second question is: how do you answer the first question while handling all the other types of bottlenecks? Our main idea to answer the first question is to tackle the root cause of performance impairment when datacenter and WAN traffic compete, that is, the slow
reaction time of WAN traffic, by providing it with fast feedback. We answer the second question by leveraging existing algorithms. More concretely, Annulus is a new congestion control scheme that addresses two types of congestion. A near-source control loop is used for near-source bottlenecks shared between WAN and datacenter traffic. On congestion at the bottleneck, the switch generates a message that is sent to the source of the traffic and handled by the near-source control loop. [unclear] existing congestion control algorithms for WAN and datacenter traffic. These algorithms allocate
a pacing rate based on some congestion signal. Annulus adds a near-source control loop, which computes another pacing rate based on its own calculations. The rate limiter applies the minimum pacing rate. [unclear] I refer you to the paper or the longer video for more details on Annulus. Now, the first question is [unclear]. To answer this question, we considered an approach of dedicating buffers to the different types of traffic. Aside from the waste of dedicating [unclear] resources, isolation is not the
panacea. While it can help improve the performance of datacenter traffic, it does not solve the problems faced by WAN traffic. There isn't enough buffer space in a datacenter switch to absorb a WAN BDP. Recall that we are concerned here with high-bandwidth WAN flows capable of utilizing terabits per second in aggregate. This means that isolation leads to excessive drops for WAN flows due to competition with [unclear]. This is made worse by the fact that WAN traffic has to share bandwidth with datacenter traffic,
and that share can change hundreds of times within a single WAN RTT. [unclear] the issue of packet drops caused by lack of buffer space. This figure shows the results from our simulation. It shows the normalized [unclear] of datacenter traffic and the normalized average WAN [unclear]. The second question is [unclear]. The short answer is that Annulus targets near-source bottlenecks only and does not handle bottlenecks near the receiver. The longer answer is that the main idea of Annulus is to cut the feedback delay of WAN traffic by sending the feedback directly
from the bottleneck switch to the traffic source. This does not apply to near-receiver bottlenecks, where the feedback delay is still around the WAN RTT. [unclear] Of course, near-receiver bottlenecks exist, and addressing them would require a different approach than Annulus. I discuss such an approach in the detailed talk available online. The last question is: how did we design the near-source control loop? How did we make the design choices? [unclear] To answer this question, I'll first give an overview
of how the near-source control loop works. On congestion, the switch generates a congestion message that is relayed to the sender. The message indicates the problematic flow and the extent of the congestion. The sender then reduces the transmission rate of the problematic flow based on the extent of the congestion. As you can see, this design is fairly generic. Its key requirement is that feedback has to be delivered directly from the bottleneck to the traffic source. You can probably think of many ways to realize that. [unclear] if the
limited information is delivered directly from the bottleneck switch to the traffic source. For practicality, we rely on a standard that is already implemented in [unclear] hardware switches to provide the signaling we need for the near-source control loop. In conclusion, in this work we identified a new problem that arises when WAN traffic competes with datacenter traffic. This makes the case for developing better congestion signals that reduce reaction time and improve the performance of WAN traffic [unclear]
[unclear] multi-control-loop congestion control, where flows go through significantly different types of bottlenecks. Thank you, and I'll take questions.
Thank you, Ahmed. So, while I wait for questions from people... actually, now there's a question. I'm curious about this: Annulus sounds conceptually similar to the multi-domain congestion control schemes for mobile networks, where people used a dual control loop and fast feedback from the base station. What are the key differences between the two scenarios?
[unclear] to
say that the way those control algorithms work is by relying on connection termination, and you can correct me if I'm wrong on this. The difference between Annulus and approaches that rely on connection termination is that Annulus does not terminate the connection. The decisions are all still made at the source. You just get the feedback for the feedback loop at the source. So that's the main difference.
So, another question, coming from tongli, is about TCP splitting. [unclear]
[unclear] That's a great question, and it's related to the previous question. Any approach that requires dividing the state of the connection is not really feasible [unclear]. It's also not feasible within the low-latency architectures of datacenter networks. Just think about the topology of a datacenter network, and adding such middleboxes [unclear] topology [unclear] how we keep the latency low. So the main idea
for Annulus is to try to keep the latency [unclear] as low as possible, while leveraging existing solutions, not changing the topology, and not changing the hardware too much.
Good. There's a recent question, and I'm trying to parse it, but let me read it verbatim. The question is: how can you determine the causality between high WAN traffic and increased latency? High correlation between them can sometimes be misleading.
This is a great question. And this is actually
why the first version of the paper was rejected, on this very question. We added a whole section on it, so if you want the detailed answer, you can just read that section. I'll give you the punchline: there are two reasons why this is not just correlation. The first one is that there's not enough buffer space for a WAN BDP, and that's sort of natural, even for [unclear], which
is like a special case. The second reason is that datacenter traffic changes the available bandwidth for WAN traffic at very high frequencies. Consider that the time [unclear] is on the order of [unclear], and that's very small compared to the WAN RTT. This means that [unclear], and all of those packets have to be buffered somewhere, and there's not enough buffer for that. We have a lot of
simulations where we try to isolate [unclear], and this is a problem that only happens when high-bandwidth WAN traffic competes with datacenter traffic.
Thank you. That was very interesting. We have a couple of questions coming in on Slack. Does Annulus require switch modifications?
[unclear] we don't need to change anything about the switches. But if we need to make the signal more expressive, or we need to change the
logic of the [unclear], then that might require [unclear].
[unclear] and then you wouldn't need to worry about [unclear].
Well, a credit-based system still requires some information arriving from the receiver to the sender, [unclear] at a fairly large RTT. I don't have an answer for this. I know that some [unclear] switches support this [unclear] by default, but I'm not sure [unclear] it is available.
So I'm not sure if the switches that would be available in a WAN setting would support such a protocol [unclear].
I think there's another question coming up, and in the meantime, I had a question. You mentioned how, with strict isolation, the switches don't have enough buffer capacity. But let's say that the buffer was shared across the different traffic types. [unclear] we talked
about [unclear], does the issue of bad interaction between the two go away?
[unclear] It's not only about sharing the buffer space; it's also about sharing bandwidth. The problem happens when datacenter traffic changes within a single WAN RTT. This means that within a single WAN RTT, the bandwidth available to a WAN flow can change, and that means that you get packets piling up in the datacenter switch when the, when the
[unclear], no matter how much buffer you have. The only case where this issue goes away is if we have very large buffers in datacenter switches [unclear].
Would combining [unclear] with Annulus provide better sending rate estimation? [unclear]
That's a great question. I think there are probably some interesting research questions there, [unclear] the last
[unclear], where I think if a single flow is going through significantly different types of bottlenecks, we might actually need not just two-control-loop schemes but multi-control-loop schemes, where each control loop gets its own feedback and acts on the bottleneck [unclear]. I think there are some interesting questions about stability and control.
Great, thanks. So I think we can continue on Slack. So thank you. Interesting work and