SIGCOMM 2020
August 12, 2020, Online, New York, NY, USA
VTrace: Automatic Diagnostic System for Persistent Packet Loss

About the talk

Persistent packet loss in the cloud-scale overlay network severely compromises tenant experiences. Cloud providers are keen to automatically and quickly determine the root cause of such problems. However, existing work is either designed for the physical network or insufficient to present the concrete reason of packet loss. In this paper, we propose to record and analyze the on-site forwarding condition of packets during packet-level tracing. The cloud-scale overlay network presents great challenges to achieve this goal with its high network complexity, multi-tenant nature, and diversity of root causes. To address these challenges, we present VTrace, an automatic diagnostic system for persistent packet loss over the cloud-scale overlay network. Utilizing the "fast path-slow path" structure of virtual forwarding devices (VFDs), e.g., vSwitches, VTrace installs several "coloring, matching and logging" rules in VFDs to selectively track the packets of interest and inspect them in depth. The detailed forwarding situation at each hop is logged and then assembled to perform analysis with an efficient path reconstruction scheme. Experiments are conducted to demonstrate VTrace's low overhead and quick responsiveness. We share experiences of how VTrace efficiently resolves persistent packet loss issues after deploying it in Alibaba Cloud for over 20 months.
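
The "coloring, matching and logging" idea described above can be pictured with a small simulation. This is only a minimal sketch under stated assumptions, not VTrace's implementation: the `VFD` and `Packet` classes, the `TRACE_MARK` value, the `install_coloring_rule` method, and the log fields are all hypothetical names chosen for illustration. The first hop colors a bounded number of packets of the flow of interest, and every hop logs its forwarding verdict only for colored packets.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# (src_ip, dst_ip, src_port, dst_port, protocol)
FiveTuple = Tuple[str, str, int, int, str]

TRACE_MARK = 1  # hypothetical "color" value carried in a packet header field


@dataclass
class Packet:
    flow: FiveTuple
    mark: int = 0  # 0 means "not colored"


@dataclass
class VFD:
    """Toy virtual forwarding device (illustrative only)."""
    name: str
    drop_reason: Optional[str] = None       # simulated fault, e.g. "ACL deny"
    traced_flow: Optional[FiveTuple] = None
    packet_count: int = 0                   # remaining packets to color
    logs: List[dict] = field(default_factory=list)

    def install_coloring_rule(self, flow: FiveTuple, packet_count: int) -> None:
        """Ask this first-hop VFD to color up to packet_count packets of flow."""
        self.traced_flow = flow
        self.packet_count = packet_count

    def forward(self, pkt: Packet) -> bool:
        """Process one packet; return True if forwarded, False if dropped."""
        # Coloring: mark packets of interest while the budget lasts.
        if pkt.mark == 0 and pkt.flow == self.traced_flow and self.packet_count > 0:
            pkt.mark = TRACE_MARK
            self.packet_count -= 1
        # Matching and logging: only colored packets are inspected and logged.
        if pkt.mark == TRACE_MARK:
            self.logs.append({
                "vfd": self.name,
                "flow": pkt.flow,
                "verdict": self.drop_reason or "forwarded",
            })
        return self.drop_reason is None


def send(path: List[VFD], pkt: Packet) -> bool:
    """Forward pkt hop by hop; stop at the first VFD that drops it."""
    for vfd in path:
        if not vfd.forward(pkt):
            return False
    return True


def demo() -> List[dict]:
    """Send five packets through a faulty path while tracing only two of them."""
    flow: FiveTuple = ("10.0.0.1", "10.0.1.2", 40000, 80, "tcp")
    path = [VFD("vswitch-a"), VFD("vrouter", drop_reason="ACL deny"), VFD("vswitch-b")]
    path[0].install_coloring_rule(flow, packet_count=2)
    for _ in range(5):
        send(path, Packet(flow))
    return [entry for vfd in path for entry in vfd.logs]
```

Calling `demo()` yields log entries only for the two colored packets, and the entries from the simulated vRouter carry the drop reason, while the three uncolored packets leave no trace at any hop.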


00:12 Packet losses in the overlay network

00:59 Automatic diagnosis of root causes is desired

02:12 Design goals and requirements for VTrace

02:46 Related work

02:59 Solution - Overview

04:50 Our design: flexible tracing + in-depth process

06:01 Our design: path reconstruction

06:48 Evaluation - impact on VFDs

07:34 Evaluation - diagnostic time

09:23 Takeaways

10:02 Questions and answers

About speaker

Chongrong Fang
PhD Student at Zhejiang University

Hi all. Today I'm going to share our work, VTrace, which was carried out with Alibaba Cloud. Consider the overlay network of cloud computing systems and inter-VPC communication. Tenants can configure the virtual forwarding devices (VFDs), that is, vSwitches and vRouters, in their own VPC networks, for example through ACL configurations. Besides, cloud service providers can also configure and update these devices. The actions of both tenants and cloud service providers can lead to packet losses due to faults at the VFDs.

These packet loss issues can be classified into persistent and transient packet loss. Here we focus on persistent packet losses since they have more impact. But what happens when packet loss occurs in the overlay network? For example, here is a fault at a vRouter, and a tenant experiences a network problem. The tenant submits a diagnostic request, which includes information on the suspected flows, and the operations engineers of the cloud service provider used to have no choice but to manually check the VFDs traversed by those flows. As a result, for example, the engineers reply that the vRouter is dropping packets. Such a reply is not satisfying to tenants, who want more detailed information. The network experts then check further, analyze the root cause, and finally determine that it was a misconfiguration of ACLs. Such a root cause is useful to quickly troubleshoot network problems. However, there may be hundreds or even more such diagnostic requests every day. Therefore, an automatic system is indeed needed.

This is VTrace. VTrace's goals are to automatically locate the VFD that drops packets and to present the root cause of the packet loss. To achieve these goals, VTrace needs to extract the on-site forwarding condition of packets, which reflects the root cause, with an affordable performance loss. Also, tenants are not required to be involved. The key challenge of meeting these goals and requirements is how to balance the trade-off between root cause diagnosis and a low impact on forwarding performance. There is a vast body of prior work related to our research focus. However, it is not sufficient to meet all these requirements.

So we design VTrace to automatically diagnose the root causes of persistent packet loss. Our solution is mainly composed of three essential steps: packet tracing, in-depth processing, and path reconstruction. First, for packet tracing and in-depth processing, a naive idea is to mirror packets at the VFDs and send them to dedicated servers or log systems for further analysis. However, such a design lacks an in-depth check of packets and cannot figure out the on-site forwarding conditions. So can we have the VFDs perform in-depth processing for packets? Actually, the forwarding module in VFDs includes the fast path and the slow path, and the slow path can be used to process packets in depth. To this end, we place deep probes in the VFD's slow path to extract the on-site forwarding condition of packets. In production, we have deployed more than 400 deep probes to cover possible packet loss causes.

However, such deep processing degrades the performance of packet processing, which is not acceptable. We must balance the trade-off between root cause diagnosis and performance loss. Since this work concerns persistent packet loss, checking a subset of the target packets may be enough for root cause diagnosis. According to our deployment experience, checking tens to hundreds of target packets is usually enough. In this way, the impact on forwarding performance can be greatly reduced. But how can we realize such flexible tracing and in-depth processing?

Specifically, for the first VFD, the controller tells it the number of packets to be traced with a parameter, the packet count. If the packet count is not zero, the VFD marks the packet by setting the DSCP value, and then performs the deep check with deep probes. At last, the VFD logs the information and the root causes locally. The log includes necessary fields, such as the five-tuple, that are useful for the later path reconstruction.

The following VFDs perform the deep processing for packets that carry the mark to get the on-site root causes, and then record this information into logs. It should be noted that such a design leverages the basic functionality of VFDs and only occupies a small fraction of the rule table.

With these logs generated, the next step is to reconstruct the paths of the packets. A straightforward way to do path reconstruction is to combine these logs with data from the topology. However, this is not practical, since the topology is dynamic. Besides, the network topology may not be available sometimes. Thus topology information is sufficient but not necessary. Grouping by packet ID and sorting by time is not feasible either, since different servers are out of synchronization. Instead, VTrace performs path reconstruction based on the information contained in the log data, such as the mark for the ingress, that is, the first VFD, and the information of servers.
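
The reconstruction idea in this part of the talk, ordering per-hop logs by information inside the logs rather than by timestamps from unsynchronized servers, can be sketched as follows. This is a hedged illustration, not the scheme used in VTrace: the record fields `ingress`, `next_hop`, `verdict`, and `ts` are assumptions made for the example, since the talk does not spell out the log schema.

```python
from typing import Dict, List, Tuple


def reconstruct_path(records: List[dict]) -> List[Tuple[str, str]]:
    """Chain per-hop log records into a path without relying on timestamps.

    Each record is assumed (hypothetically) to carry:
      vfd      - name of the device that wrote the record
      ingress  - True only on the first VFD of the traced path
      next_hop - name of the device the packet was forwarded to
      verdict  - "forwarded" or a drop reason
    """
    by_vfd: Dict[str, dict] = {r["vfd"]: r for r in records}
    ingress = [r for r in records if r.get("ingress")]
    if len(ingress) != 1:
        raise ValueError("expected exactly one ingress record")

    path: List[Tuple[str, str]] = []
    visited = set()
    current = ingress[0]
    # Follow next_hop links from the ingress; stop on a drop, a gap, or a loop.
    while current is not None and current["vfd"] not in visited:
        visited.add(current["vfd"])
        path.append((current["vfd"], current["verdict"]))
        if current["verdict"] != "forwarded":
            break
        current = by_vfd.get(current.get("next_hop"))
    return path


# Example records: the clock of "vswitch-a" runs ahead, so sorting by "ts"
# would wrongly place the vRouter before the ingress vSwitch.
RECORDS = [
    {"vfd": "vrouter", "ts": 100.0, "ingress": False,
     "next_hop": "vswitch-b", "verdict": "ACL deny"},
    {"vfd": "vswitch-a", "ts": 250.0, "ingress": True,
     "next_hop": "vrouter", "verdict": "forwarded"},
]
```

Here `reconstruct_path(RECORDS)` starts from the record flagged as ingress and follows the links, so the skewed `ts` values never influence the order of hops.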

We carried out several tests on vRouters and vSwitches on a testbed whose setup is the same as the production network. For vRouters, we found that there is almost no drop in the forwarding rate for packets over 256 bytes, while packets smaller than 128 bytes see a limited drop in forwarding rate. This is acceptable, since packets in the traffic usually have sizes larger than 256 bytes. For vSwitches, there is at most a 1% drop in forwarding rate, which means that VTrace's impact on the forwarding rate is acceptable.

Now, with VTrace, when a diagnostic request comes, tenants get quick and useful feedback in a few minutes. Is VTrace deployable? The answer is yes. We have deployed VTrace in Alibaba Cloud for over 20 months.

During the deployment, we obtained some interesting findings from the use of VTrace. First, we found that most of the diagnostic requests are false positives; only a few of them have packet loss issues. Besides, it can be seen that there is a surge in the number of VTrace tasks being conducted. This is because of the online shopping carnivals from early September to the end of December, when tenants reserve resources and run pressure tests for the upcoming huge business. As a result, these operations lead to an increase in VTrace tasks and packet loss issues.

Another finding is that vSwitch faults are usually more common than vRouter faults. This is because we validate vRouters before deployment. We then look at the root causes of vSwitch and vRouter faults and classify them into five types. The proportion of each type is provided, and it indicates the same thing: vSwitches raise more problems than vRouters. We also found that tenant-side causes, like tenant security policy block and tenant configuration error, are the major reasons for packet loss. We also found unknown causes, which indicates the necessity of maintaining and updating the repository of deep probes.

This work demonstrates the effectiveness of the coloring, matching and logging idea in automatically diagnosing persistent packet loss. VTrace has been deployed in Alibaba Cloud, which spans multiple data centers, and has been used many times. In the future, we are going to diagnose transient packet loss in the overlay network. Besides, diagnosing packet drops across both the physical and overlay networks is also an interesting and important problem. This is our presentation. Thank you very much.

Thank you. While we are waiting for questions to pour in, let me start the discussion. Did you need to put some safeguards in place, or was this never an issue for you?

Okay, thank you. VTrace has proven to have an affordable impact on forwarding performance. Currently, VTrace tasks can only be issued by cloud operations engineers, so there will not be too much VTrace load. However, in the future, if the VTrace APIs are open to tenants, they may want to use it whenever they encounter a problem. We can overcome this problem by limiting the number of VTrace tasks started at the same time. I hope I have answered the question.

Most diagnostic queries are false positives. The question is, do you have any insights on why these false positives happen and what they are?

Okay. In my understanding, most of the VTrace diagnostic requests come from tenants, and they do not always understand the network. Sometimes they think they have a network problem, but when we use VTrace, it returns the result that the network is good.

Sorry, so these are false positives. What causes the false positives?

I think that is because sometimes they are...

Let's take it one step at a time.

How many issues do you see? You said hundreds of issues. How many of them go through VTrace? It depends, or maybe you have a more precise answer.

The number of issues for VTrace is not clear now, but maybe hundreds.

Yeah, and maybe some of them were indeed correct. So the question is, what happens when there is a false positive? What happens when you cannot even find an issue with VTrace? That is the core challenge of these diagnostic systems.

If you have some high-level thoughts, it would be great if you could share them, or we can take it offline. There's another question: do you have any cases where the automatic diagnosis failed and the experts had to make a second pass? If yes, what types of issues were these?

Yes. As we indicated in the video, there are some unknown causes. VTrace can tell that packet loss happens, but it does not know why.

These losses go unexplained because the problems are not covered by the deployed deep probes. So, to solve this, we should keep maintaining and updating the repository of deep probes.

Okay, thank you. We already have another question, which asks something different. In the past, we already had a lot of tools that can catch all sorts of misbehavior on the physical network. Why do you have to focus on the virtual network?

What are the challenges and opportunities compared with the traditional tools for the physical network?

Okay, thanks for the question. When packet loss happens in the overlay network, tools that focus only on the physical network cannot figure out the root cause of the packet loss, like why the packet is dropped here. So the main focus of VTrace is to figure out the root cause. One challenge in the virtual network is that the topology of the overlay network is dynamic, because virtual networks are created and destroyed frequently. With such a dynamic topology, it is hard to determine the path of our target packets. Another challenge in the overlay network is that the root causes of packet loss are diverse. For example, both tenants and cloud service providers can cause packet loss, because both of them can configure the vSwitches and the vRouters. So it is very challenging to automatically diagnose the root cause of the packet loss.

In terms of the opportunity, in my understanding, we have used the DSCP field to mark the packets that we have to check, and this can also be used in the physical network. So we think VTrace can be extended if we integrate physical tracing tools through the packet marking.
