Presentation Transcript

Not All Microseconds are Equal: Fine-Grained Per-Flow Measurements with Reference Latency Interpolation :

Myungjin Lee†, Nick Duffield‡, Ramana Rao Kompella†
†Purdue University, ‡AT&T Labs–Research

Low-latency applications : 

Several new types of applications require extremely low end-to-end latency:
- Algorithmic trading applications in financial data center networks
- High-performance computing applications in data center networks
- Storage applications
Low-latency cut-through switches:
- Arista 7100 series
- Woven EFX 1000 series

Need for high-fidelity measurements : 

At every router, high-fidelity measurements are critical to localize root causes. Once the root cause is localized, operators can respond by rerouting traffic, upgrading links, or performing detailed diagnosis.

Measurement solutions today : 

- SNMP and NetFlow: no latency measurements
- Active probes: typically end-to-end, do not localize the root cause
- Expensive high-fidelity measurement boxes: Corvil boxes (£90,000), used by the London Stock Exchange; cannot be placed ubiquitously
- Lossy Difference Aggregator (LDA) [Kompella, SIGCOMM'09]: provides average latency and variance at high fidelity within a switch; a good start, but may not be sufficient to diagnose flow-specific problems

Motivation for per-flow measurements : 

Key observation: there is a significant difference in average latencies across flows at a router.

[Figure: per-packet delay over time through a switch queue during one measurement period]

Outline of the rest of talk : 

- Measurement model
- Alternative approaches
- Intuition behind our approach: delay locality
- Our architecture: Reference Latency Interpolation (RLI)
- Evaluation

Measurement model : 

Assumption: time synchronization between router interfaces.
Constraint: cannot modify regular packets to carry timestamps, because that would require:
- Intrusive changes to the router forwarding path
- Extra bandwidth consumption, up to 10% of capacity

[Figure: packets traversing from ingress interface I to egress interface E]

Naïve approach : 

For each flow key:
- Store timestamps for each packet at I and E
- After a flow stops sending, I sends its packet timestamps to E
- E computes individual packet delays
- E aggregates average latency, variance, etc. for each flow

Problem: high communication costs
- At 10 Gbps, a few million packets per second
- Sampling reduces communication, but also reduces accuracy

[Figure: worked example of computing per-flow average delays from ingress/egress timestamps]
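The bookkeeping the naïve scheme implies can be sketched as follows; this is a minimal illustration, and the function name and example numbers are mine, not from the talk.

```python
# Hedged sketch of the naive per-flow scheme: both interfaces record one
# timestamp per packet; once the ingress ships its timestamps to the
# egress, the egress subtracts pairwise and averages per flow.

def naive_flow_stats(ingress_ts, egress_ts):
    """Average delay for one flow from paired ingress/egress timestamps."""
    delays = [e - i for i, e in zip(ingress_ts, egress_ts)]
    return sum(delays) / len(delays)

# Illustrative flow with two packets: delays 13 and 9, average 11.0
avg = naive_flow_stats([10, 20], [23, 29])
```

Note that the per-flow timestamp lists are exactly the communication cost the slide complains about: they grow with the packet count, not the flow count.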

A (naïve) extension of LDA : 

Maintain LDAs with many counters for the flows of interest.
Problem: (potentially) high communication costs, proportional to the number of flows.

[Figure: one LDA per flow between ingress I and egress E, yielding per-flow latency]

Key observation: Delay locality : 

True mean delay = (D1 + D2 + D3) / 3
Localized mean delay = (WD1 + WD2 + WD3) / 3, where WDi is the mean delay over a window around packet i

How close is the localized mean delay to the true mean delay as the window size varies?
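A minimal sketch of the two quantities being compared, under the assumption that WDi denotes the mean delay over a window around packet i's arrival; all names and the window definition here are illustrative.

```python
# Hedged sketch: compare a flow's true mean delay with its "localized"
# mean delay, where each packet's delay is replaced by the mean delay of
# a window (2*w + 1 delay samples) centered on its arrival index.

def true_mean_delay(delays, pkt_idx):
    """Mean of the flow's own packet delays."""
    return sum(delays[i] for i in pkt_idx) / len(pkt_idx)

def localized_mean_delay(delays, pkt_idx, w):
    """Mean of the windowed delays WD_i around each packet arrival."""
    def window_mean(i):
        lo, hi = max(0, i - w), min(len(delays), i + w + 1)
        chunk = delays[lo:hi]
        return sum(chunk) / len(chunk)
    return sum(window_mean(i) for i in pkt_idx) / len(pkt_idx)
```

Shrinking the window drives the localized estimate toward the true mean (with w = 0 they coincide), which is the intuition behind the RMSRE trend across window sizes.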

Key observation: Delay locality : 

[Figure: local mean delay per key vs. true mean delay per key (ms), for windows of 0.1 ms (RMSRE = 0.054), 10 ms (RMSRE = 0.16), and 1 s (RMSRE = 1.72)]

Data sets come from a real router and a synthetic queueing model.

Exploiting delay locality : 

- Reference packets are injected regularly at the ingress I
- They are special packets carrying an ingress timestamp
- They provide reference delay samples
- These samples are used to approximate the latencies of regular packets

RLI architecture : 

Component 1: Reference packet generator
- Injects reference packets regularly

Component 2: Latency estimator
- Estimates packet latencies and updates per-flow statistics
- Estimates directly at the egress, with no extra state maintained at the ingress side (reduces storage and communication overheads)

[Figure: reference packet R injected among regular packets 1, 2, 3 between ingress I and egress E]

Component 1: Reference packet generator : 

Question: when to inject a reference packet?
- Idea 1 (1-in-n): inject one reference packet every n packets. Problem: low accuracy under low utilization.
- Idea 2 (1-in-τ): inject one reference packet every τ seconds. Problem: bad when short-term delay variance is high.
- Our approach: dynamic injection based on utilization
  - High utilization → low injection rate
  - Low utilization → high injection rate
- The adaptive scheme works better than fixed-rate schemes.
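One way to picture the adaptive policy is as a gap between reference packets that grows with utilization; the linear mapping and parameter values below are my own assumptions for illustration, not the talk's exact rule.

```python
# Illustrative sketch of utilization-adaptive injection: inject often
# when the link is idle (small gap) and rarely when it is busy (large
# gap). The mapping and the min/max gap values are assumed, not RLI's.

def injection_gap(utilization, min_gap=10, max_gap=1000):
    """Regular packets to wait before the next reference packet."""
    u = min(max(utilization, 0.0), 1.0)  # clamp utilization to [0, 1]
    return int(min_gap + u * (max_gap - min_gap))
```

The appeal of this shape is that extra probe traffic is spent exactly when it is cheap (idle link) and withheld when it would compete with regular packets.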

Component 2: Latency estimator : 

Question 1: how to estimate latencies using reference packets?
Solution: different estimators are possible:
- Use only the delay of the left reference packet (RLI-L)
- Use linear interpolation between the left and right reference packets (RLI)
- Other non-linear estimators are possible (e.g., shrinkage)

[Figure: a regular packet bracketed by left (L) and right (R) reference packets]
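The two simplest estimators above can be sketched as follows; function and parameter names are illustrative, but the interpolation itself is the standard linear form the slide names.

```python
# Hedged sketch of the RLI estimators: a regular packet arriving at time
# t is bracketed by a left reference packet (t_left, d_left) and a right
# one (t_right, d_right); RLI interpolates, RLI-L just reuses the left.

def rli_estimate(t, t_left, d_left, t_right, d_right):
    """Linearly interpolate the delay at arrival time t."""
    if t_right == t_left:
        return d_left  # degenerate bracket: fall back to the left sample
    frac = (t - t_left) / (t_right - t_left)
    return d_left + frac * (d_right - d_left)

def rli_l_estimate(d_left):
    """RLI-L: use only the left reference packet's delay."""
    return d_left
```

RLI-L needs no buffering (the left sample is already known when the packet arrives), while full RLI must wait for the right reference packet before it can finalize the estimate.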

Component 2: Latency estimator : 

Question 2: how to compute per-flow latency statistics?
Solution: maintain 3 counters per flow at the egress side, updated once the right reference packet has arrived:
- C1: number of packets
- C2: sum of packet delays
- C3: sum of squares of packet delays (for estimating variance)
Average latency = C2 / C1.
To minimize state, any flow selection strategy can be used to maintain counters for only a subset of flows.
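The three counters support constant-space mean and variance per flow; a minimal sketch (class and method names are mine) of how the egress might update them:

```python
# Hedged sketch of the per-flow counters C1, C2, C3 from the slide:
# each estimated packet delay updates three running sums, from which
# mean and (population) variance fall out in O(1) space per flow.

class FlowStats:
    def __init__(self):
        self.c1 = 0    # C1: number of packets
        self.c2 = 0.0  # C2: sum of packet delays
        self.c3 = 0.0  # C3: sum of squared packet delays

    def update(self, delay):
        self.c1 += 1
        self.c2 += delay
        self.c3 += delay * delay

    def mean(self):
        return self.c2 / self.c1

    def variance(self):
        m = self.mean()
        return self.c3 / self.c1 - m * m  # E[D^2] - (E[D])^2

# Illustrative flow with three estimated delays:
s = FlowStats()
for d in (10.0, 12.0, 14.0):
    s.update(d)
```

Because only three numbers are kept per flow, the state cost is independent of the flow's packet count, which is what makes restricting the counters to a sampled subset of flows the only scaling knob needed.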

Experimental environment : 

Data sets:
- No public data center traces with timestamps are available
- Real router traces with synthetic workloads: WISC
- Real backbone traces with synthetic queueing: CHIC and SANJ

Simulation tool: open-source NetFlow software (YAF)
- Supports the reference packet injection mechanism
- Simulates a queueing model with a RED active queue management policy
- Experiments with different link utilizations

Accuracy of RLI under high link utilization : 


Comparison with other solutions : 

Packet sampling rate = 0.1%

Overhead of RLI : 

- Bandwidth overhead is low: less than 0.2% of link capacity
- Impact on packet loss is small: the packet loss difference with and without RLI is at most 0.001% at around 80% utilization

Summary : 

- A scalable architecture to obtain high-fidelity per-flow latency measurements between router interfaces
- Achieves a median relative error of 10-12%
- Shows 1-2 orders of magnitude lower relative error compared to existing solutions
- Measurements are obtained directly at the egress side
- Future work: per-packet diagnosis

Thank you! Questions? : 


Backup : 


Comparison with other solutions : 


Bandwidth overhead : 


Interference with regular traffic : 


Impact on packet losses :