logging in or signing up planet 03 Aric85 Download Post to : URL : Related Presentations : Share Add to Flag Embed Email Send to Blogs and Networks Add to Channel Uploaded from authorPOINT Insert YouTube videos in PowerPont slides with aS Desktop Copy embed code: (To copy code, click on the text box) Embed: URL: Thumbnail: WordPress Embed Customize Embed The presentation is successfully added In Your Favorites. Views: 149 Category: News & Reports.. License: All Rights Reserved Like it (0) Dislike it (0) Added: June 20, 2007 This Presentation is Public Favorites: 0 Presentation Description No description available. Comments Posting comment... Premium member Presentation Transcript Exercising DHash with Distributed Backup: Exercising DHash with Distributed Backup Robert Morris Frans Kaashoek, David Karger, Russ Cox, Frank Dabek, Emil Sit, Josh Cates, Jacob Strauss, James Robertson http://pdos.lcs.mit.edu/chord MIT Laboratory for Computer Science Backup Characteristics: Backup Characteristics Full dump only once ever Then just incremental snapshots Dump reads a lot, to verify old blocks Data caching isn’t very useful Mostly bulk data movement Archived snapshots are on-line So interactive read also matters What do we deserve?: What do we deserve? Throughput: Same as one TCP? N TCPs? Use window of RPCs to cover latency Lookup latency: very cheap w/ proximity Read latency (w/ k copies): max / 2k Write latency: max Initial Performance: Initial Performance Bulk read: 50 kbytes/second Bulk write: 10 kbytes/second 90 nodes, Planetlab + RON 5 replicas of each block What went wrong?: What went wrong? Forced to talk to the predecessor Fetch from replica closest to predecessor Replicas are large, so puts were slow Unified congestion window Mimiced single TCP increase/decrease Worst-case timeouts, no fast retransmit Old Chord/DHash: Old Chord/DHash (key) predecessor, returns successor list replicas on successors 1. 4. 3. 2. originator Solutions: Solutions 7-of-14 coding w/ IDA, not replication Maintain proximity to originator Avoid predecessor, or any particular node Fetch data from nodes close to originator Predict better RPC timeouts Key tool: synthetic coordinates Vivaldi Synthetic Coordinates: Vivaldi Synthetic Coordinates Each node estimates its own position Position = (x,y): 'synthetic coordinates' x and y units are time (milliseconds) Distance predicts network latency Key point: predict w/o pinging first Like GNP, but no landmarks Vivaldi synthetic coordinates: Vivaldi synthetic coordinates Each node starts with a random incorrect position 0,1 1,2 2,3 3,0 Vivaldi synthetic coordinates: Vivaldi synthetic coordinates Each node starts with a random incorrect position 2 ms 1 ms 2 ms 1 ms Each node 'pings' a few other nodes to measure network latency (distance) A B Vivaldi synthetic coordinates: Vivaldi synthetic coordinates Each node starts with a random incorrect position Each node 'pings' a few other nodes to measure network latency (distance) 2 1 2 1 Each nodes 'moves' to cause measured distances to match coordinates Vivaldi synthetic coordinates: Vivaldi synthetic coordinates Each node starts with a random incorrect position Each node 'pings' a few other nodes to measure network latency (distance) 2 1 2 1 Each nodes 'moves' to cause measured distance to match coordinates Vivaldi in action: Vivaldi in action Execution on 86 PlanetLab Internet hosts Each host only pings a few other random hosts Most hosts find useful coordinates after a few dozen pings Vivaldi predicts latency well: Vivaldi predicts latency well New lookup+fetch: New lookup+fetch 1. 3. 2. API: lookup(k, m) returns first m successors of k Stops when successor list overlaps m C’s successor list A G F E D C B Vivaldi helps decrease DHash fetch times: Vivaldi helps decrease DHash fetch times 77 PlanetLab + 18 RON nodes Fundamental limit about 100 ms? DHash throughput: DHash throughput Berkeley Throughput (kbytes/second) Mazu NYU Australia Conclusions: Conclusions Unclear what 'fair' throughput is vs TCP Main latency costs: Data is not likely to be close Last steps of lookup are expensive Need latency predictions to as-yet unknown nodes Synthetic coordinates are useful in many ways http://pdos.lcs.mit.edu/chord Vivaldi vs. network coordinates: Vivaldi vs. network coordinates Vivaldi’s coordinates match geography well over-sea distances shrink (faster than over-land) orientation of Australia and Europe is 'wrong' You do not have the permission to view this presentation. In order to view it, please contact the author of the presentation.
planet 03 Aric85 Download Post to : URL : Related Presentations : Share Add to Flag Embed Email Send to Blogs and Networks Add to Channel Uploaded from authorPOINT Insert YouTube videos in PowerPont slides with aS Desktop Copy embed code: (To copy code, click on the text box) Embed: URL: Thumbnail: WordPress Embed Customize Embed The presentation is successfully added In Your Favorites. Views: 149 Category: News & Reports.. License: All Rights Reserved Like it (0) Dislike it (0) Added: June 20, 2007 This Presentation is Public Favorites: 0 Presentation Description No description available. Comments Posting comment... Premium member Presentation Transcript Exercising DHash with Distributed Backup: Exercising DHash with Distributed Backup Robert Morris Frans Kaashoek, David Karger, Russ Cox, Frank Dabek, Emil Sit, Josh Cates, Jacob Strauss, James Robertson http://pdos.lcs.mit.edu/chord MIT Laboratory for Computer Science Backup Characteristics: Backup Characteristics Full dump only once ever Then just incremental snapshots Dump reads a lot, to verify old blocks Data caching isn’t very useful Mostly bulk data movement Archived snapshots are on-line So interactive read also matters What do we deserve?: What do we deserve? Throughput: Same as one TCP? N TCPs? Use window of RPCs to cover latency Lookup latency: very cheap w/ proximity Read latency (w/ k copies): max / 2k Write latency: max Initial Performance: Initial Performance Bulk read: 50 kbytes/second Bulk write: 10 kbytes/second 90 nodes, Planetlab + RON 5 replicas of each block What went wrong?: What went wrong? Forced to talk to the predecessor Fetch from replica closest to predecessor Replicas are large, so puts were slow Unified congestion window Mimiced single TCP increase/decrease Worst-case timeouts, no fast retransmit Old Chord/DHash: Old Chord/DHash (key) predecessor, returns successor list replicas on successors 1. 4. 3. 2. originator Solutions: Solutions 7-of-14 coding w/ IDA, not replication Maintain proximity to originator Avoid predecessor, or any particular node Fetch data from nodes close to originator Predict better RPC timeouts Key tool: synthetic coordinates Vivaldi Synthetic Coordinates: Vivaldi Synthetic Coordinates Each node estimates its own position Position = (x,y): 'synthetic coordinates' x and y units are time (milliseconds) Distance predicts network latency Key point: predict w/o pinging first Like GNP, but no landmarks Vivaldi synthetic coordinates: Vivaldi synthetic coordinates Each node starts with a random incorrect position 0,1 1,2 2,3 3,0 Vivaldi synthetic coordinates: Vivaldi synthetic coordinates Each node starts with a random incorrect position 2 ms 1 ms 2 ms 1 ms Each node 'pings' a few other nodes to measure network latency (distance) A B Vivaldi synthetic coordinates: Vivaldi synthetic coordinates Each node starts with a random incorrect position Each node 'pings' a few other nodes to measure network latency (distance) 2 1 2 1 Each nodes 'moves' to cause measured distances to match coordinates Vivaldi synthetic coordinates: Vivaldi synthetic coordinates Each node starts with a random incorrect position Each node 'pings' a few other nodes to measure network latency (distance) 2 1 2 1 Each nodes 'moves' to cause measured distance to match coordinates Vivaldi in action: Vivaldi in action Execution on 86 PlanetLab Internet hosts Each host only pings a few other random hosts Most hosts find useful coordinates after a few dozen pings Vivaldi predicts latency well: Vivaldi predicts latency well New lookup+fetch: New lookup+fetch 1. 3. 2. API: lookup(k, m) returns first m successors of k Stops when successor list overlaps m C’s successor list A G F E D C B Vivaldi helps decrease DHash fetch times: Vivaldi helps decrease DHash fetch times 77 PlanetLab + 18 RON nodes Fundamental limit about 100 ms? DHash throughput: DHash throughput Berkeley Throughput (kbytes/second) Mazu NYU Australia Conclusions: Conclusions Unclear what 'fair' throughput is vs TCP Main latency costs: Data is not likely to be close Last steps of lookup are expensive Need latency predictions to as-yet unknown nodes Synthetic coordinates are useful in many ways http://pdos.lcs.mit.edu/chord Vivaldi vs. network coordinates: Vivaldi vs. network coordinates Vivaldi’s coordinates match geography well over-sea distances shrink (faster than over-land) orientation of Australia and Europe is 'wrong'