Resource Consumption in Network Traffic
Added: June 15, 2007

Presentation Transcript

Automatically Inferring Patterns of Resource Consumption in Network Traffic:
  Cristian Estan, Stefan Savage, George Varghese
  University of California, San Diego

Who is using my link?

Looking at the traffic:
  Too much data for a human
  Do something smarter!

Looking at traffic aggregates:
  Aggregating on individual packet header fields gives useful results, but:
    Traffic reports are not always at the right granularity (e.g. individual IP address, subnet, etc.)
    Cannot show aggregates defined over multiple fields (e.g. which network uses which application)
  The traffic analysis tool should automatically find aggregates over the right fields at the right granularity
  Figure callouts: "Most traffic goes to the dorms …", "What apps are used?", "Where does the traffic come from?", "Which network uses web and which one kazaa?"

Ideal traffic report:
  Web is the dominant application
  The library is a heavy user of web
  That's a big flash crowd!
  This is a Denial of Service attack!!
  This paper is about giving the network administrator insightful traffic reports

Contributions of this paper:
  Approach
  Definitions
  Algorithms
  System
  Experience

Approach:
  Characterize the traffic mix by describing all important traffic aggregates
  Multidimensional aggregates (e.g. a flash crowd described by protocol, port number and IP address)
  Aggregates at the right level of granularity (e.g. computer, subnet, ISP)
  Traffic analysis is automated: it finds insightful data without human guidance

Definition: traffic clusters:
  Traffic clusters are the multidimensional traffic aggregates identified by our reports
  A cluster is defined by a range for each field
  The ranges are taken from natural hierarchies (e.g. the IP prefix hierarchy), so they are meaningful aggregates
  Example:
    Traffic aggregate: incoming web traffic for CS Dept.
    Traffic cluster: ( SrcIP=*, DestIP in 132.239.64.0/21, Proto=TCP, SrcPort=80, DestPort in [1024,65535] )

Definition: traffic report:
  Traffic reports give the volume of chosen traffic clusters
  To keep the report size manageable, describe only clusters above a threshold (e.g. H = total traffic/20)
  To avoid redundant data, compress by omitting clusters whose traffic can be inferred (up to error H) from non-overlapping, more specific clusters in the report
  To highlight non-obvious aggregates, prioritize using an unexpectedness label
  Example:
    50% of all traffic is web
    Prefix B receives 20% of all traffic
    The web traffic received by prefix B is 15% instead of 50% * 20% = 10%, so the unexpectedness label is 15%/10% = 150%

Algorithms and theory:
  Algorithms and theoretical bounds are in the paper
  Unidimensional reports are easy to compute
  Multidimensional reports are exponentially harder as we add more fields
  Next few slides:
    Example of unidimensional compression
    Example of the structure of the multidimensional cluster space

Unidimensional report example (hierarchy, Threshold=100):
  Address      Traffic
  10.0.0.2     15
  10.0.0.3     35
  10.0.0.4     30
  10.0.0.5     40
  10.0.0.8     160
  10.0.0.9     110
  10.0.0.10    35
  10.0.0.14    75
  Hierarchy labels shown in the figure: 10.0.0.14/31, 10.0.0.12/30

Unidimensional report example (compression):
  Cluster        Traffic
  10.0.0.8       160
  10.0.0.9       110
  10.0.0.0/29    120
  10.0.0.8/29    380
  Compression: 305 - 270 < 100, 380 - 270 ≥ 100

Multidimensional structure ex.:
  Nodes (clusters) have multiple parents
  Nodes (clusters) overlap
  Figure labels: US, CA, Web

System: AutoFocus:
  Packet header trace
  Traffic parser
  Cluster miner
  Grapher
  Web based GUI

Structure of regular traffic mix:
  Backups from CAIDA to tape server
  Semi-regular time pattern
  FTP from SLAC Stanford
  Scripps web traffic
  Web & Squid servers
  Large ssh traffic
  Steady ICMP probing from CAIDA
  Figure label: SD-NAP

Analysis of unusual events:
  UCSD to UCLA route change
  Sapphire/SQL Slammer worm
  Figure label: Site 2

Conclusions:
  Multidimensional traffic clusters using natural hierarchies describe traffic aggregates
  Traffic reports using thresholding automatically identify conspicuous resource consumption at the right granularity
  Compression produces compact traffic reports, and unexpectedness labels highlight non-obvious aggregates
  Our prototype system, AutoFocus, provides insights into the structure of regular traffic and unexpected events

Thank you!:
  Alpha version of AutoFocus downloadable from http://ial.ucsd.edu/AutoFocus/
  Any questions?
  Acknowledgements: NIST, NSF, Vern Paxson, David Moore, Liliana Estan, Jennifer Rexford, Alex Snoeren, Geoff Voelker

Bounds and running times

Open questions:
  Are there tighter bounds for the size of the reports?
  Are there algorithms that produce smaller results?
  Are there algorithms that compute traffic reports more efficiently? In streaming fashion?

Delta reports:
  Why repeat the same traffic report if the traffic doesn't change from one day to the other?
  Delta reports describe the clusters that increased or decreased by more than the threshold from one interval to the other
  On related traffic mixes, delta reports are much smaller than traffic reports
  Multidimensional compression is very hard for delta reports
  We have only an exponential algorithm for the cluster delta

Greedy compression algorithm

Multidimensional report example:
  Thresholding
  Compression

System details
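
The unidimensional report example in the transcript (per-address traffic, Threshold=100, then compression) can be reproduced with a short sketch. This is not the AutoFocus code: the function name `unidimensional_report`, the `root_prefixlen` parameter, and the exact compression rule (report a prefix only if its traffic exceeds the traffic already covered by more specific reported prefixes by at least the threshold) are assumptions read off the slide's arithmetic, where 305 - 270 < 100 drops a prefix and 380 - 270 ≥ 100 keeps one. It handles IPv4 addresses only.

```python
import ipaddress
from collections import defaultdict


def unidimensional_report(host_traffic, threshold, root_prefixlen=0):
    """Return {prefix: traffic} for a compressed single-field report.

    Assumed rule (inferred from the slide, not taken from the paper):
    a prefix is reported when its traffic minus the traffic already
    covered by more specific reported prefixes is >= threshold.
    """
    # Per-prefix state at the current level:
    # [traffic, traffic covered by reported, more specific prefixes].
    level = {ipaddress.ip_network(addr): [volume, 0]
             for addr, volume in host_traffic.items()}
    report = {}
    # Walk the prefix hierarchy bottom-up, from /32 toward the root.
    for prefixlen in range(32, root_prefixlen - 1, -1):
        parents = defaultdict(lambda: [0, 0])
        for net, (traffic, covered_below) in level.items():
            reported = traffic - covered_below >= threshold
            if reported:
                report[str(net)] = traffic
            # A reported prefix accounts for all of its own traffic.
            covered = traffic if reported else covered_below
            if prefixlen > root_prefixlen:
                parent = parents[net.supernet()]
                parent[0] += traffic
                parent[1] += covered
        level = parents
    return report


if __name__ == "__main__":
    hosts = {
        "10.0.0.2": 15, "10.0.0.3": 35, "10.0.0.4": 30, "10.0.0.5": 40,
        "10.0.0.8": 160, "10.0.0.9": 110, "10.0.0.10": 35, "10.0.0.14": 75,
    }
    for prefix, traffic in unidimensional_report(hosts, 100).items():
        print(prefix, traffic)
```

On the slide's data this keeps 10.0.0.8/32 (160), 10.0.0.9/32 (110), 10.0.0.0/29 (120) and 10.0.0.8/29 (380). Prefixes such as 10.0.0.8/30 (305) are dropped because the two reported hosts already explain 270 of their traffic, which matches the four clusters shown on the compression slide.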
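
The unexpectedness label from the "Definition: traffic report" slide is a ratio of the observed share of a cluster to the share one would expect if its fields were independent. The sketch below only reproduces that two-field arithmetic (15% observed against 50% * 20% = 10% expected); the function name `unexpectedness_label`, its parameters, and the use of integer volumes are illustrative assumptions, and the paper may define the label relative to other clusters in the report.

```python
from fractions import Fraction


def unexpectedness_label(cluster_volume, field_volumes, total_volume):
    """Observed share of a cluster divided by the product of the
    per-field shares (independence assumption, as on the slide).

    All volumes are integer counts (e.g. bytes) so the result is exact.
    """
    expected_share = Fraction(1)
    for volume in field_volumes:
        expected_share *= Fraction(volume, total_volume)
    actual_share = Fraction(cluster_volume, total_volume)
    return actual_share / expected_share


if __name__ == "__main__":
    # Slide example scaled to a total of 1000 units:
    # web = 500 (50%), prefix B = 200 (20%), web to prefix B = 150 (15%).
    label = unexpectedness_label(150, [500, 200], 1000)
    print(f"{float(label):.0%}")
```

A label above 100% means the cluster carries more traffic than its individual fields would suggest, and a label below 100% means it carries less.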