logging in or signing up ancs haoyu Mudki Download Post to : URL : Related Presentations : Share Add to Flag Embed Email Send to Blogs and Networks Add to Channel Uploaded from authorPOINTLite Insert YouTube videos in PowerPont slides with aS Desktop Copy embed code: (To copy code, click on the text box) Embed: URL: Thumbnail: WordPress Embed Customize Embed The presentation is successfully added In Your Favorites. Views: 85 Category: Entertainment License: All Rights Reserved Like it (1) Dislike it (0) Added: January 03, 2008 This Presentation is Public Favorites: 0 Presentation Description No description available. Comments Posting comment... Premium member Presentation Transcript Packet Classification Using Coarse-Grained Tuple Spaces: Haoyu Song, Jon Turner and Sarang Dharmapurikar www.arl.wustl.edu Packet Classification Using Coarse-Grained Tuple SpacesOverview: Overview Two-dimensional packet classification problem in list of 2d filters, find first match for given address pair (1011,0111): [<101*,10*>, <10*,011*>, <1*,01*>] Limitations of current solutions fast algorithmic methods require excessive space (≥50x) TCAM has high cost per bit, significant power usage Combining cross-product and tuple-space search hybrid strategy with range of time-space tradeoff options Improving 1d lookups combining tree bitmap and Bloom filters Possible extensions Cross-Product Method: Cross-Product Method Procedure do 1d lookup on all fields combine results into lookup key in cross-product table direct lookup table or hash table Fast, but space grows as nk for n filters, k fields 10100, 01110 S0D1 2D Tuple Space Search: 2D Tuple Space Search Group by prefix length hash table per group up to 33 x 33= 1,089 groups in practice 30-100 occupied tuples Rectangle search markers to guide search at most 33 probes, often less hard to update Pruned tuple space search 1d search on src/dest fields find prefix lengths that match src/dest fields of packet search intersecting tuples if ≤k matching prefixes, at most k2 probesCoarse-Grained Tuple Space: Coarse-Grained Tuple Space Select coarse-grained partition of tuple space Build cross-product table per sub-space Search procedure 1d lookups for LPM probe each subspace terminate early if possible Pruning identify candidate sub-spaces during 1d lookup probe selected sub-spaces Space/time tradeoff Performance of Basic Algorithm: Performance of Basic Algorithm Equal size divisions of 2d tuple space Ratio of cross-products to filter set size 2x2 partition brings space usage to 2x minimum maximum of four probes required compared to 30-90 for simple tuple space search Pruning of limited use for filter sets of size <104 Performance of Best Configurations: Performance of Best ConfigurationsAlternate Partitioning Approaches: Alternate Partitioning Approaches Arbitrary sub-spaces are possible potential for fewer regions with good space efficiency Preliminary results mixed may be useful for smaller filter sets More evaluation needed Note: filters of form <prefix,*> and <*,prefix> stored in 1d data structuresFast 1d Lookups: Fast 1d Lookups Multibit trie Co-located children Bitmaps for prefix nodes subtree presence 4 bit stride implies 8 memory accesses Expand prefixes to “standard” lengths Off-chip hash table per length On-chip Bloom filters to avoid unproductive probes Large space requirements for good worst-case performance Tree Bitmap Hashing + Bloom FiltersFast and Compact 1d Lookups: Fast and Compact 1d Lookups Insert tree bitmap subtree roots into off-chip hash tables and on-chip Bloom filters Lookup prefix of subtree roots in Bloom filters if match on length k and all shorter lengths, probe off-chip table for length k Reduction in on-chip memory for Bloom filters shape-shifting trie yields further space reduction1d Lookup Performance: 1d Lookup Performance 200K IPv4 prefixes 5 bit stride for tree bitmap 8 bit on-chip “root table” 4 Bloom filters 1 BF entry for every 2 prefixes 1 off-chip probes (4 incl. FP) 2 Bloom filters 1 BF entry for every 6 prefixes 2 off-chip probes (4 incl. FP)Practical Configuration: Practical Configuration Configure 1d lookups for 1 off-chip probe each (excluding false positives) about 5 bits per prefix for Bloom filters with low FP rate Record <prefix,*> and <*,prefix> filters in 1d lookup data structures also proposed in recent paper by Kounavis, et. al. Divide remaining filters among four subspaces approximately 2 off-chip hash table entries per filter at most four probes With single QDR SRAM at 200 MHz, 32 bit word size can do 200 million probes per second about 33 million packets/second 40 byte packets at 10 Gb/sPossible Extensions: Possible Extensions More extensive evaluation scaling to larger filter sets – 100-200K filters integrated evaluation of 1d and 2d lookups systematic evaluation of alternate partitioning strategies Alternate representations of filter sub-spaces any filter set data structure is candidate using decision trees, can skip 1d lookups Generalization to more dimensions handling fields with ranges (for port numbers) coarse-grained grouping of tuple-spaces defined on “nesting level” can we beat TCAM? You do not have the permission to view this presentation. In order to view it, please contact the author of the presentation.
ancs haoyu Mudki Download Post to : URL : Related Presentations : Share Add to Flag Embed Email Send to Blogs and Networks Add to Channel Uploaded from authorPOINTLite Insert YouTube videos in PowerPont slides with aS Desktop Copy embed code: (To copy code, click on the text box) Embed: URL: Thumbnail: WordPress Embed Customize Embed The presentation is successfully added In Your Favorites. Views: 85 Category: Entertainment License: All Rights Reserved Like it (1) Dislike it (0) Added: January 03, 2008 This Presentation is Public Favorites: 0 Presentation Description No description available. Comments Posting comment... Premium member Presentation Transcript Packet Classification Using Coarse-Grained Tuple Spaces: Haoyu Song, Jon Turner and Sarang Dharmapurikar www.arl.wustl.edu Packet Classification Using Coarse-Grained Tuple SpacesOverview: Overview Two-dimensional packet classification problem in list of 2d filters, find first match for given address pair (1011,0111): [<101*,10*>, <10*,011*>, <1*,01*>] Limitations of current solutions fast algorithmic methods require excessive space (≥50x) TCAM has high cost per bit, significant power usage Combining cross-product and tuple-space search hybrid strategy with range of time-space tradeoff options Improving 1d lookups combining tree bitmap and Bloom filters Possible extensions Cross-Product Method: Cross-Product Method Procedure do 1d lookup on all fields combine results into lookup key in cross-product table direct lookup table or hash table Fast, but space grows as nk for n filters, k fields 10100, 01110 S0D1 2D Tuple Space Search: 2D Tuple Space Search Group by prefix length hash table per group up to 33 x 33= 1,089 groups in practice 30-100 occupied tuples Rectangle search markers to guide search at most 33 probes, often less hard to update Pruned tuple space search 1d search on src/dest fields find prefix lengths that match src/dest fields of packet search intersecting tuples if ≤k matching prefixes, at most k2 probesCoarse-Grained Tuple Space: Coarse-Grained Tuple Space Select coarse-grained partition of tuple space Build cross-product table per sub-space Search procedure 1d lookups for LPM probe each subspace terminate early if possible Pruning identify candidate sub-spaces during 1d lookup probe selected sub-spaces Space/time tradeoff Performance of Basic Algorithm: Performance of Basic Algorithm Equal size divisions of 2d tuple space Ratio of cross-products to filter set size 2x2 partition brings space usage to 2x minimum maximum of four probes required compared to 30-90 for simple tuple space search Pruning of limited use for filter sets of size <104 Performance of Best Configurations: Performance of Best ConfigurationsAlternate Partitioning Approaches: Alternate Partitioning Approaches Arbitrary sub-spaces are possible potential for fewer regions with good space efficiency Preliminary results mixed may be useful for smaller filter sets More evaluation needed Note: filters of form <prefix,*> and <*,prefix> stored in 1d data structuresFast 1d Lookups: Fast 1d Lookups Multibit trie Co-located children Bitmaps for prefix nodes subtree presence 4 bit stride implies 8 memory accesses Expand prefixes to “standard” lengths Off-chip hash table per length On-chip Bloom filters to avoid unproductive probes Large space requirements for good worst-case performance Tree Bitmap Hashing + Bloom FiltersFast and Compact 1d Lookups: Fast and Compact 1d Lookups Insert tree bitmap subtree roots into off-chip hash tables and on-chip Bloom filters Lookup prefix of subtree roots in Bloom filters if match on length k and all shorter lengths, probe off-chip table for length k Reduction in on-chip memory for Bloom filters shape-shifting trie yields further space reduction1d Lookup Performance: 1d Lookup Performance 200K IPv4 prefixes 5 bit stride for tree bitmap 8 bit on-chip “root table” 4 Bloom filters 1 BF entry for every 2 prefixes 1 off-chip probes (4 incl. FP) 2 Bloom filters 1 BF entry for every 6 prefixes 2 off-chip probes (4 incl. FP)Practical Configuration: Practical Configuration Configure 1d lookups for 1 off-chip probe each (excluding false positives) about 5 bits per prefix for Bloom filters with low FP rate Record <prefix,*> and <*,prefix> filters in 1d lookup data structures also proposed in recent paper by Kounavis, et. al. Divide remaining filters among four subspaces approximately 2 off-chip hash table entries per filter at most four probes With single QDR SRAM at 200 MHz, 32 bit word size can do 200 million probes per second about 33 million packets/second 40 byte packets at 10 Gb/sPossible Extensions: Possible Extensions More extensive evaluation scaling to larger filter sets – 100-200K filters integrated evaluation of 1d and 2d lookups systematic evaluation of alternate partitioning strategies Alternate representations of filter sub-spaces any filter set data structure is candidate using decision trees, can skip 1d lookups Generalization to more dimensions handling fields with ranges (for port numbers) coarse-grained grouping of tuple-spaces defined on “nesting level” can we beat TCAM?