Uploaded by Natalia; added October 05, 2007.

Presentation Transcript

Tornado: Maximizing Locality and Concurrency in a SMMP OS

Contents
  Types of Locality
  Locality: A closer look
  Requirements for locality
  Design Basics of Tornado
  Test Results
  Conclusion

Types of Locality*
  Temporal locality: “The concept that a resource that is referenced at one point in time will be referenced again sometime in the near future.”
  Spatial locality: “The concept that the likelihood of referencing a resource is higher if a resource near it has been referenced.”
  Sequential locality: “The concept that memory is accessed sequentially.”
  *Source: Wikipedia

Locality: A closer look, Read only case
  Both processors run:

    bool x = true;
    while (x) {
        // Do some work, reading but not writing x…
    }

  [Diagram: Processor #1 and Processor #2, each with its own cache, above a shared memory. x starts in memory only; after the first reads, each cache holds its own copy of x.]
  Notes: There are no accesses on the bus, because every access is a read that is satisfied in the local cache and no invalidations are sent.

Locality: A closer look, Read/Write case
  Both processors run:

    bool x = true;
    while (x) {
        x = false;
        // Do other work…
    }

  [Diagram: Processor #1 and Processor #2, each caching x, above a shared memory. A write by one processor invalidates the block containing x in the other processor's cache.]
  Sequence shown on the slides:
    1. Cache miss
    2. Read request
    3. Data
    4. Write
    5. Invalidate block containing x
  Notes:
    x becomes a bottleneck; the valid copy keeps jumping from one cache to the other.
    Every write access causes an invalidation.
    Almost every read causes a read miss and a bus read.

Locality: A closer look, Effect of Cache Line Length
  Processor #1 runs:

    bool x = true;
    while (x) {
        x = false;
        // Do other work…
    }

  Processor #2 runs:

    bool y = true;
    while (y) {
        y = false;
        // Do other work…
    }

  [Diagram: x at address 0x0 and y at address 0x4 in memory; both fall into one block, so each cache holds the block "x,y".]
  Notes:
    x and y have different addresses but fall into the same cache line (block).
    Reads do not cause any problem.
    Remember: invalidations are per cache line (block), not per word. A write to either variable invalidates the block containing both x and y, so the behavior is much the same as in the read/write case on a single variable.

Requirements for Locality
  Spatial and temporal locality
  Minimize read/write and write sharing
  Minimize false sharing
  Minimize the distance between the accessing processor and the target memory module

Design Basics for Tornado
  Individual resources are individual objects
  Clustered objects
  Protected procedure calls (PPC)
  Semi-automatic garbage collection

Clustered Objects
  A clustered object appears as a single object from the outside but is internally split into reps.
  Each rep handles requests from one or more processors.
  Lots of advantages to this design

Clustered Objects (cont.)
  Per-processor translation tables
  Partitioned global translation table
  Default “miss” handlers

Protected Procedure Calls
  Microkernel: relies on servers to carry out part of the OS's job
  As many server threads as there are clients
  A request is handled on the same processor where it was issued
  *Image source: Wikipedia

Garbage Collection
  Semi-automatic
  Makes a distinction between temporary and persistent references to objects
  Eliminates the need for two locks to guarantee existence, and eliminates locking altogether for read-only data

Test Results: Effect of rep Count (1)
  [Figure not preserved in the transcript.]

Test Results: Effect of rep Count (2)
  [Figure not preserved in the transcript.]

Test Results: Effect of Cache Associativity
  [Figure not preserved in the transcript.]

Test Results: Tornado vs. Commercial OSes
  [Figure not preserved in the transcript.]

Conclusion
  Tornado performs much better than many commercial OSes.
  The concept of clustered objects gives it a lot of advantages:
    High locality of data
    Diminished need for locking
    Higher degree of sharing, concurrency and modularity
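The cache-line slides say that x and y interfere only because they share a block. A minimal C++ sketch of the usual remedy, giving each variable its own line-sized slot, might look like the following. The 64-byte line size and the names `PackedFlags`, `PaddedFlag`, `PaddedFlags` and `same_line` are assumptions made for illustration; none of them come from the presentation or from Tornado.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumption: 64-byte cache lines. This is common but not universal.
constexpr std::size_t kAssumedLineSize = 64;

// Two flags stored side by side, like x and y on the slide. The struct is
// aligned to a line start only so that this demo is deterministic: both
// flags are then guaranteed to sit in the same assumed cache line.
struct alignas(kAssumedLineSize) PackedFlags {
    std::atomic<bool> x{true};
    std::atomic<bool> y{true};
};

// One flag that occupies a whole line-sized, line-aligned slot.
struct alignas(kAssumedLineSize) PaddedFlag {
    std::atomic<bool> value{true};
};
static_assert(sizeof(PaddedFlag) == kAssumedLineSize,
              "each padded flag should fill exactly one assumed line");

// Two flags, each in its own slot, so a write to one cannot invalidate
// the line that holds the other.
struct PaddedFlags {
    PaddedFlag x;
    PaddedFlag y;
};

// Returns true if both addresses fall into the same assumed cache line.
inline bool same_line(const void* a, const void* b) {
    const auto ua = reinterpret_cast<std::uintptr_t>(a);
    const auto ub = reinterpret_cast<std::uintptr_t>(b);
    return ua / kAssumedLineSize == ub / kAssumedLineSize;
}
```

With this layout, `same_line(&packed.x, &packed.y)` holds for a `PackedFlags packed;` while `same_line(&padded.x, &padded.y)` does not for a `PaddedFlags padded;`. The cost is memory: each padded flag spends a full line on one byte of data, which is why padding is normally reserved for data that different processors write frequently.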
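The clustered-object slides describe one object that is split internally into per-processor reps. As a rough illustration of that idea only, here is a hypothetical counter in C++ with one rep per processor. `ClusteredCounter`, `CounterRep`, the explicit `cpu` argument and the 64-byte spacing are all assumptions for this sketch; Tornado itself locates reps through its per-processor translation tables, which are not modelled here.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <vector>

// Assumption: 64-byte cache lines, so reps are spaced 64 bytes apart.
constexpr std::size_t kRepSpacing = 64;

// One rep: the private state used by a single processor. The padding keeps
// the counters of neighbouring reps at least one assumed line apart.
struct CounterRep {
    std::atomic<long> count{0};
    char pad[kRepSpacing - sizeof(std::atomic<long>)];
};
static_assert(sizeof(CounterRep) == kRepSpacing,
              "each rep should occupy exactly one assumed line");

// Hypothetical clustered counter: a single object from the outside,
// one rep per processor on the inside.
class ClusteredCounter {
public:
    // At least one rep is always created so indexing stays valid.
    explicit ClusteredCounter(std::size_t processors)
        : reps_(processors == 0 ? 1 : processors) {}

    // Common path: touch only the rep that belongs to the calling processor.
    void increment(std::size_t cpu) {
        reps_[cpu % reps_.size()].count.fetch_add(1, std::memory_order_relaxed);
    }

    // Rare path: visit every rep to assemble the global value.
    long total() const {
        long sum = 0;
        for (const CounterRep& rep : reps_) {
            sum += rep.count.load(std::memory_order_relaxed);
        }
        return sum;
    }

    std::size_t rep_count() const { return reps_.size(); }

private:
    std::vector<CounterRep> reps_;
};
```

A caller would create `ClusteredCounter c(4);` and call `c.increment(cpu)` with the index of the processor it is running on. Increments then stay in that processor's rep, and only `total()` reads across reps. This trades a slower global read for updates that avoid write sharing, in line with the locality requirements listed on the slides.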