HPCS2010 MulticoreSlides 198

Category: Entertainment

Presentation Description

No description available.


Presentation Transcript

Slide 1: 

1 Johannes Schneider Transactional Memory: How to Perform Load Adaption in a Simple And Distributed Manner Johannes Schneider David Hasenfratz Roger Wattenhofer

Without easy and efficient parallel programming methods… : 

2 Johannes Schneider “computer science will become washing machine science.“ Without easy and efficient parallel programming methods…

How to handle access to shared data? : 

How to handle access to shared data? Locks, Monitors… Coarse grained vs. fine grained locking easy but slow program demanding, time consuming but fast programs Problems difficult error prone Composability … Johannes Schneider lock all data modify/use data unlock all data lock A lock B modify/use A,B lock C modify/use A,B,C unlock A modify/use B,C unlock B,C lock B lock A modify/use A,B unlock A,B Deadlock! Only 1 thread can execute 3 Thread 1 Thread 2

Transactional memory(TM) - a possible solution : 

Transactional memory(TM) - a possible solution Simple for the programmer Composable Idea from database community Many TM systems (internally) still use locks But the TM system (not the programmer) takes care of Performance Correctness (no deadlocks...) Johannes Schneider Begin transaction modify/use data End transaction Method A.x() Begin Transaction B.y() … End Transaction Method B.y() Begin transaction … End transaction 4

Transactional memory systems : 

Transactional memory systems If transactions modify different data, everything is ok the same data, conflicts arise that must be resolved Transactions might get delayed or aborted Job of a contention manager A transaction keeps track of all modified values It restores all values, if it is aborted A transaction successfully finishes with a commit Johannes Schneider 5

Conflicts – A contention manager decides : 

Abort or delay a transaction, i.e. adapt load Distributed Each thread has its own manager Example Initially: A=1, B=1 Manager 1 Manager 2 T1 Trans. 1 T1 Trans. 2 B:=2 … A:=3 … conflict … A:=2 … ‏ Abort (undo all changes, i.e. set A:=1)‏ and restart (after a while) T1 Trans.1 … A:=2 … Trans. 2 B:=2 … A:=3 … conflict Abort (set B:=1) and restart OR wait and retry Conflicts – A contention manager decides Johannes Schneider 6 Manager 1 Manager 2 Delay to adapt load!

Prior work : 

Prior work Contention Managers [PODC03,PODC05,ISAAC09…] System load was not (explicitly) considered Load adaption Estimate contention intensity: CI [SPAA08] If abort: CI = a CI + (1-a) with parameter a [0,1] If commit: CI = a CI If CI > parameter b then resort to central scheduler Keep a transaction queue per core [PODC08] Central dispatcher assigns transactions to a core, i.e. its queue Each core iteratively executes transactions from queue If transaction A on core 1 is aborted due to B on core 2 then A is appended to the queue of core 2 Central scheduler will become a bottleneck Johannes Schneider 7 Core 1 Core 2 A B C D Core 1 Core 2 A B C D B aborts A

This paper : 

This paper Theoretical analysis Decentralized (simple) approaches to load adaption Johannes Schneider 8

Strategies : 

Strategies Do not learn from conflicts ImmediateRestart Remember faced conflicts SerializeFacedConflicts Do not schedule prior conflicting transactions concurrently Assume many additional conflicts SerializeAll All transactions in a subgraph are assumed to conflict Johannes Schneider 9 B A D C Conflict graph A conflicted with C D conflicted with B A D C B A D C B A C B D

Load Adaption (Cont. Mang.) Strategies : 

Load Adaption (Cont. Mang.) Strategies AbortBackoff If aborted wait for a random time [0,2#aborts] Priority = number of aborts #aborts Who wins a conflict? 2 strategies Estimate the work done Unrelated to work done Johannes Schneider 10

Theory Part - Model : 

Theory Part - Model n transactions (and threads) Start concurrently on n cores Transaction sequence of operations operation takes 1 time unit duration (number of operations) tT is fixed 2 types of operations Write = modify (shared) resource and lock it until commit Compute/abort/commit Johannes Schneider 11 Core 1 Core 2 B A Core n Z … A

Moderate parallelism : 

Moderate parallelism Shared counter Conflicts directly after transaction start Linked List Conflicts at arbitrary time (Worst case) Time span until all transactions committed Johannes Schneider 12 Transaction run time #transactions

Substantial parallelism : 

Substantial parallelism Conflict graph is d-ary tree of logarithmic height Exponential gap: SerializeAll and others Johannes Schneider 13 T1 T2 T3 T4 T5 …

Practical investigation : 

Practical investigation Remembering conflicts causes too much overhead Good for analysis but not for implementation Quickadapter Serializes transactions Each core has a “waiting” flag If aborted, set flag and wait until flag unset If commit, unset some flag AbortBackOff (Also considered some variants) Johannes Schneider 14

Practical investigation : 

Practical investigation Evaluation on 16 core machine 4 x 4 quad core DSTM2 system Six benchmarks Little parallelism Shared counter, Sorted List (accessed objects not released), Listcounter Considerable parallelism Red Black Tree, LFUCache, RandomAccessArray Compare new load adaption policies to existing contention managers Johannes Schneider 15

Discussion : 

Discussion Hard to keep maximum throughput, also in [SPAA08, PODC08] Improvement for 1 benchmark worsens another On average better than schemes without load adaption 16 Johannes Schneider

Conclusion : 

Conclusion Simple and distributed load adaption strategies Theory (For now) constants matter a lot Practice Hard to keep load at peak for all usage patterns 17 Johannes Schneider

Slide 18: 

Thanks for your attention! Questions? 18 Johannes Schneider

Analysis AbortBackoff for counter : 

Analysis AbortBackoff for counter Recall: If aborted wait for a random time [0,2#aborts] Assume #aborts ~ log (ntT) + x (for some x) Define: a(x) := fraction of active nodes a(0) = 1 (after time ~2log (ntT) = ntT a constant fraction still active) Chance conflict for interval [0,2#aborts] Interval [0, 2log(ntT)+x ] ~ a(x) ntT / 2log (ntT) +x = a(x) /2x a(x+1) = a(x)/2x = 1/2∑i=0..x i ~ 1/2x2 a(√log n) = 1/2(√log n)2 = 1/n ∑i=0.. log (ntT) +√log n length interval = ∑i=0.. .. log (ntT) +√log n 2i = ntT 2√log n+1 Johannes Schneider 19 T1 T2 T3 a(x)ntT = 3/n n tT = 3tT

authorStream Live Help