logging in or signing up lecture26 fall 2003 Abbott Download Post to : URL : Related Presentations : Share Add to Flag Embed Email Send to Blogs and Networks Add to Channel Uploaded from authorPOINTLite Insert YouTube videos in PowerPont slides with aS Desktop Copy embed code: (To copy code, click on the text box) Embed: URL: Thumbnail: WordPress Embed Customize Embed The presentation is successfully added In Your Favorites. Views: 120 Category: Entertainment License: All Rights Reserved Like it (0) Dislike it (0) Added: November 19, 2007 This Presentation is Public Favorites: 0 Presentation Description No description available. Comments Posting comment... Premium member Presentation Transcript Lecture 26Asynchrony, Shared Objects and Concurrent Programming: Lecture 26 Asynchrony, Shared Objects and Concurrent Programming This material is not available in the textbook. The notes in the online powerpoint presentation contains much of the explanations given in class. Uniprocessor vs. Multiprocessor Machines: Uniprocessor vs. Multiprocessor Machines P1 P2 P10 P1 memory Shared memory Uniprocessor MultiprocessorConcurrent Programming: Concurrent Programming Challenge: Coordinating access to items in shared memory.Persistent vs. Transient Communication: Persistent vs. Transient Communication Persistent Communication medium: the sending of information changes the state of the medium forever. Example: Blackboard. Transient communication medium: the change of state is only for some limited time period. Example: Talking. Testing Many Primes Concurrently: Testing Many Primes Concurrently Task: Print all primes from 1 to 1010 in some order Available: A machine with 10 processors Goal: Achieve speed up factor of 10, that is, new time to print all primes should be one tenth of time for single processor. We will partition the inputs among the 10 processors. But what is the best way to do this?Testing Many Primes Concurrently: Testing Many Primes Concurrently P1 P2 P10 1 109 2x109 1010 Simplest way to split the inputs among processors: Each processor Pi gets 109 consecutive numbers to test. … …Testing Many Primes Concurrently:Code for Simplest Partition: Testing Many Primes Concurrently: Code for Simplest Partition (define (P i) (let ((counter (+ 1 (* (- i 1) (power 10 9)))) (upto (* i (power 10 9)))) (define (iter) (if (< counter upto) (begin (if (prime? counter) (display counter) #f) (increment-counter) (iter)) 'done)) (iter))) (parallel-execute (P 1) (P 2) ... (P 10))Load (Un)balancing: work is split unevenly: Load (Un)balancing: work is split unevenly Some processors have less primes to test… Some composite numbers are easier to test… P1 P2 P10 1 109 2x109 1010 Even this rather simple problem is hard to treat analytically. To achieve load balancing, we need to split the work range dynamically!A Shared Counter Object: A Shared Counter Object (define (make-shared-counter value) (define (fetch) value) (define (increment) (set! value (+ 1 value)) (define (dispatch m) (cond (((eq? m 'fetch) (fetch)) (eq? m 'increment) (increment)) (else (error “unknown request”)))) dispatch) (define shared-counter (make-shared-counter 1)) This shared counter object can be accessed by two operations: “fetch” (read) and “increment” (restricted form of write). Using the Shared Counter: Using the Shared Counter (define (P i) (define (iter) (let ((index (shared-counter 'fetch))) (if (< index (power 10 10)) (begin (if (prime? index) (display index) #f) (shared-counter 'increment) (iter)) 'done)) (iter))) (parallel-execute (P 1) (P 2) ... (P 10))This Solution Doesn’t Work: This Solution Doesn’t WorkIt Will Never Work!? : It Will Never Work!? Real World Example: Walking in the Street Look / Move Fetch / Increment Computers: Accessing Memory Fischer Lynch & Paterson: Impossible to solve!!! Need to “glue” the Fetch / Increment pair into one indivisible operation: Fetch-and-IncrementThe Fetch-and-Increment Operation: The Fetch-and-Increment Operation (define (make-shared-counter value) (define (fetch-and-increment) (let ((old value)) (set! value (+ old 1)) old)) (define (dispatch m) (cond (((eq? m 'fetch-and-increment) (fetch-and-increment)) (else (error ``unknown request -- counter'' m)))) dispatch) Instantaneous Shared Counter Fetch-and-incA Correct Shared Counter: A Correct Shared Counter (define shared-counter (make-shared-counter 1)) (define (P i) (define (iter) (let ((index (shared-counter 'fetch-and-increment))) (if (< index (power 10 10)) (begin (if (prime? index) (display index) #f) (iter)) 'done)) (iter))) (parallel-execute (P 1) (P 2) ... (P 10)) Implementing Fetch-and-Inc: Implementing Fetch-and-Inc To make the program work we need an “instantaneous” implementation of fetch-and-increment. How can we do this: Special Hardware. Built-in synchronization instructions. Special Software. Use regular instructions -- the solution will involve waiting. Software: Mutual Exclusion Mutual Exclusion: Mutual Exclusion (mutex 'start) (let ((old value)) (set! value (+ old 1)) old) (mutex 'end)) Only one process at a time can execute these instructions P1 P2 P10 1 1 P2 returns 1 Mutex countThe Story of Alice and Bob(these Alice and Bob are unrelated to Rivest, Shamir and Adelman, the new recipients of the Turning award) : The Story of Alice and Bob (these Alice and Bob are unrelated to Rivest, Shamir and Adelman, the new recipients of the Turning award) * As told by Leslie LamportThe Mutual Exclusion Problem: The Mutual Exclusion Problem Requirements: Mutual Exclusion: there will never be two dogs simultaneously in the yard. No Deadlock: if only one dog wants to be in the yard it will succeed, and if both dogs want to go out, at least one of them will succeed.Cell Phone Solution: Cell Phone SolutionCoke Can Solution: Coke Can SolutionFlag Solution -- Alice: Flag Solution -- AliceFlag Solution -- Bob: Flag Solution -- BobFlag Solution -- Both: Flag Solution -- BothIntuition: Why Mutual Exclusion is Preserved: Intuition: Why Mutual Exclusion is Preserved Each perform: First raise the flag, to signal interest. Then look to see if the other one has raised the flag. One can claim that the following flag principle holds: since Alice and Bob each raise their own flag and then look at the others flag, the last one to start looking must notice that both flags are up. Why is there no Deadlock?: Why is there no Deadlock? Since Alice has priority over Bob…if neither is entering the critical section, both are repeatedly trying, and Bob will give Alice priority. Unfortunately, the algorithm is not a fair one, and Bob's dogs might eventually grow very anxious :-)The Morals of our Story: The Morals of our Story The Mutual Exclusion problem cannot be solved using transient communication. (I.e. Cell-phones.) The Mutual Exclusion problem cannot be solved using interrupts or interrupt bits (I.e. Cans) The Mutual Exclusion problem can be solved with one bit registers (I.e. Flags), memory locations that can be read and written (set!-ed). We cheated a little: the arbiter problem… While moving, is the flag really up or down?The Solution and Conclusion: The Solution and Conclusion (define (Alice) (loop (mutex 'begin) (Alice-dog-in-yard) ;; critical section (mutex 'end) )) Question: then why not execute all the code of the parallel prime-printing algorithm in a critical section? Answer: Amdahl’s Law: Answer: Amdahl’s Law Speedup Overall = Execution_time_Old / Execution_time_New = 1 Fraction_Enhanced Speedup_Enhanced 1- Fraction_Enhanced + If 40% of the execution time can be sped-up by a factor of 10 using 10 processors. Then Overall Speedup = 1/(0.6 + (0.4/10)) = 1/0.64 = 1.56 Length of Critical Sections : Length of Critical Sections We want Linear Speedup: 100 processors = 100 times faster By Amdahl's law that means that Fraction_Enhanced = 1 No Sequential Parts!!!!! For a speedup of only 99 times: 1-Fraction_Enhanced= 0.0001 In other words, we must try and make the parts that are sequential (the critical sections) as short as possible!!!!! Summarizing it all: Summarizing it all To summarize it all, our world is asynchronous, and yet with a bit of luck we have ways of overcoming this asynchrony to create powerful concurrent algorithms. “Life is the synchronicity of chance” You do not have the permission to view this presentation. In order to view it, please contact the author of the presentation.
lecture26 fall 2003 Abbott Download Post to : URL : Related Presentations : Share Add to Flag Embed Email Send to Blogs and Networks Add to Channel Uploaded from authorPOINTLite Insert YouTube videos in PowerPont slides with aS Desktop Copy embed code: (To copy code, click on the text box) Embed: URL: Thumbnail: WordPress Embed Customize Embed The presentation is successfully added In Your Favorites. Views: 120 Category: Entertainment License: All Rights Reserved Like it (0) Dislike it (0) Added: November 19, 2007 This Presentation is Public Favorites: 0 Presentation Description No description available. Comments Posting comment... Premium member Presentation Transcript Lecture 26Asynchrony, Shared Objects and Concurrent Programming: Lecture 26 Asynchrony, Shared Objects and Concurrent Programming This material is not available in the textbook. The notes in the online powerpoint presentation contains much of the explanations given in class. Uniprocessor vs. Multiprocessor Machines: Uniprocessor vs. Multiprocessor Machines P1 P2 P10 P1 memory Shared memory Uniprocessor MultiprocessorConcurrent Programming: Concurrent Programming Challenge: Coordinating access to items in shared memory.Persistent vs. Transient Communication: Persistent vs. Transient Communication Persistent Communication medium: the sending of information changes the state of the medium forever. Example: Blackboard. Transient communication medium: the change of state is only for some limited time period. Example: Talking. Testing Many Primes Concurrently: Testing Many Primes Concurrently Task: Print all primes from 1 to 1010 in some order Available: A machine with 10 processors Goal: Achieve speed up factor of 10, that is, new time to print all primes should be one tenth of time for single processor. We will partition the inputs among the 10 processors. But what is the best way to do this?Testing Many Primes Concurrently: Testing Many Primes Concurrently P1 P2 P10 1 109 2x109 1010 Simplest way to split the inputs among processors: Each processor Pi gets 109 consecutive numbers to test. … …Testing Many Primes Concurrently:Code for Simplest Partition: Testing Many Primes Concurrently: Code for Simplest Partition (define (P i) (let ((counter (+ 1 (* (- i 1) (power 10 9)))) (upto (* i (power 10 9)))) (define (iter) (if (< counter upto) (begin (if (prime? counter) (display counter) #f) (increment-counter) (iter)) 'done)) (iter))) (parallel-execute (P 1) (P 2) ... (P 10))Load (Un)balancing: work is split unevenly: Load (Un)balancing: work is split unevenly Some processors have less primes to test… Some composite numbers are easier to test… P1 P2 P10 1 109 2x109 1010 Even this rather simple problem is hard to treat analytically. To achieve load balancing, we need to split the work range dynamically!A Shared Counter Object: A Shared Counter Object (define (make-shared-counter value) (define (fetch) value) (define (increment) (set! value (+ 1 value)) (define (dispatch m) (cond (((eq? m 'fetch) (fetch)) (eq? m 'increment) (increment)) (else (error “unknown request”)))) dispatch) (define shared-counter (make-shared-counter 1)) This shared counter object can be accessed by two operations: “fetch” (read) and “increment” (restricted form of write). Using the Shared Counter: Using the Shared Counter (define (P i) (define (iter) (let ((index (shared-counter 'fetch))) (if (< index (power 10 10)) (begin (if (prime? index) (display index) #f) (shared-counter 'increment) (iter)) 'done)) (iter))) (parallel-execute (P 1) (P 2) ... (P 10))This Solution Doesn’t Work: This Solution Doesn’t WorkIt Will Never Work!? : It Will Never Work!? Real World Example: Walking in the Street Look / Move Fetch / Increment Computers: Accessing Memory Fischer Lynch & Paterson: Impossible to solve!!! Need to “glue” the Fetch / Increment pair into one indivisible operation: Fetch-and-IncrementThe Fetch-and-Increment Operation: The Fetch-and-Increment Operation (define (make-shared-counter value) (define (fetch-and-increment) (let ((old value)) (set! value (+ old 1)) old)) (define (dispatch m) (cond (((eq? m 'fetch-and-increment) (fetch-and-increment)) (else (error ``unknown request -- counter'' m)))) dispatch) Instantaneous Shared Counter Fetch-and-incA Correct Shared Counter: A Correct Shared Counter (define shared-counter (make-shared-counter 1)) (define (P i) (define (iter) (let ((index (shared-counter 'fetch-and-increment))) (if (< index (power 10 10)) (begin (if (prime? index) (display index) #f) (iter)) 'done)) (iter))) (parallel-execute (P 1) (P 2) ... (P 10)) Implementing Fetch-and-Inc: Implementing Fetch-and-Inc To make the program work we need an “instantaneous” implementation of fetch-and-increment. How can we do this: Special Hardware. Built-in synchronization instructions. Special Software. Use regular instructions -- the solution will involve waiting. Software: Mutual Exclusion Mutual Exclusion: Mutual Exclusion (mutex 'start) (let ((old value)) (set! value (+ old 1)) old) (mutex 'end)) Only one process at a time can execute these instructions P1 P2 P10 1 1 P2 returns 1 Mutex countThe Story of Alice and Bob(these Alice and Bob are unrelated to Rivest, Shamir and Adelman, the new recipients of the Turning award) : The Story of Alice and Bob (these Alice and Bob are unrelated to Rivest, Shamir and Adelman, the new recipients of the Turning award) * As told by Leslie LamportThe Mutual Exclusion Problem: The Mutual Exclusion Problem Requirements: Mutual Exclusion: there will never be two dogs simultaneously in the yard. No Deadlock: if only one dog wants to be in the yard it will succeed, and if both dogs want to go out, at least one of them will succeed.Cell Phone Solution: Cell Phone SolutionCoke Can Solution: Coke Can SolutionFlag Solution -- Alice: Flag Solution -- AliceFlag Solution -- Bob: Flag Solution -- BobFlag Solution -- Both: Flag Solution -- BothIntuition: Why Mutual Exclusion is Preserved: Intuition: Why Mutual Exclusion is Preserved Each perform: First raise the flag, to signal interest. Then look to see if the other one has raised the flag. One can claim that the following flag principle holds: since Alice and Bob each raise their own flag and then look at the others flag, the last one to start looking must notice that both flags are up. Why is there no Deadlock?: Why is there no Deadlock? Since Alice has priority over Bob…if neither is entering the critical section, both are repeatedly trying, and Bob will give Alice priority. Unfortunately, the algorithm is not a fair one, and Bob's dogs might eventually grow very anxious :-)The Morals of our Story: The Morals of our Story The Mutual Exclusion problem cannot be solved using transient communication. (I.e. Cell-phones.) The Mutual Exclusion problem cannot be solved using interrupts or interrupt bits (I.e. Cans) The Mutual Exclusion problem can be solved with one bit registers (I.e. Flags), memory locations that can be read and written (set!-ed). We cheated a little: the arbiter problem… While moving, is the flag really up or down?The Solution and Conclusion: The Solution and Conclusion (define (Alice) (loop (mutex 'begin) (Alice-dog-in-yard) ;; critical section (mutex 'end) )) Question: then why not execute all the code of the parallel prime-printing algorithm in a critical section? Answer: Amdahl’s Law: Answer: Amdahl’s Law Speedup Overall = Execution_time_Old / Execution_time_New = 1 Fraction_Enhanced Speedup_Enhanced 1- Fraction_Enhanced + If 40% of the execution time can be sped-up by a factor of 10 using 10 processors. Then Overall Speedup = 1/(0.6 + (0.4/10)) = 1/0.64 = 1.56 Length of Critical Sections : Length of Critical Sections We want Linear Speedup: 100 processors = 100 times faster By Amdahl's law that means that Fraction_Enhanced = 1 No Sequential Parts!!!!! For a speedup of only 99 times: 1-Fraction_Enhanced= 0.0001 In other words, we must try and make the parts that are sequential (the critical sections) as short as possible!!!!! Summarizing it all: Summarizing it all To summarize it all, our world is asynchronous, and yet with a bit of luck we have ways of overcoming this asynchrony to create powerful concurrent algorithms. “Life is the synchronicity of chance”