mapld

Early output logic and Anti-Tokens: 

Charlie Brej
APT Group, Manchester University

Overview: 

Synchronous problems
Asynchronous logic
  Why?
  How?
Solutions
  Early output
  Anti-tokens

Problems: Communication: 

Communication horizon
  “For a 60 nanometer process a signal can reach only 5% of the die’s length in a clock cycle” [D. Matzke, 1997]
Clock distributed using wave pipelining

Problems: Performance: 

Cycle time
  Unbalanced stages
  Clock skew/jitter
  Transistor variability
  Signal integrity
  Worst-case versus average-case performance
  Real computation
  Clock overheads
  Timing assumption overheads

Clock! What is it good for?: 

No arguing with the clock
9am - 5pm. No excuses!

Bundled-Data: 

When you finish, do the next task
Flexitime
Request + Delay
Acknowledge

How do you know when you are finished?: 

Synchronous: estimate
  Global timing reference
Asynchronous (bundled-data): estimate
  Local delay elements
Asynchronous (delay-insensitive): when the data arrives
  Intrinsic

Becoming Delay Insensitive: 

Dual-Rail
  Two wires
  00 – NULL
  01 – Zero
  10 – One
  (11 – Not used)
Four-phase handshake
  Return to zero
  Wires (figure labels): R1, Ack, R0
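The encoding and handshake above can be sketched in a few lines of Python. This is only an illustration, not the circuit from the slides: it assumes the code words are written in (R1, R0) order, and the function names (`encode`, `is_valid`, `decode`, `four_phase_transfer`) are made up for the example.

```python
# Dual-rail code words as (r1, r0) pairs, following the table on the slide.
# Assumption: the first digit of each code word is the R1 wire.
NULL = (0, 0)  # 00: no data on the wires
ZERO = (0, 1)  # 01: logic zero
ONE = (1, 0)   # 10: logic one
# (1, 1) is not used.


def encode(bit):
    """Return the dual-rail code word for a Boolean value."""
    return ONE if bit else ZERO


def is_valid(word):
    """Data has arrived when exactly one rail is high."""
    return word in (ZERO, ONE)


def decode(word):
    """Recover the Boolean value, refusing NULL and the unused 11 code."""
    if not is_valid(word):
        raise ValueError(f"no data on the wires: {word}")
    return 1 if word == ONE else 0


def four_phase_transfer(bit):
    """Trace one return-to-zero handshake as (r1, r0, ack) states."""
    r1, r0 = encode(bit)
    return [
        (0, 0, 0),    # idle: NULL on the rails, ack low
        (r1, r0, 0),  # phase 1: sender raises one rail, data becomes valid
        (r1, r0, 1),  # phase 2: receiver raises ack
        (0, 0, 1),    # phase 3: sender returns the rails to NULL
        (0, 0, 0),    # phase 4: receiver lowers ack
    ]
```

Because validity is read from the data wires themselves, the receiver needs no delay estimate: it knows the data has arrived when one rail goes high.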

Early Output Logic: 

Dual-Rail interfaces
Output generated as early as possible
Two early output cases
  If either input is ‘0’ then the output is ‘0’
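As a minimal sketch of the rule above, assume the gate in question is a two-input AND (the slide does not name the gate, but the rule matches AND). `NULL` stands for an input that has not arrived yet, and both function names are hypothetical.

```python
NULL, ZERO, ONE = None, 0, 1


def and_early(a, b):
    """AND with early output: a single ZERO input decides the result."""
    if a == ZERO or b == ZERO:
        return ZERO  # early cases: the other input may still be NULL
    if a == ONE and b == ONE:
        return ONE
    return NULL      # result not yet determined


def and_strict(a, b):
    """Reference gate that waits for both inputs before producing an output."""
    if a is NULL or b is NULL:
        return NULL
    return a & b
```

With `a = ZERO` and `b` still `NULL`, `and_early` already produces `ZERO` while `and_strict` is still waiting. The two gates agree whenever both inputs are present.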

Bit level pipelining: 

Forward completed parts of the result
Pace work
Don’t stall parts unless you have to

Early Output cases: 

[Figure: early output cases]

Validity: 

Unnecessary late inputs
  Must be acknowledged
  Must wait until they arrive
Validity signal
  Latch generated
  Ready to be acknowledged
Result before all inputs present
Acknowledge after all inputs present
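The split between "result" and "validity" can be illustrated with a small model. It is a simplification under stated assumptions: the stage is taken to be a two-input AND, validity is computed directly rather than generated by a latch as on the slide, and `and_stage` is a hypothetical name.

```python
NULL = None  # an input that has not arrived yet


def and_stage(a, b):
    """Return (result, validity) for an early-output AND stage.

    result   -- available as soon as the inputs seen so far determine it
    validity -- true only once every input has arrived, so that all of
                them can be acknowledged
    """
    if a == 0 or b == 0:
        result = 0
    elif a == 1 and b == 1:
        result = 1
    else:
        result = NULL
    validity = a is not NULL and b is not NULL
    return result, validity
```

For `and_stage(0, NULL)` the result is already `0` but validity is still false: the output can move forward, yet the stage cannot acknowledge until the late input shows up.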

Synchronisation Hurts: 

No need to wait before generating result
Need to wait for input in order to acknowledge it
Unnecessary stall

Anti-Tokens: 

Unnecessary late inputs
  Stall the entire stage
Proactive approach
  Send a ‘cancel’ signal backward to the source
  Acknowledge before data arrives
Anti-Token latches
  Assert validity early
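A behavioural sketch of the idea, not the latch design from the talk: the `InputPort` class and `and_with_anti_tokens` function are hypothetical, the stage is assumed to be a two-input AND, and the model is sequential, so it says nothing about how a real circuit resolves a token and an anti-token arriving at the same moment.

```python
class InputPort:
    """Hypothetical input port holding a token, an anti-token, or nothing."""

    def __init__(self):
        self.token = None        # data waiting to be consumed
        self.anti_token = False  # pending cancel travelling toward the source

    def cancel(self):
        """The consumer no longer needs this input."""
        if self.token is not None:
            self.token = None       # data already here: discard it
        else:
            self.anti_token = True  # data still on its way: send a cancel backward

    def arrive(self, value):
        """A token arrives from the source. Returns True if it was kept."""
        if self.anti_token:
            self.anti_token = False  # token and anti-token cancel each other
            return False
        self.token = value
        return True


def and_with_anti_tokens(a_port, b_port):
    """Fire as soon as one input is 0, cancelling the other instead of waiting."""
    for port, other in ((a_port, b_port), (b_port, a_port)):
        if port.token == 0:
            port.token = None  # consume the deciding input
            other.cancel()     # the other input is unnecessary
            return 0
    if a_port.token == 1 and b_port.token == 1:
        a_port.token = b_port.token = None
        return 1
    return None  # not enough information yet
```

For example, if `0` arrives on the first port while the second is still empty, the stage outputs `0` immediately and leaves an anti-token on the second port. When the late token finally arrives there, `arrive` drops it and the port is clean for the next cycle.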

Anti-token generation: 

[Figure: anti-token generation. Labels: 0, 1, C, A]

Anti-token Propagation: 

[Figure: anti-token propagation. Labels: 1, C, A]

Anti-token Token collisions: 

[Figure: anti-token and token collisions. Labels: 1, A, ?]

Remove Unnecessary computation: 

Cycle time
  Unbalanced stages
  Clock skew/jitter
  Transistor variability
  Signal integrity
  Worst-case versus average-case performance
  Real computation
  Clock overheads
  Timing assumption overheads
  Unnecessary computation/delays

Summary: 

Asynchronous
  Delay insensitive
  Safe
  No timing assumptions
  Average case performance
Remove unnecessary computation
Anti-tokens without mutual exclusion units