logging in or signing up Transaction1 Durante Download Post to : URL : Related Presentations : Share Add to Flag Embed Email Send to Blogs and Networks Add to Channel Uploaded from authorPOINTLite Insert YouTube videos in PowerPont slides with aS Desktop Copy embed code: (To copy code, click on the text box) Embed: URL: Thumbnail: WordPress Embed Customize Embed The presentation is successfully added In Your Favorites. Views: 3609 Category: Education License: All Rights Reserved Like it (0) Dislike it (0) Added: March 21, 2008 This Presentation is Public Favorites: 1 Presentation Description No description available. Comments Posting comment... By: artiattri (19 month(s) ago) how to download transaction processing lecturer ppt.....plz allow me to download this Saving..... Post Reply Close Saving..... Edit Comment Close By: dinahafiz (26 month(s) ago) hi i need it Saving..... Post Reply Close Saving..... Edit Comment Close By: mukundugupta (26 month(s) ago) hi, i need this presentation. Saving..... Post Reply Close Saving..... Edit Comment Close By: haidy1122006 (37 month(s) ago) i wnat transaction1 presentation plz Saving..... Post Reply Close Saving..... Edit Comment Close By: dpak_r (43 month(s) ago) how to download transaction processing lecturer notes Saving..... Post Reply Close Saving..... Edit Comment Close loading.... See all Premium member Presentation Transcript Transaction Processing Concepts: Transaction Processing Concepts Chapter 17 & 18 from Elmasri and Navathe’s book Lecture NotesOutline of Lecture: Outline of Lecture Introduction to Transaction Processing Problems and Solution Concepts Properties Why Do We Need Transactions?: Why Do We Need Transactions? It’s all about fast query response time and correctness DBMS is a multi-user systems Many different requests Some against same data items Figure out how to interleave requests to shorten response time while guaranteeing correct result How does DBMS know which actions belong together? Solution: Group database operations that must be performed together into transactions Either execute all operations or noneConcurrent Transactions: Concurrent Transactions interleaved processing parallel processing t1 t2 t1 t2 time CPU1 CPU2 A B A B CPU1Terminology: Terminology A transaction T is a logical unit of database processing that includes one or more database access operations Embedded within application program Specified interactively (e.g., via SQL) Transaction boundaries: Begin/end transaction Read vs. write transactionSample Transaction (informal): Sample Transaction (informal) Example: Move $40 from checking to savings account To user, appears as one activity To database: Read balance of checking account: read( X) Read balance of savings account: read (Y) Subtract $40 from X Add $40 to Y Write new value of X back to disk Write new value of Y back to diskAnother Sample Transaction: Another Sample Transaction Reserving a seat for a flight If concurrent access to data in DBMS, two users may try to book the same seat simultaneously Agent 1 finds seat 35G empty Agent 2 finds seat 35G empty time Agent 1 sets seat 35G occupied Agent 2 sets seat 35G occupiedTerminology: Terminology Basic access operations: read_item(X) write_item(X) How are read_item or write_item implemented? Read-set of T: all items that transaction reads Write-set of T: all items that transaction writesSample Transaction (Formal): Sample Transaction (Formal) T1 read_item(X); read_item(Y); X:=X-40; Y:=Y+40; write _item(X); write_item(Y); t0 tkWhat Can Go Wrong?: What Can Go Wrong? Several problems can occur when concurrent transactions execute without control Illustrate several scenarios on the next slidesLost Update Problem : Lost Update Problem T1 read_item(X); X:=X-N; write_item(X); read_item(Y); Y:=Y+N; write_item(Y); T2 read_item(X); X:=X+M; write_item(X); timeTemporary Update (Dirty Read) : Temporary Update (Dirty Read) T1 read_item(X); X:=X-N; write_item(X); read_item(Y); T1 fails and aborts T2 read_item(X); X:=X+M; write_item(X); timeIncorrect Summary Problem : Incorrect Summary Problem T1 read_item(X); X:=X-N; write_item(X); read_item(Y); Y=Y+N Write_item(Y) T2 sum:=0; read_item(A); sum:=sum+A; read_item(X); sum:=sum+X; read_item(Y); sum:=sum+Y timeWhat Can Go Wrong?: What Can Go Wrong? System may crash before data is written back to disk Problem of atomicity Some other transaction is modifying shared data while our transaction is ongoing (or vice versa) Problem of serialization and isolation System may not be able to obtain one or more of the data items System may not be able to write one or more of the data items Also problems of atomicity DBMS has a Concurrency Control subsytem to assure database remains in consistent state despite concurrent execution of transactionsOther Problems: Other Problems System failures may occur Types of failures: System crash Transaction or system error Local errors Concurrency control enforcement Disk failure Physical failures DBMS has a Recovery Subsystem to protect database against system failuresTransaction States: Transaction States BEGIN_TRANSACTION: marks start of transaction READ or WRITE: two possible operations on the data END_TRANSACTION: marks the end of the read or write operations; start checking whether everything went according to plan COMIT_TRANSACTION: signals successful end of transaction; changes can be “committed” to DB ROLLBACK (or ABORT): signals unsuccessful end of transaction, changes applied to DB must be undoneState Diagram: State Diagram Active Partially Committed Committed Terminated Failed BEGIN TRANSACTION READ, WRITE END TRANSACTION COMMIT ABORT ABORTSystem Log: System Log Remember, DBMS must assure that we don’t loose information due to system crashes i.e., how do we recover from failure? Keep system log Kept on disk, backed up periodically Record every action [start_transaction, T] [write_item, T, X, old, new] [read_item, T, X] [commit, T] [abort, T]How is the Log File Used?: How is the Log File Used? All permanent changes to data are recorded Possible to undo changes to data After crash, search log backwards until find last commit point Know that beyond this point, effects of transaction are permanently recorded Need to either redo or undo everything that happened since last commit point Undo: When transaction only partially completed (before crash) Redo: Transaction completed but we are unsure whether data was written to diskACID Properties of Transactions: ACID Properties of Transactions Atomicity: Transaction is either performed in its entirety or not performed at all Task of the recovery subsystem to enforce atomicity Consistency preservation: Transaction must take the database from one consistent state to another Users/DBMS: enforce implicit and explicit constraints Isolation: Transaction should appear as though it is being executed in isolation from other transaction Enforced by concurrency control subsystem Durability: Changes applied to the database by a committed transaction must persist Enforced by recovery subsystemTransaction Schedules: Transaction Schedules Lecture NotesOutline of Lecture: Outline of Lecture Transaction Schedules Recoverable Schedules Serializability Definition Test Transaction Schedule: Transaction Schedule Recall: Multiple transactions can be executed concurrently by interleaving their operations Ordering of execution of operations from various transactions T1, T2, … , Tn is called a schedule S Constraint: For each transaction Ti, the order in which operations occur in S must the same as in Ti Only interested in read (r), write (w), commit (c), abort (a)Sample Schedule: Sample Schedule T1: r(X); w(X); r(Y); w(Y); c T2: r(X); w(X); c Sample schedule: S: r1(X); r2(X); w1(X); r1(Y); w2(X); w1(Y); c1; c2 Conflicts: Conflicts Two operations conflict if they satisfy ALL three conditions: they belong to different transactions AND they access the same item AND at least one is a write_item()operation Ex.: S: r1(X); r2(X); w1(X); r1(Y); w2(X); w1(Y); conflictsWhy Do We Interleave Transactions?: Why Do We Interleave Transactions? T1 read_item(X); X:=X-N; write_item(X); read_item(Y); Y:=Y+N; write_item(Y); T2 read_item(X): X:=X+M; write_item(X); Could be a long wait S is a serial schedule – no interleaving! Schedule SSerial Schedule: Serial Schedule If we consider transactions to be independent, serial schedule is correct Based on C property in ACID above assumption is valid Furthermore, it does not matter which transaction is executed first, as long as every transaction is executed in its entirety, from beginning to end Assume X=90, Y=90, N=3, M=2, then result of schedule S is X=89 and Y= 93 Same result if we start with T2 Better?: Better? T1 read_item(X); X:=X-N; write_item(X); read_item(Y); Y:=Y+N; write_item(Y); T2 read_item(X): X:=X+M; write_item(X); S’ is a non-serial schedule T2 will be done faster but is the result correct? Schedule S’Concurrent Executions: Concurrent Executions Serial execution is by far simplest method to execute transactions No extra work ensuring consistency Inefficient! Reasons for concurrency: Increased throughput Reduces average response time Need concept of correct concurrent execution Using same X, Y, N, M values as before, result of S is X=92 and Y=93 (not correct)Better?: Better? T1 read_item(X); X:=X-N; write_item(X); read_item(Y); Y:=Y+N; write_item(Y); T2 read_item(X): X:=X+M; write_item(X); S” is a non-serial schedule Produces same result as serial schedule S Schedule S”Serializability: Serializability Assumption: Every serial schedule is correct Goal: Like to find non-serial schedules which are also correct A schedule S of n transactions is serializable if it is equivalent to some serial schedule of the same n transactions When are two schedules equivalent? Option 1: They lead to same result (result equivalent) Option 2: The order of any two conflicting operations is the same (conflict equivalent)Result Equivalent Schedules: Result Equivalent Schedules Two schedules are result equivalent if they produce the same final state of the database Problem: May produce same result by accident! S1 read_item(X); X:=X+10; write_item(X); S2 read_item(X); X:=X*1.1; write_item(X); Schedules S1 and S2 are result equivalent for X=100 but not in generalConflict Equivalent Schedules: Conflict Equivalent Schedules Two schedules are conflict equivalent, if the order of any two conflicting operations is the same in both schedulesConflict Equivalence: Conflict Equivalence T1 read_item(A); write_item(A); read_item(B); write_item(B); T2 read_item(A): write_item(A); read_item(B); write_item(B); order matters order doesn’t matter order matters Serial Schedule S1 order doesn’t matterConflict Equivalence: Conflict Equivalence T1 read_item(A); read_item(B); write_item(A); write_item(B); T2 read_item(A): write_item(A); read_item(B); write_item(B); S1 and S1’ are conflict equivalent (S1’ produces the same result as S1) Schedule S1’ same order as in S1 same order as in S1Conflict Equivalence: Conflict Equivalence T1 read_item(A); write_item(A); read_item(B); write_item(B); T2 read_item(A): write_item(A); read_item(B); write_item(B); Schedule S1’’ is not conflict equivalent to S1 (produces a different result than S1) Schedule S1’’ different order as in S1 different order as in S1Conflict Serializable: Conflict Serializable Schedule S is conflict serializable if it is conflict equivalent to some serial schedule S’ Can reorder the non-conflicting operations to improve efficiency Non-conflicting operations: Reads and writes from same transaction Reads from different transactions Reads and writes from different transactions on different data items Conflicting operations: Reads and writes from different transactions on same data itemExample: Example T1 read_item(X); X:=X-N; write_item(X); read_item(Y); Y:=Y+N; write_item(Y); T2 read_item(X); X:=X+M; write_item(X); T1 read_item(X); X:=X-N; write_item(X); read_item(Y); Y:=Y+N; write_item(Y); T2 read_item(X); X:=X+M; write_item(X); Schedule A Schedule B B is conflict equivalent to A B is serializableTest for Serializability: Test for Serializability Construct a directed graph, precedence graph, G = (V, E) V: set of all transactions participating in schedule E: set of edges Ti Tj for which one of the following holds: Ti executes a write_item(X) before Tj executes read_item(X) Ti executes a read_item(X) before Tj executes write_item(X) Ti executes a write_item(X) before Tj executes write_item(X) An edge Ti Tj means that in any serial schedule equivalent to S, Ti must come before Tj If G has a cycle, than S is not conflict serializable If not, use topological sort to obtain serialiazable schedule (linear order consistent with precedence order of graph)Sample Schedule S: Sample Schedule S T1 read_item(X); write_item(X); read_item(Y); write_item(Y); T2 read_item(Z); read_item(Y); write_item(Y); read_item(X); write_item(X); T3 read_item(Y); read_item(Z); write_item(Y); write_item(Z);Precedence Graph for S: Precedence Graph for S T1 T2 T3 X,Y Y Y,Z Equivalent Serial Schedule: T3 T1 T2 (precedence order) no cycles S is serializable Characterizing Schedules based on Serializability : Characterizing Schedules based on Serializability Being serializable is not the same as being serial Being serializable implies that the schedule is a correct schedule. It will leave the database in a consistent state. The interleaving is appropriate and will result in a state as if the transactions were serially executed, yet will achieve efficiency due to concurrent execution. Characterizing Schedules based on Serializability : Characterizing Schedules based on Serializability Serializability is hard to check. Interleaving of operations occurs in an operating system through some scheduler Difficult to determine beforehand how the operations in a schedule will be interleaved.Characterizing Schedules based on Serializability : Characterizing Schedules based on Serializability Practical approach: Come up with methods (protocols) to ensure serializability. It’s not possible to determine when a schedule begins and when it ends. Hence, we reduce the problem of checking the whole schedule to checking only a committed project of the schedule (i.e. operations from only the committed transactions.) Current approach used in most DBMSs: Use of locks with two phase lockingCharacterizing Schedules based on Serializability : Characterizing Schedules based on Serializability View equivalence: A less restrictive definition of equivalence of schedules View serializability: definition of serializability based on view equivalence. A schedule is view serializable if it is view equivalent to a serial schedule. Characterizing Schedules based on Serializability : Characterizing Schedules based on Serializability Two schedules are said to be view equivalent if the following three conditions hold: The same set of transactions participates in S and S’, and S and S’ include the same operations of those transactions. For any operation Ri(X) of Ti in S, if the value of X read by the operation has been written by an operation Wj(X) of Tj (or if it is the original value of X before the schedule started), the same condition must hold for the value of X read by operation Ri(X) of Ti in S’. If the operation Wk(Y) of Tk is the last operation to write item Y in S, then Wk(Y) of Tk must also be the last operation to write item Y in S’. Characterizing Schedules based on Serializability : Characterizing Schedules based on Serializability The premise behind view equivalence: As long as each read operation of a transaction reads the result of the same write operation in both schedules, the write operations of each transaction musr produce the same results. “The view”: the read operations are said to see the the same view in both schedules. Characterizing Schedules based on Serializability: Characterizing Schedules based on Serializability Relationship between view and conflict equivalence: The two are same under constrained write assumption which assumes that if T writes X, it is constrained by the value of X it read; i.e., new X = f(old X) Conflict serializability is stricter than view serializability. With unconstrained write (or blind write), a schedule that is view serializable is not necessarily conflict serialiable. Any conflict serializable schedule is also view serializable, but not vice versa. Characterizing Schedules based on Serializability: Characterizing Schedules based on Serializability Relationship between view and conflict equivalence (cont): Consider the following schedule of three transactions T1: r1(X), w1(X); T2: w2(X); and T3: w3(X): Schedule Sa: r1(X); w2(X); w1(X); w3(X); c1; c2; c3; In Sa, the operations w2(X) and w3(X) are blind writes, since T1 and T3 do not read the value of X. Sa is view serializable, since it is view equivalent to the serial schedule T1, T2, T3. However, Sa is not conflict serializable, since it is not conflict equivalent to any serial schedule.Concurrency Control Techniques: Concurrency Control Techniques Lecture NotesOutline of Lecture: Outline of Lecture Concurrency Based on Locking Techniques Locks 2-Phase Locking Protocol Remember ACID Properties?: Remember ACID Properties? Atomicity enforced by transaction recovery subsystem Consistency enforced by application programmers and part of DBMS that enforces integrity constraints Isolation enforced by concurrency control system Durability enforced by transaction recovery subsystem Now, focus on concurrency control Ensure serializability of schedules through protocols that result in serializable schedulesQuick Recap: Quick Recap Difference between serial and serializable Real DBMS does not test for serializability Very inefficient since transactions are continuously arriving Would require a lot of undoing Solution: concurrency protocols If followed by every transaction, and if enforced by transaction processing system, guarantee serializability of schedules in which transaction appearsConcurrency Control Through Locks: Concurrency Control Through Locks Lock: variable associated with each data item Describes status of item wrt operations that can be performed on it Binary locks: Locked/unlocked Enforces mutual exclusion Multiple-mode locks: Read/write a.k.a. Shared/Exclusive Three operations read_lock(X) write_lock(X) unlock(X) Each data item can be in one of three lock statesImplementation: Implementation Maintain lock table Keep track of locked items and their locks <data item, LOCK, no_of_reads, locking_transaction> For read locks, keep track of the number of transactions that hold a read lock on an itemLocking Rules: Locking Rules T must issue read_lock(X) or write_lock(X) before any read_item(X) op is performed in T T must issue write_lock(X) before any write_item(X) op is performed in T T must issue unlock(X) after all read_item(X) and write_item(X) ops are completed in T T will not issue a read_lock(X) if it already holds a read lock or write lock on X (may be relaxed) T will not issue a write_lock(X) if it already holds a read lock or write lock on X (may be relaxed)Lock Conversions: Lock Conversions Sometimes beneficial to relax locking rules 4 and 5 Upgrade read lock on X to a write lock (by issuing a write_lock(X) ) Only possible if T is the only transaction holding a read lock on X Downgrade a write lock by issuing a read_lock(X) Must be noted in lock tableGranting of Locks: Granting of Locks Suppose T2 has read-lock on item X T1 is requesting write-lock on item X; needs to wait for T2 to release T3 requests read-lock on X; request is granted Assume shortly thereafter T2 relinquishes read-lock Continue scenario through a sequence of transactions all requesting read-lock on X T1 will never make progress T1 is said to be starvedGranting of Locks: Granting of Locks How do you avoid starvation in the presence of locks? Assume Ti requesting lock on Q Grant lock provided that No locking conflict with lock requested by Ti, OR No other transaction waiting for lock and made request before TiTwo Transactions: Two Transactions T1 read_lock(Y); read_item(Y); unlock(Y); write_lock(X); read_item(X); X:=X+Y; write_item(X); unlock(X); T2 read_lock(X); read_item(X); unlock(X); write_lock(Y); read_item(Y); Y:=X+Y; write_item(Y); unlock(Y); Let’s assume serial schedule S1: T1;T2 Initial values: X=20, Y=30 Result: X=50, Y=80 Locks Alone Don’t Do the Trick!: Locks Alone Don’t Do the Trick! T1 read_lock(Y); read_item(Y); unlock(Y); write_lock(X); read_item(X); X:=X+Y; write_item(X); unlock(X); T2 read_lock(X); read_item(X); unlock(X); write_lock(Y); read_item(Y); Y:=X+Y; write_item(Y); unlock(Y); Non-serializable! Result: X=50, Y=50 unlocked too early! Let’s run T1 and T2 in interleafed fashion Schedule STwo-Phase Locking (2PL): Two-Phase Locking (2PL) Def.: Transaction is said to follow the two-phase-locking protocol if all locking operations precede the first unlock operation Expanding (growing) = first phase Shrinking = second phase During the shrinking phase no new locks can be acquired! Downgrading ok Upgrading is not Example: Example T1’ read_lock(Y); read_item(Y); write_lock(X); unlock(Y); read_item(X); X:=X+Y; write_item(X); unlock(X); T2’ read_lock(X); read_item(X); write_lock(Y); unlock(X); read_item(Y); Y:=X+Y; write_item(Y); unlock(Y); Both T1’ and T2’ follow the 2PL protocol Any schedule including T1’ and T2’ is guaranteed to be serializable Limits the amount of concurrency Variations to the Basic Protocol: Variations to the Basic Protocol Previous technique knows as basic 2PL Conservative 2PL (static) 2PL: Lock all items needed BEFORE execution begins by predeclaring its read and write set If any of the items in read or write set is already locked (by other transactions), transaction waits (does not acquire any locks) Deadlock free but not very realisticVariations to the Basic Protocol: Variations to the Basic Protocol Strict 2PL: Transaction does not release its write locks until AFTER it aborts/commits Not deadlock free but guarantees recoverable schedules (strict schedule: transaction can neither read/write X until last transaction that wrote X has committed/aborted) Most popular variation of 2PLVariations to the Basic Protocol: Variations to the Basic Protocol Rigorous 2PL: No lock is released until after abort/commit Transaction is in its expanding phase until it endsConcluding Remarks: Concluding Remarks Concurrency control subsystem is responsible for inserting locks at right places into your transaction Strict 2PL is widely used Requires use of waiting queue All 2PL locking protocols guarantee serializability Does not permit all possible serial schedules Conservative and rigorous 2PL charge a high price for serializability However, deadlock-based algorithms may suffer from starvation and deadlock (see next lecture)Concurrency Control Techniques: Concurrency Control Techniques Lecture NotesOutline of Lecture: Outline of Lecture Two Problems with Locks Deadlock Starvation Concurrency Control Based on Timestamp Ordering What is Deadlock?: What is Deadlock? Occurs when each transaction Ti in a set of two or more is waiting on an item locked by some other transaction Tj in the set T1 read_lock(Y); read_item(Y); write_lock(X); T2 read_lock(X); read_item(X); write_lock(Y); T1 T2 SDeadlock Prevention: Deadlock Prevention Locking as deadlock prevention leads to very inefficient schedules (e.g., conservative 2PL) Better, use transaction timestamp TS(T) TS is unique identifier assigned to each transaction if T1 starts before T2, then TS(T1) < TS(T2) (older has smaller timestamp value) Wait-die and wound-wait schemesWait-Die Scheme: Wait-Die Scheme Assume Ti tries to lock X which is locked by Tj If TS(Ti) < TS(Tj) (Ti older than Tj), then Ti is allowed to wait Otherwise, Ti younger than Tj, abort Ti (Ti dies) and restart later with SAME timestamp Older transaction is allowed to wait on younger transaction Younger transaction requesting an item held by older transaction is aborted and restartedWound-Wait Scheme: Wound-Wait Scheme Assume Ti tries to lock X which is locked by Tj If TS(Ti) < TS(Tj) (Ti older than Tj), abort Tj (Ti wounds Tj) and restart later with SAME timestamp Otherwise, Ti younger than Tj, Ti is allowed to wait Younger transaction is allowed to wait on older transaction Older transaction requesting item held by younger transaction preempts younger one by aborting it Both schemes abort younger transaction that may be involved in deadlock Both deadlock free but may cause needless abortsMore Deadlock Prevention: More Deadlock Prevention Waiting schemes (require no timestamps) No waiting: if transaction cannot obtain lock, aborted immediately and restarted after time t needless restarts Cautious waiting: Suppose Ti tries to lock item X which is locked by Tj If Tj is not blocked, Ti is blocked and allowed to wait O.w. abort Ti Cautious waiting is deadlock-freeDeadlock Detection: Deadlock Detection DBMS checks if deadlock has occurred Works well if few short transactions with little interference O.w., use deadlock prevention Two approaches to deadlock detection: Wait-for graph If cycle, abort one of the transactions (victim selection) TimeoutsStarvation: Starvation Transaction cannot continue for indefinite amount of time while others proceed normally When? Unfair waiting scheme with priorities for certain transactions E.g., in deadlock detection, if we choose victim always based on cost factors, same transaction may always be picked as victim Include rollbacks in cost factor You do not have the permission to view this presentation. In order to view it, please contact the author of the presentation.
Transaction1 Durante Download Post to : URL : Related Presentations : Share Add to Flag Embed Email Send to Blogs and Networks Add to Channel Uploaded from authorPOINTLite Insert YouTube videos in PowerPont slides with aS Desktop Copy embed code: (To copy code, click on the text box) Embed: URL: Thumbnail: WordPress Embed Customize Embed The presentation is successfully added In Your Favorites. Views: 3609 Category: Education License: All Rights Reserved Like it (0) Dislike it (0) Added: March 21, 2008 This Presentation is Public Favorites: 1 Presentation Description No description available. Comments Posting comment... By: artiattri (19 month(s) ago) how to download transaction processing lecturer ppt.....plz allow me to download this Saving..... Post Reply Close Saving..... Edit Comment Close By: dinahafiz (26 month(s) ago) hi i need it Saving..... Post Reply Close Saving..... Edit Comment Close By: mukundugupta (26 month(s) ago) hi, i need this presentation. Saving..... Post Reply Close Saving..... Edit Comment Close By: haidy1122006 (37 month(s) ago) i wnat transaction1 presentation plz Saving..... Post Reply Close Saving..... Edit Comment Close By: dpak_r (43 month(s) ago) how to download transaction processing lecturer notes Saving..... Post Reply Close Saving..... Edit Comment Close loading.... See all Premium member Presentation Transcript Transaction Processing Concepts: Transaction Processing Concepts Chapter 17 & 18 from Elmasri and Navathe’s book Lecture NotesOutline of Lecture: Outline of Lecture Introduction to Transaction Processing Problems and Solution Concepts Properties Why Do We Need Transactions?: Why Do We Need Transactions? It’s all about fast query response time and correctness DBMS is a multi-user systems Many different requests Some against same data items Figure out how to interleave requests to shorten response time while guaranteeing correct result How does DBMS know which actions belong together? Solution: Group database operations that must be performed together into transactions Either execute all operations or noneConcurrent Transactions: Concurrent Transactions interleaved processing parallel processing t1 t2 t1 t2 time CPU1 CPU2 A B A B CPU1Terminology: Terminology A transaction T is a logical unit of database processing that includes one or more database access operations Embedded within application program Specified interactively (e.g., via SQL) Transaction boundaries: Begin/end transaction Read vs. write transactionSample Transaction (informal): Sample Transaction (informal) Example: Move $40 from checking to savings account To user, appears as one activity To database: Read balance of checking account: read( X) Read balance of savings account: read (Y) Subtract $40 from X Add $40 to Y Write new value of X back to disk Write new value of Y back to diskAnother Sample Transaction: Another Sample Transaction Reserving a seat for a flight If concurrent access to data in DBMS, two users may try to book the same seat simultaneously Agent 1 finds seat 35G empty Agent 2 finds seat 35G empty time Agent 1 sets seat 35G occupied Agent 2 sets seat 35G occupiedTerminology: Terminology Basic access operations: read_item(X) write_item(X) How are read_item or write_item implemented? Read-set of T: all items that transaction reads Write-set of T: all items that transaction writesSample Transaction (Formal): Sample Transaction (Formal) T1 read_item(X); read_item(Y); X:=X-40; Y:=Y+40; write _item(X); write_item(Y); t0 tkWhat Can Go Wrong?: What Can Go Wrong? Several problems can occur when concurrent transactions execute without control Illustrate several scenarios on the next slidesLost Update Problem : Lost Update Problem T1 read_item(X); X:=X-N; write_item(X); read_item(Y); Y:=Y+N; write_item(Y); T2 read_item(X); X:=X+M; write_item(X); timeTemporary Update (Dirty Read) : Temporary Update (Dirty Read) T1 read_item(X); X:=X-N; write_item(X); read_item(Y); T1 fails and aborts T2 read_item(X); X:=X+M; write_item(X); timeIncorrect Summary Problem : Incorrect Summary Problem T1 read_item(X); X:=X-N; write_item(X); read_item(Y); Y=Y+N Write_item(Y) T2 sum:=0; read_item(A); sum:=sum+A; read_item(X); sum:=sum+X; read_item(Y); sum:=sum+Y timeWhat Can Go Wrong?: What Can Go Wrong? System may crash before data is written back to disk Problem of atomicity Some other transaction is modifying shared data while our transaction is ongoing (or vice versa) Problem of serialization and isolation System may not be able to obtain one or more of the data items System may not be able to write one or more of the data items Also problems of atomicity DBMS has a Concurrency Control subsytem to assure database remains in consistent state despite concurrent execution of transactionsOther Problems: Other Problems System failures may occur Types of failures: System crash Transaction or system error Local errors Concurrency control enforcement Disk failure Physical failures DBMS has a Recovery Subsystem to protect database against system failuresTransaction States: Transaction States BEGIN_TRANSACTION: marks start of transaction READ or WRITE: two possible operations on the data END_TRANSACTION: marks the end of the read or write operations; start checking whether everything went according to plan COMIT_TRANSACTION: signals successful end of transaction; changes can be “committed” to DB ROLLBACK (or ABORT): signals unsuccessful end of transaction, changes applied to DB must be undoneState Diagram: State Diagram Active Partially Committed Committed Terminated Failed BEGIN TRANSACTION READ, WRITE END TRANSACTION COMMIT ABORT ABORTSystem Log: System Log Remember, DBMS must assure that we don’t loose information due to system crashes i.e., how do we recover from failure? Keep system log Kept on disk, backed up periodically Record every action [start_transaction, T] [write_item, T, X, old, new] [read_item, T, X] [commit, T] [abort, T]How is the Log File Used?: How is the Log File Used? All permanent changes to data are recorded Possible to undo changes to data After crash, search log backwards until find last commit point Know that beyond this point, effects of transaction are permanently recorded Need to either redo or undo everything that happened since last commit point Undo: When transaction only partially completed (before crash) Redo: Transaction completed but we are unsure whether data was written to diskACID Properties of Transactions: ACID Properties of Transactions Atomicity: Transaction is either performed in its entirety or not performed at all Task of the recovery subsystem to enforce atomicity Consistency preservation: Transaction must take the database from one consistent state to another Users/DBMS: enforce implicit and explicit constraints Isolation: Transaction should appear as though it is being executed in isolation from other transaction Enforced by concurrency control subsystem Durability: Changes applied to the database by a committed transaction must persist Enforced by recovery subsystemTransaction Schedules: Transaction Schedules Lecture NotesOutline of Lecture: Outline of Lecture Transaction Schedules Recoverable Schedules Serializability Definition Test Transaction Schedule: Transaction Schedule Recall: Multiple transactions can be executed concurrently by interleaving their operations Ordering of execution of operations from various transactions T1, T2, … , Tn is called a schedule S Constraint: For each transaction Ti, the order in which operations occur in S must the same as in Ti Only interested in read (r), write (w), commit (c), abort (a)Sample Schedule: Sample Schedule T1: r(X); w(X); r(Y); w(Y); c T2: r(X); w(X); c Sample schedule: S: r1(X); r2(X); w1(X); r1(Y); w2(X); w1(Y); c1; c2 Conflicts: Conflicts Two operations conflict if they satisfy ALL three conditions: they belong to different transactions AND they access the same item AND at least one is a write_item()operation Ex.: S: r1(X); r2(X); w1(X); r1(Y); w2(X); w1(Y); conflictsWhy Do We Interleave Transactions?: Why Do We Interleave Transactions? T1 read_item(X); X:=X-N; write_item(X); read_item(Y); Y:=Y+N; write_item(Y); T2 read_item(X): X:=X+M; write_item(X); Could be a long wait S is a serial schedule – no interleaving! Schedule SSerial Schedule: Serial Schedule If we consider transactions to be independent, serial schedule is correct Based on C property in ACID above assumption is valid Furthermore, it does not matter which transaction is executed first, as long as every transaction is executed in its entirety, from beginning to end Assume X=90, Y=90, N=3, M=2, then result of schedule S is X=89 and Y= 93 Same result if we start with T2 Better?: Better? T1 read_item(X); X:=X-N; write_item(X); read_item(Y); Y:=Y+N; write_item(Y); T2 read_item(X): X:=X+M; write_item(X); S’ is a non-serial schedule T2 will be done faster but is the result correct? Schedule S’Concurrent Executions: Concurrent Executions Serial execution is by far simplest method to execute transactions No extra work ensuring consistency Inefficient! Reasons for concurrency: Increased throughput Reduces average response time Need concept of correct concurrent execution Using same X, Y, N, M values as before, result of S is X=92 and Y=93 (not correct)Better?: Better? T1 read_item(X); X:=X-N; write_item(X); read_item(Y); Y:=Y+N; write_item(Y); T2 read_item(X): X:=X+M; write_item(X); S” is a non-serial schedule Produces same result as serial schedule S Schedule S”Serializability: Serializability Assumption: Every serial schedule is correct Goal: Like to find non-serial schedules which are also correct A schedule S of n transactions is serializable if it is equivalent to some serial schedule of the same n transactions When are two schedules equivalent? Option 1: They lead to same result (result equivalent) Option 2: The order of any two conflicting operations is the same (conflict equivalent)Result Equivalent Schedules: Result Equivalent Schedules Two schedules are result equivalent if they produce the same final state of the database Problem: May produce same result by accident! S1 read_item(X); X:=X+10; write_item(X); S2 read_item(X); X:=X*1.1; write_item(X); Schedules S1 and S2 are result equivalent for X=100 but not in generalConflict Equivalent Schedules: Conflict Equivalent Schedules Two schedules are conflict equivalent, if the order of any two conflicting operations is the same in both schedulesConflict Equivalence: Conflict Equivalence T1 read_item(A); write_item(A); read_item(B); write_item(B); T2 read_item(A): write_item(A); read_item(B); write_item(B); order matters order doesn’t matter order matters Serial Schedule S1 order doesn’t matterConflict Equivalence: Conflict Equivalence T1 read_item(A); read_item(B); write_item(A); write_item(B); T2 read_item(A): write_item(A); read_item(B); write_item(B); S1 and S1’ are conflict equivalent (S1’ produces the same result as S1) Schedule S1’ same order as in S1 same order as in S1Conflict Equivalence: Conflict Equivalence T1 read_item(A); write_item(A); read_item(B); write_item(B); T2 read_item(A): write_item(A); read_item(B); write_item(B); Schedule S1’’ is not conflict equivalent to S1 (produces a different result than S1) Schedule S1’’ different order as in S1 different order as in S1Conflict Serializable: Conflict Serializable Schedule S is conflict serializable if it is conflict equivalent to some serial schedule S’ Can reorder the non-conflicting operations to improve efficiency Non-conflicting operations: Reads and writes from same transaction Reads from different transactions Reads and writes from different transactions on different data items Conflicting operations: Reads and writes from different transactions on same data itemExample: Example T1 read_item(X); X:=X-N; write_item(X); read_item(Y); Y:=Y+N; write_item(Y); T2 read_item(X); X:=X+M; write_item(X); T1 read_item(X); X:=X-N; write_item(X); read_item(Y); Y:=Y+N; write_item(Y); T2 read_item(X); X:=X+M; write_item(X); Schedule A Schedule B B is conflict equivalent to A B is serializableTest for Serializability: Test for Serializability Construct a directed graph, precedence graph, G = (V, E) V: set of all transactions participating in schedule E: set of edges Ti Tj for which one of the following holds: Ti executes a write_item(X) before Tj executes read_item(X) Ti executes a read_item(X) before Tj executes write_item(X) Ti executes a write_item(X) before Tj executes write_item(X) An edge Ti Tj means that in any serial schedule equivalent to S, Ti must come before Tj If G has a cycle, than S is not conflict serializable If not, use topological sort to obtain serialiazable schedule (linear order consistent with precedence order of graph)Sample Schedule S: Sample Schedule S T1 read_item(X); write_item(X); read_item(Y); write_item(Y); T2 read_item(Z); read_item(Y); write_item(Y); read_item(X); write_item(X); T3 read_item(Y); read_item(Z); write_item(Y); write_item(Z);Precedence Graph for S: Precedence Graph for S T1 T2 T3 X,Y Y Y,Z Equivalent Serial Schedule: T3 T1 T2 (precedence order) no cycles S is serializable Characterizing Schedules based on Serializability : Characterizing Schedules based on Serializability Being serializable is not the same as being serial Being serializable implies that the schedule is a correct schedule. It will leave the database in a consistent state. The interleaving is appropriate and will result in a state as if the transactions were serially executed, yet will achieve efficiency due to concurrent execution. Characterizing Schedules based on Serializability : Characterizing Schedules based on Serializability Serializability is hard to check. Interleaving of operations occurs in an operating system through some scheduler Difficult to determine beforehand how the operations in a schedule will be interleaved.Characterizing Schedules based on Serializability : Characterizing Schedules based on Serializability Practical approach: Come up with methods (protocols) to ensure serializability. It’s not possible to determine when a schedule begins and when it ends. Hence, we reduce the problem of checking the whole schedule to checking only a committed project of the schedule (i.e. operations from only the committed transactions.) Current approach used in most DBMSs: Use of locks with two phase lockingCharacterizing Schedules based on Serializability : Characterizing Schedules based on Serializability View equivalence: A less restrictive definition of equivalence of schedules View serializability: definition of serializability based on view equivalence. A schedule is view serializable if it is view equivalent to a serial schedule. Characterizing Schedules based on Serializability : Characterizing Schedules based on Serializability Two schedules are said to be view equivalent if the following three conditions hold: The same set of transactions participates in S and S’, and S and S’ include the same operations of those transactions. For any operation Ri(X) of Ti in S, if the value of X read by the operation has been written by an operation Wj(X) of Tj (or if it is the original value of X before the schedule started), the same condition must hold for the value of X read by operation Ri(X) of Ti in S’. If the operation Wk(Y) of Tk is the last operation to write item Y in S, then Wk(Y) of Tk must also be the last operation to write item Y in S’. Characterizing Schedules based on Serializability : Characterizing Schedules based on Serializability The premise behind view equivalence: As long as each read operation of a transaction reads the result of the same write operation in both schedules, the write operations of each transaction musr produce the same results. “The view”: the read operations are said to see the the same view in both schedules. Characterizing Schedules based on Serializability: Characterizing Schedules based on Serializability Relationship between view and conflict equivalence: The two are same under constrained write assumption which assumes that if T writes X, it is constrained by the value of X it read; i.e., new X = f(old X) Conflict serializability is stricter than view serializability. With unconstrained write (or blind write), a schedule that is view serializable is not necessarily conflict serialiable. Any conflict serializable schedule is also view serializable, but not vice versa. Characterizing Schedules based on Serializability: Characterizing Schedules based on Serializability Relationship between view and conflict equivalence (cont): Consider the following schedule of three transactions T1: r1(X), w1(X); T2: w2(X); and T3: w3(X): Schedule Sa: r1(X); w2(X); w1(X); w3(X); c1; c2; c3; In Sa, the operations w2(X) and w3(X) are blind writes, since T1 and T3 do not read the value of X. Sa is view serializable, since it is view equivalent to the serial schedule T1, T2, T3. However, Sa is not conflict serializable, since it is not conflict equivalent to any serial schedule.Concurrency Control Techniques: Concurrency Control Techniques Lecture NotesOutline of Lecture: Outline of Lecture Concurrency Based on Locking Techniques Locks 2-Phase Locking Protocol Remember ACID Properties?: Remember ACID Properties? Atomicity enforced by transaction recovery subsystem Consistency enforced by application programmers and part of DBMS that enforces integrity constraints Isolation enforced by concurrency control system Durability enforced by transaction recovery subsystem Now, focus on concurrency control Ensure serializability of schedules through protocols that result in serializable schedulesQuick Recap: Quick Recap Difference between serial and serializable Real DBMS does not test for serializability Very inefficient since transactions are continuously arriving Would require a lot of undoing Solution: concurrency protocols If followed by every transaction, and if enforced by transaction processing system, guarantee serializability of schedules in which transaction appearsConcurrency Control Through Locks: Concurrency Control Through Locks Lock: variable associated with each data item Describes status of item wrt operations that can be performed on it Binary locks: Locked/unlocked Enforces mutual exclusion Multiple-mode locks: Read/write a.k.a. Shared/Exclusive Three operations read_lock(X) write_lock(X) unlock(X) Each data item can be in one of three lock statesImplementation: Implementation Maintain lock table Keep track of locked items and their locks <data item, LOCK, no_of_reads, locking_transaction> For read locks, keep track of the number of transactions that hold a read lock on an itemLocking Rules: Locking Rules T must issue read_lock(X) or write_lock(X) before any read_item(X) op is performed in T T must issue write_lock(X) before any write_item(X) op is performed in T T must issue unlock(X) after all read_item(X) and write_item(X) ops are completed in T T will not issue a read_lock(X) if it already holds a read lock or write lock on X (may be relaxed) T will not issue a write_lock(X) if it already holds a read lock or write lock on X (may be relaxed)Lock Conversions: Lock Conversions Sometimes beneficial to relax locking rules 4 and 5 Upgrade read lock on X to a write lock (by issuing a write_lock(X) ) Only possible if T is the only transaction holding a read lock on X Downgrade a write lock by issuing a read_lock(X) Must be noted in lock tableGranting of Locks: Granting of Locks Suppose T2 has read-lock on item X T1 is requesting write-lock on item X; needs to wait for T2 to release T3 requests read-lock on X; request is granted Assume shortly thereafter T2 relinquishes read-lock Continue scenario through a sequence of transactions all requesting read-lock on X T1 will never make progress T1 is said to be starvedGranting of Locks: Granting of Locks How do you avoid starvation in the presence of locks? Assume Ti requesting lock on Q Grant lock provided that No locking conflict with lock requested by Ti, OR No other transaction waiting for lock and made request before TiTwo Transactions: Two Transactions T1 read_lock(Y); read_item(Y); unlock(Y); write_lock(X); read_item(X); X:=X+Y; write_item(X); unlock(X); T2 read_lock(X); read_item(X); unlock(X); write_lock(Y); read_item(Y); Y:=X+Y; write_item(Y); unlock(Y); Let’s assume serial schedule S1: T1;T2 Initial values: X=20, Y=30 Result: X=50, Y=80 Locks Alone Don’t Do the Trick!: Locks Alone Don’t Do the Trick! T1 read_lock(Y); read_item(Y); unlock(Y); write_lock(X); read_item(X); X:=X+Y; write_item(X); unlock(X); T2 read_lock(X); read_item(X); unlock(X); write_lock(Y); read_item(Y); Y:=X+Y; write_item(Y); unlock(Y); Non-serializable! Result: X=50, Y=50 unlocked too early! Let’s run T1 and T2 in interleafed fashion Schedule STwo-Phase Locking (2PL): Two-Phase Locking (2PL) Def.: Transaction is said to follow the two-phase-locking protocol if all locking operations precede the first unlock operation Expanding (growing) = first phase Shrinking = second phase During the shrinking phase no new locks can be acquired! Downgrading ok Upgrading is not Example: Example T1’ read_lock(Y); read_item(Y); write_lock(X); unlock(Y); read_item(X); X:=X+Y; write_item(X); unlock(X); T2’ read_lock(X); read_item(X); write_lock(Y); unlock(X); read_item(Y); Y:=X+Y; write_item(Y); unlock(Y); Both T1’ and T2’ follow the 2PL protocol Any schedule including T1’ and T2’ is guaranteed to be serializable Limits the amount of concurrency Variations to the Basic Protocol: Variations to the Basic Protocol Previous technique knows as basic 2PL Conservative 2PL (static) 2PL: Lock all items needed BEFORE execution begins by predeclaring its read and write set If any of the items in read or write set is already locked (by other transactions), transaction waits (does not acquire any locks) Deadlock free but not very realisticVariations to the Basic Protocol: Variations to the Basic Protocol Strict 2PL: Transaction does not release its write locks until AFTER it aborts/commits Not deadlock free but guarantees recoverable schedules (strict schedule: transaction can neither read/write X until last transaction that wrote X has committed/aborted) Most popular variation of 2PLVariations to the Basic Protocol: Variations to the Basic Protocol Rigorous 2PL: No lock is released until after abort/commit Transaction is in its expanding phase until it endsConcluding Remarks: Concluding Remarks Concurrency control subsystem is responsible for inserting locks at right places into your transaction Strict 2PL is widely used Requires use of waiting queue All 2PL locking protocols guarantee serializability Does not permit all possible serial schedules Conservative and rigorous 2PL charge a high price for serializability However, deadlock-based algorithms may suffer from starvation and deadlock (see next lecture)Concurrency Control Techniques: Concurrency Control Techniques Lecture NotesOutline of Lecture: Outline of Lecture Two Problems with Locks Deadlock Starvation Concurrency Control Based on Timestamp Ordering What is Deadlock?: What is Deadlock? Occurs when each transaction Ti in a set of two or more is waiting on an item locked by some other transaction Tj in the set T1 read_lock(Y); read_item(Y); write_lock(X); T2 read_lock(X); read_item(X); write_lock(Y); T1 T2 SDeadlock Prevention: Deadlock Prevention Locking as deadlock prevention leads to very inefficient schedules (e.g., conservative 2PL) Better, use transaction timestamp TS(T) TS is unique identifier assigned to each transaction if T1 starts before T2, then TS(T1) < TS(T2) (older has smaller timestamp value) Wait-die and wound-wait schemesWait-Die Scheme: Wait-Die Scheme Assume Ti tries to lock X which is locked by Tj If TS(Ti) < TS(Tj) (Ti older than Tj), then Ti is allowed to wait Otherwise, Ti younger than Tj, abort Ti (Ti dies) and restart later with SAME timestamp Older transaction is allowed to wait on younger transaction Younger transaction requesting an item held by older transaction is aborted and restartedWound-Wait Scheme: Wound-Wait Scheme Assume Ti tries to lock X which is locked by Tj If TS(Ti) < TS(Tj) (Ti older than Tj), abort Tj (Ti wounds Tj) and restart later with SAME timestamp Otherwise, Ti younger than Tj, Ti is allowed to wait Younger transaction is allowed to wait on older transaction Older transaction requesting item held by younger transaction preempts younger one by aborting it Both schemes abort younger transaction that may be involved in deadlock Both deadlock free but may cause needless abortsMore Deadlock Prevention: More Deadlock Prevention Waiting schemes (require no timestamps) No waiting: if transaction cannot obtain lock, aborted immediately and restarted after time t needless restarts Cautious waiting: Suppose Ti tries to lock item X which is locked by Tj If Tj is not blocked, Ti is blocked and allowed to wait O.w. abort Ti Cautious waiting is deadlock-freeDeadlock Detection: Deadlock Detection DBMS checks if deadlock has occurred Works well if few short transactions with little interference O.w., use deadlock prevention Two approaches to deadlock detection: Wait-for graph If cycle, abort one of the transactions (victim selection) TimeoutsStarvation: Starvation Transaction cannot continue for indefinite amount of time while others proceed normally When? Unfair waiting scheme with priorities for certain transactions E.g., in deadlock detection, if we choose victim always based on cost factors, same transaction may always be picked as victim Include rollbacks in cost factor