Transaction1

Uploaded from authorPOINTLite
Views:
 
Category: Education
     
 

Presentation Description

No description available.

Comments

By: artiattri (19 month(s) ago)

how to download transaction processing lecturer ppt.....plz allow me to download this

By: dinahafiz (26 month(s) ago)

hi i need it

By: mukundugupta (26 month(s) ago)

hi, i need this presentation.

By: haidy1122006 (37 month(s) ago)

i wnat transaction1 presentation plz

By: dpak_r (43 month(s) ago)

how to download transaction processing lecturer notes

See all

Presentation Transcript

Transaction Processing Concepts: 

Transaction Processing Concepts Chapter 17 & 18 from Elmasri and Navathe’s book Lecture Notes

Outline of Lecture: 

Outline of Lecture Introduction to Transaction Processing Problems and Solution Concepts Properties

Why Do We Need Transactions?: 

Why Do We Need Transactions? It’s all about fast query response time and correctness DBMS is a multi-user systems Many different requests Some against same data items Figure out how to interleave requests to shorten response time while guaranteeing correct result How does DBMS know which actions belong together? Solution: Group database operations that must be performed together into transactions Either execute all operations or none

Concurrent Transactions: 

Concurrent Transactions interleaved processing parallel processing t1 t2 t1 t2 time CPU1 CPU2 A B A B CPU1

Terminology: 

Terminology A transaction T is a logical unit of database processing that includes one or more database access operations Embedded within application program Specified interactively (e.g., via SQL) Transaction boundaries: Begin/end transaction Read vs. write transaction

Sample Transaction (informal): 

Sample Transaction (informal) Example: Move $40 from checking to savings account To user, appears as one activity To database: Read balance of checking account: read( X) Read balance of savings account: read (Y) Subtract $40 from X Add $40 to Y Write new value of X back to disk Write new value of Y back to disk

Another Sample Transaction: 

Another Sample Transaction Reserving a seat for a flight If concurrent access to data in DBMS, two users may try to book the same seat simultaneously Agent 1 finds seat 35G empty Agent 2 finds seat 35G empty time Agent 1 sets seat 35G occupied Agent 2 sets seat 35G occupied

Terminology: 

Terminology Basic access operations: read_item(X) write_item(X) How are read_item or write_item implemented? Read-set of T: all items that transaction reads Write-set of T: all items that transaction writes

Sample Transaction (Formal): 

Sample Transaction (Formal) T1 read_item(X); read_item(Y); X:=X-40; Y:=Y+40; write _item(X); write_item(Y); t0 tk

What Can Go Wrong?: 

What Can Go Wrong? Several problems can occur when concurrent transactions execute without control Illustrate several scenarios on the next slides

Lost Update Problem : 

Lost Update Problem T1 read_item(X); X:=X-N; write_item(X); read_item(Y); Y:=Y+N; write_item(Y); T2 read_item(X); X:=X+M; write_item(X); time

Temporary Update (Dirty Read) : 

Temporary Update (Dirty Read) T1 read_item(X); X:=X-N; write_item(X); read_item(Y); T1 fails and aborts T2 read_item(X); X:=X+M; write_item(X); time

Incorrect Summary Problem : 

Incorrect Summary Problem T1 read_item(X); X:=X-N; write_item(X); read_item(Y); Y=Y+N Write_item(Y) T2 sum:=0; read_item(A); sum:=sum+A; read_item(X); sum:=sum+X; read_item(Y); sum:=sum+Y time

What Can Go Wrong?: 

What Can Go Wrong? System may crash before data is written back to disk Problem of atomicity Some other transaction is modifying shared data while our transaction is ongoing (or vice versa) Problem of serialization and isolation System may not be able to obtain one or more of the data items System may not be able to write one or more of the data items Also problems of atomicity DBMS has a Concurrency Control subsytem to assure database remains in consistent state despite concurrent execution of transactions

Other Problems: 

Other Problems System failures may occur Types of failures: System crash Transaction or system error Local errors Concurrency control enforcement Disk failure Physical failures DBMS has a Recovery Subsystem to protect database against system failures

Transaction States: 

Transaction States BEGIN_TRANSACTION: marks start of transaction READ or WRITE: two possible operations on the data END_TRANSACTION: marks the end of the read or write operations; start checking whether everything went according to plan COMIT_TRANSACTION: signals successful end of transaction; changes can be “committed” to DB ROLLBACK (or ABORT): signals unsuccessful end of transaction, changes applied to DB must be undone

State Diagram: 

State Diagram Active Partially Committed Committed Terminated Failed BEGIN TRANSACTION READ, WRITE END TRANSACTION COMMIT ABORT ABORT

System Log: 

System Log Remember, DBMS must assure that we don’t loose information due to system crashes i.e., how do we recover from failure? Keep system log Kept on disk, backed up periodically Record every action [start_transaction, T] [write_item, T, X, old, new] [read_item, T, X] [commit, T] [abort, T]

How is the Log File Used?: 

How is the Log File Used? All permanent changes to data are recorded Possible to undo changes to data After crash, search log backwards until find last commit point Know that beyond this point, effects of transaction are permanently recorded Need to either redo or undo everything that happened since last commit point Undo: When transaction only partially completed (before crash) Redo: Transaction completed but we are unsure whether data was written to disk

ACID Properties of Transactions: 

ACID Properties of Transactions Atomicity: Transaction is either performed in its entirety or not performed at all Task of the recovery subsystem to enforce atomicity Consistency preservation: Transaction must take the database from one consistent state to another Users/DBMS: enforce implicit and explicit constraints Isolation: Transaction should appear as though it is being executed in isolation from other transaction Enforced by concurrency control subsystem Durability: Changes applied to the database by a committed transaction must persist Enforced by recovery subsystem

Transaction Schedules: 

Transaction Schedules Lecture Notes

Outline of Lecture: 

Outline of Lecture Transaction Schedules Recoverable Schedules Serializability Definition Test

Transaction Schedule: 

Transaction Schedule Recall: Multiple transactions can be executed concurrently by interleaving their operations Ordering of execution of operations from various transactions T1, T2, … , Tn is called a schedule S Constraint: For each transaction Ti, the order in which operations occur in S must the same as in Ti Only interested in read (r), write (w), commit (c), abort (a)

Sample Schedule: 

Sample Schedule T1: r(X); w(X); r(Y); w(Y); c T2: r(X); w(X); c Sample schedule: S: r1(X); r2(X); w1(X); r1(Y); w2(X); w1(Y); c1; c2

Conflicts: 

Conflicts Two operations conflict if they satisfy ALL three conditions: they belong to different transactions AND they access the same item AND at least one is a write_item()operation Ex.: S: r1(X); r2(X); w1(X); r1(Y); w2(X); w1(Y); conflicts

Why Do We Interleave Transactions?: 

Why Do We Interleave Transactions? T1 read_item(X); X:=X-N; write_item(X); read_item(Y); Y:=Y+N; write_item(Y); T2 read_item(X): X:=X+M; write_item(X); Could be a long wait S is a serial schedule – no interleaving! Schedule S

Serial Schedule: 

Serial Schedule If we consider transactions to be independent, serial schedule is correct Based on C property in ACID above assumption is valid Furthermore, it does not matter which transaction is executed first, as long as every transaction is executed in its entirety, from beginning to end Assume X=90, Y=90, N=3, M=2, then result of schedule S is X=89 and Y= 93 Same result if we start with T2

Better?: 

Better? T1 read_item(X); X:=X-N; write_item(X); read_item(Y); Y:=Y+N; write_item(Y); T2 read_item(X): X:=X+M; write_item(X); S’ is a non-serial schedule T2 will be done faster but is the result correct? Schedule S’

Concurrent Executions: 

Concurrent Executions Serial execution is by far simplest method to execute transactions No extra work ensuring consistency Inefficient! Reasons for concurrency: Increased throughput Reduces average response time Need concept of correct concurrent execution Using same X, Y, N, M values as before, result of S is X=92 and Y=93 (not correct)

Better?: 

Better? T1 read_item(X); X:=X-N; write_item(X); read_item(Y); Y:=Y+N; write_item(Y); T2 read_item(X): X:=X+M; write_item(X); S” is a non-serial schedule Produces same result as serial schedule S Schedule S”

Serializability: 

Serializability Assumption: Every serial schedule is correct Goal: Like to find non-serial schedules which are also correct A schedule S of n transactions is serializable if it is equivalent to some serial schedule of the same n transactions When are two schedules equivalent? Option 1: They lead to same result (result equivalent) Option 2: The order of any two conflicting operations is the same (conflict equivalent)

Result Equivalent Schedules: 

Result Equivalent Schedules Two schedules are result equivalent if they produce the same final state of the database Problem: May produce same result by accident! S1 read_item(X); X:=X+10; write_item(X); S2 read_item(X); X:=X*1.1; write_item(X); Schedules S1 and S2 are result equivalent for X=100 but not in general

Conflict Equivalent Schedules: 

Conflict Equivalent Schedules Two schedules are conflict equivalent, if the order of any two conflicting operations is the same in both schedules

Conflict Equivalence: 

Conflict Equivalence T1 read_item(A); write_item(A); read_item(B); write_item(B); T2 read_item(A): write_item(A); read_item(B); write_item(B); order matters order doesn’t matter order matters Serial Schedule S1 order doesn’t matter

Conflict Equivalence: 

Conflict Equivalence T1 read_item(A); read_item(B); write_item(A); write_item(B); T2 read_item(A): write_item(A); read_item(B); write_item(B); S1 and S1’ are conflict equivalent (S1’ produces the same result as S1) Schedule S1’ same order as in S1 same order as in S1

Conflict Equivalence: 

Conflict Equivalence T1 read_item(A); write_item(A); read_item(B); write_item(B); T2 read_item(A): write_item(A); read_item(B); write_item(B); Schedule S1’’ is not conflict equivalent to S1 (produces a different result than S1) Schedule S1’’ different order as in S1 different order as in S1

Conflict Serializable: 

Conflict Serializable Schedule S is conflict serializable if it is conflict equivalent to some serial schedule S’ Can reorder the non-conflicting operations to improve efficiency Non-conflicting operations: Reads and writes from same transaction Reads from different transactions Reads and writes from different transactions on different data items Conflicting operations: Reads and writes from different transactions on same data item

Example: 

Example T1 read_item(X); X:=X-N; write_item(X); read_item(Y); Y:=Y+N; write_item(Y); T2 read_item(X); X:=X+M; write_item(X); T1 read_item(X); X:=X-N; write_item(X); read_item(Y); Y:=Y+N; write_item(Y); T2 read_item(X); X:=X+M; write_item(X); Schedule A Schedule B B is conflict equivalent to A  B is serializable

Test for Serializability: 

Test for Serializability Construct a directed graph, precedence graph, G = (V, E) V: set of all transactions participating in schedule E: set of edges Ti  Tj for which one of the following holds: Ti executes a write_item(X) before Tj executes read_item(X) Ti executes a read_item(X) before Tj executes write_item(X) Ti executes a write_item(X) before Tj executes write_item(X) An edge Ti  Tj means that in any serial schedule equivalent to S, Ti must come before Tj If G has a cycle, than S is not conflict serializable If not, use topological sort to obtain serialiazable schedule (linear order consistent with precedence order of graph)

Sample Schedule S: 

Sample Schedule S T1 read_item(X); write_item(X); read_item(Y); write_item(Y); T2 read_item(Z); read_item(Y); write_item(Y); read_item(X); write_item(X); T3 read_item(Y); read_item(Z); write_item(Y); write_item(Z);

Precedence Graph for S: 

Precedence Graph for S T1 T2 T3 X,Y Y Y,Z Equivalent Serial Schedule: T3  T1  T2 (precedence order) no cycles  S is serializable

Characterizing Schedules based on Serializability : 

Characterizing Schedules based on Serializability Being serializable is not the same as being serial   Being serializable implies that the schedule is a correct schedule. It will leave the database in a consistent state. The interleaving is appropriate and will result in a state as if the transactions were serially executed, yet will achieve efficiency due to concurrent execution.

Characterizing Schedules based on Serializability : 

Characterizing Schedules based on Serializability Serializability is hard to check. Interleaving of operations occurs in an operating system through some scheduler Difficult to determine beforehand how the operations in a schedule will be interleaved.

Characterizing Schedules based on Serializability : 

Characterizing Schedules based on Serializability Practical approach: Come up with methods (protocols) to ensure serializability. It’s not possible to determine when a schedule begins and when it ends. Hence, we reduce the problem of checking the whole schedule to checking only a committed project of the schedule (i.e. operations from only the committed transactions.) Current approach used in most DBMSs: Use of locks with two phase locking

Characterizing Schedules based on Serializability : 

Characterizing Schedules based on Serializability View equivalence: A less restrictive definition of equivalence of schedules View serializability: definition of serializability based on view equivalence. A schedule is view serializable if it is view equivalent to a serial schedule.

Characterizing Schedules based on Serializability : 

Characterizing Schedules based on Serializability Two schedules are said to be view equivalent if the following three conditions hold: The same set of transactions participates in S and S’, and S and S’ include the same operations of those transactions. For any operation Ri(X) of Ti in S, if the value of X read by the operation has been written by an operation Wj(X) of Tj (or if it is the original value of X before the schedule started), the same condition must hold for the value of X read by operation Ri(X) of Ti in S’. If the operation Wk(Y) of Tk is the last operation to write item Y in S, then Wk(Y) of Tk must also be the last operation to write item Y in S’.

Characterizing Schedules based on Serializability : 

Characterizing Schedules based on Serializability The premise behind view equivalence: As long as each read operation of a transaction reads the result of the same write operation in both schedules, the write operations of each transaction musr produce the same results. “The view”: the read operations are said to see the the same view in both schedules.

Characterizing Schedules based on Serializability: 

Characterizing Schedules based on Serializability Relationship between view and conflict equivalence: The two are same under constrained write assumption which assumes that if T writes X, it is constrained by the value of X it read; i.e., new X = f(old X) Conflict serializability is stricter than view serializability. With unconstrained write (or blind write), a schedule that is view serializable is not necessarily conflict serialiable. Any conflict serializable schedule is also view serializable, but not vice versa.

Characterizing Schedules based on Serializability: 

Characterizing Schedules based on Serializability Relationship between view and conflict equivalence (cont): Consider the following schedule of three transactions T1: r1(X), w1(X); T2: w2(X); and T3: w3(X): Schedule Sa: r1(X); w2(X); w1(X); w3(X); c1; c2; c3; In Sa, the operations w2(X) and w3(X) are blind writes, since T1 and T3 do not read the value of X. Sa is view serializable, since it is view equivalent to the serial schedule T1, T2, T3. However, Sa is not conflict serializable, since it is not conflict equivalent to any serial schedule.

Concurrency Control Techniques: 

Concurrency Control Techniques Lecture Notes

Outline of Lecture: 

Outline of Lecture Concurrency Based on Locking Techniques Locks 2-Phase Locking Protocol

Remember ACID Properties?: 

Remember ACID Properties? Atomicity  enforced by transaction recovery subsystem Consistency  enforced by application programmers and part of DBMS that enforces integrity constraints Isolation  enforced by concurrency control system Durability  enforced by transaction recovery subsystem Now, focus on concurrency control Ensure serializability of schedules through protocols that result in serializable schedules

Quick Recap: 

Quick Recap Difference between serial and serializable Real DBMS does not test for serializability Very inefficient since transactions are continuously arriving Would require a lot of undoing Solution: concurrency protocols If followed by every transaction, and if enforced by transaction processing system, guarantee serializability of schedules in which transaction appears

Concurrency Control Through Locks: 

Concurrency Control Through Locks Lock: variable associated with each data item Describes status of item wrt operations that can be performed on it Binary locks: Locked/unlocked Enforces mutual exclusion Multiple-mode locks: Read/write a.k.a. Shared/Exclusive Three operations read_lock(X) write_lock(X) unlock(X) Each data item can be in one of three lock states

Implementation: 

Implementation Maintain lock table Keep track of locked items and their locks <data item, LOCK, no_of_reads, locking_transaction> For read locks, keep track of the number of transactions that hold a read lock on an item

Locking Rules: 

Locking Rules T must issue read_lock(X) or write_lock(X) before any read_item(X) op is performed in T T must issue write_lock(X) before any write_item(X) op is performed in T T must issue unlock(X) after all read_item(X) and write_item(X) ops are completed in T T will not issue a read_lock(X) if it already holds a read lock or write lock on X (may be relaxed) T will not issue a write_lock(X) if it already holds a read lock or write lock on X (may be relaxed)

Lock Conversions: 

Lock Conversions Sometimes beneficial to relax locking rules 4 and 5 Upgrade read lock on X to a write lock (by issuing a write_lock(X) ) Only possible if T is the only transaction holding a read lock on X Downgrade a write lock by issuing a read_lock(X) Must be noted in lock table

Granting of Locks: 

Granting of Locks Suppose T2 has read-lock on item X T1 is requesting write-lock on item X; needs to wait for T2 to release T3 requests read-lock on X; request is granted Assume shortly thereafter T2 relinquishes read-lock Continue scenario through a sequence of transactions all requesting read-lock on X T1 will never make progress T1 is said to be starved

Granting of Locks: 

Granting of Locks How do you avoid starvation in the presence of locks? Assume Ti requesting lock on Q Grant lock provided that No locking conflict with lock requested by Ti, OR No other transaction waiting for lock and made request before Ti

Two Transactions: 

Two Transactions T1 read_lock(Y); read_item(Y); unlock(Y); write_lock(X); read_item(X); X:=X+Y; write_item(X); unlock(X); T2 read_lock(X); read_item(X); unlock(X); write_lock(Y); read_item(Y); Y:=X+Y; write_item(Y); unlock(Y); Let’s assume serial schedule S1: T1;T2 Initial values: X=20, Y=30  Result: X=50, Y=80

Locks Alone Don’t Do the Trick!: 

Locks Alone Don’t Do the Trick! T1 read_lock(Y); read_item(Y); unlock(Y); write_lock(X); read_item(X); X:=X+Y; write_item(X); unlock(X); T2 read_lock(X); read_item(X); unlock(X); write_lock(Y); read_item(Y); Y:=X+Y; write_item(Y); unlock(Y); Non-serializable! Result: X=50, Y=50 unlocked too early! Let’s run T1 and T2 in interleafed fashion Schedule S

Two-Phase Locking (2PL): 

Two-Phase Locking (2PL) Def.: Transaction is said to follow the two-phase-locking protocol if all locking operations precede the first unlock operation Expanding (growing) = first phase Shrinking = second phase During the shrinking phase no new locks can be acquired! Downgrading ok Upgrading is not

Example: 

Example T1’ read_lock(Y); read_item(Y); write_lock(X); unlock(Y); read_item(X); X:=X+Y; write_item(X); unlock(X); T2’ read_lock(X); read_item(X); write_lock(Y); unlock(X); read_item(Y); Y:=X+Y; write_item(Y); unlock(Y); Both T1’ and T2’ follow the 2PL protocol Any schedule including T1’ and T2’ is guaranteed to be serializable Limits the amount of concurrency

Variations to the Basic Protocol: 

Variations to the Basic Protocol Previous technique knows as basic 2PL Conservative 2PL (static) 2PL: Lock all items needed BEFORE execution begins by predeclaring its read and write set If any of the items in read or write set is already locked (by other transactions), transaction waits (does not acquire any locks) Deadlock free but not very realistic

Variations to the Basic Protocol: 

Variations to the Basic Protocol Strict 2PL: Transaction does not release its write locks until AFTER it aborts/commits Not deadlock free but guarantees recoverable schedules (strict schedule: transaction can neither read/write X until last transaction that wrote X has committed/aborted) Most popular variation of 2PL

Variations to the Basic Protocol: 

Variations to the Basic Protocol Rigorous 2PL: No lock is released until after abort/commit Transaction is in its expanding phase until it ends

Concluding Remarks: 

Concluding Remarks Concurrency control subsystem is responsible for inserting locks at right places into your transaction Strict 2PL is widely used Requires use of waiting queue All 2PL locking protocols guarantee serializability Does not permit all possible serial schedules Conservative and rigorous 2PL charge a high price for serializability However, deadlock-based algorithms may suffer from starvation and deadlock (see next lecture)

Concurrency Control Techniques: 

Concurrency Control Techniques Lecture Notes

Outline of Lecture: 

Outline of Lecture Two Problems with Locks Deadlock Starvation Concurrency Control Based on Timestamp Ordering

What is Deadlock?: 

What is Deadlock? Occurs when each transaction Ti in a set of two or more is waiting on an item locked by some other transaction Tj in the set T1 read_lock(Y); read_item(Y); write_lock(X); T2 read_lock(X); read_item(X); write_lock(Y); T1 T2 S

Deadlock Prevention: 

Deadlock Prevention Locking as deadlock prevention leads to very inefficient schedules (e.g., conservative 2PL) Better, use transaction timestamp TS(T) TS is unique identifier assigned to each transaction if T1 starts before T2, then TS(T1) < TS(T2) (older has smaller timestamp value) Wait-die and wound-wait schemes

Wait-Die Scheme: 

Wait-Die Scheme Assume Ti tries to lock X which is locked by Tj If TS(Ti) < TS(Tj) (Ti older than Tj), then Ti is allowed to wait Otherwise, Ti younger than Tj, abort Ti (Ti dies) and restart later with SAME timestamp Older transaction is allowed to wait on younger transaction Younger transaction requesting an item held by older transaction is aborted and restarted

Wound-Wait Scheme: 

Wound-Wait Scheme Assume Ti tries to lock X which is locked by Tj If TS(Ti) < TS(Tj) (Ti older than Tj), abort Tj (Ti wounds Tj) and restart later with SAME timestamp Otherwise, Ti younger than Tj, Ti is allowed to wait Younger transaction is allowed to wait on older transaction Older transaction requesting item held by younger transaction preempts younger one by aborting it Both schemes abort younger transaction that may be involved in deadlock Both deadlock free but may cause needless aborts

More Deadlock Prevention: 

More Deadlock Prevention Waiting schemes (require no timestamps) No waiting: if transaction cannot obtain lock, aborted immediately and restarted after time t  needless restarts Cautious waiting: Suppose Ti tries to lock item X which is locked by Tj If Tj is not blocked, Ti is blocked and allowed to wait O.w. abort Ti Cautious waiting is deadlock-free

Deadlock Detection: 

Deadlock Detection DBMS checks if deadlock has occurred Works well if few short transactions with little interference O.w., use deadlock prevention Two approaches to deadlock detection: Wait-for graph If cycle, abort one of the transactions (victim selection) Timeouts

Starvation: 

Starvation Transaction cannot continue for indefinite amount of time while others proceed normally When? Unfair waiting scheme with priorities for certain transactions E.g., in deadlock detection, if we choose victim always based on cost factors, same transaction may always be picked as victim Include rollbacks in cost factor