Ian's been having some deadlock trouble with SQL Server at work. I tried, but failed, to explain that two server processes running the same stored procedure could deadlock.
The problem comes when you need to update some rows in a table, but only when certain other data in each row is set. You can often do this simply by using the WHERE clause in the UPDATE statement, but if you need to set different values depending on the current values, or you need to update multiple tables simultaneously, it becomes more complicated. So we use a SELECT to get the current values and an UPDATE to write the new values, if we choose to.
The first thing to do is to ensure that we only write back if the data hasn't been changed. In SQL, each statement is atomic - either all of its effects are applied, or none are. However, here we need two statements, so we wrap them up in a transaction:
BEGIN TRAN
SELECT
@value = Col1
FROM Tbl
WHERE
RowID = @rowID
UPDATE Tbl
SET Col1 = @newValue
WHERE RowID = @rowID
-- Note, should check @@ERROR and ROLLBACK TRAN
-- if the update failed
COMMIT TRAN
Looks fine, right? Not always. Now I need to explain how SQL Server locks work.
Like all concurrent systems, SQL Server typically has more clients than available resources. It has to give an illusion of concurrent operations. The really, really hard way to allow transactions to operate simultaneously is to allocate new resources for every possibly-contending operation and reconcile them at the end of any atomic operation. The easy way is to lock the object to prevent contending operations, then release the lock at the end of the atomic operations. Locking reduces concurrency, but encourages correctness in a simple fashion.
SQL Server uses locking for concurrency. This is fine so long as locks aren't held for a long period of time. Reading a row takes a shared lock until the end of the atomic operation; updating a row takes an exclusive lock. If a shared lock is held, any other process can take a shared lock; a process wanting an exclusive lock must wait. If an exclusive lock is held, all other processes wanting a lock must wait.
With our query above, the SELECT takes a shared lock and holds it; the UPDATE escalates the shared lock to an exclusive lock.
Now, what happens if we run this query on another connection? Let's say we have queries Q1 and Q2, and to simplify things, let's assume that the server has a single processor. If the scheduler decides to run Q1, and is then interrupted to execute Q2, the following could happen: the SELECT from Q1 runs and takes a shared lock. Then, the SELECT from Q2 runs and takes another shared lock. Now Q2 is interrupted and the scheduler runs Q1 again, which tries to take an exclusive lock, which is blocked by Q2's shared lock. Q1 blocks so the scheduler runs Q2, which tries to take an exclusive lock to do an UPDATE, but is blocked by Q1's shared lock. Result: deadlock - neither Q1 nor Q2 can progress because they're both waiting for the other to finish.
You could give SQL Server a lock hint to take an exclusive lock instead of a shared lock when executing the SELECT, by specifying (XLOCK) after the table name. This stops the deadlock, because both will now try to acquire the exclusive lock, which means one must wait for the other. This has the nasty side-effect of preventing anyone else who just wanted to read the data from reading until we decide to update.
For this reason, SQL Server has another lock type: an update lock. The rules for this lock are simple. If no lock is held, or only shared locks are held, the update lock can be taken. Only one process can have an update lock, but other processes can take shared locks while an update lock is held. If the process holding the update lock wants to write, it is upgraded to an exclusive lock. So if we add the update lock hint (UPDLOCK) to our SELECT, Q1 and Q2 will now perform atomically, one after another, without deadlocking, while other processes can read the selected rows (at least, until we UPDATE).
BEGIN TRAN
SELECT
@value = Col1
FROM Tbl (UPDLOCK)
WHERE
RowID = @rowID
UPDATE Tbl
SET Col1 = @newValue
WHERE RowID = @rowID
-- Note, should check @@ERROR and ROLLBACK TRAN
-- if the update failed
COMMIT TRAN