Wednesday 3 March 2004

The SELECT/UPDATE problem, or, why UPDLOCK?

Ian's been having some deadlock trouble with SQL Server at work. I tried, but failed, to explain that two server processes running the same stored procedure could deadlock.

The problem comes when you need to update some rows in a table, but only when certain other data in each row is set. You can often do this simply by using the WHERE clause in the UPDATE statement, but if you need to set different values depending on the current values, or you need to update multiple tables simultaneously, it becomes more complicated. So we use a SELECT to get the current values and an UPDATE to write the new values, if we choose to.

The first thing to do is to ensure that we only write back if the data hasn't been changed. In SQL, each statement is atomic - either all of its effects are applied, or none are. However, here we need two statements, so we wrap them up in a transaction:

BEGIN TRAN
SELECT
   @value = Col1
FROM Tbl
WHERE
   RowID = @rowID

UPDATE Tbl
SET Col1 = @newValue
WHERE RowID = @rowID

-- Note, should check @@ERROR and ROLLBACK TRAN
-- if the update failed
COMMIT TRAN

Looks fine, right? Not always. Now I need to explain how SQL Server locks work.

Like all concurrent systems, SQL Server typically has more clients than available resources. It has to give an illusion of concurrent operations. The really, really hard way to allow transactions to operate simultaneously is to allocate new resources for every possibly-contending operation and reconcile them at the end of any atomic operation. The easy way is to lock the object to prevent contending operations, then release the lock at the end of the atomic operations. Locking reduces concurrency, but encourages correctness in a simple fashion.

SQL Server uses locking for concurrency. This is fine so long as locks aren't held for a long period of time. Reading a row takes a shared lock until the end of the atomic operation; updating a row takes an exclusive lock. If a shared lock is held, any other process can take a shared lock; a process wanting an exclusive lock must wait. If an exclusive lock is held, all other processes wanting a lock must wait.

With our query above, the SELECT takes a shared lock and holds it; the UPDATE escalates the shared lock to an exclusive lock.

Now, what happens if we run this query on another connection? Let's say we have queries Q1 and Q2, and to simplify things, let's assume that the server has a single processor. If the scheduler decides to run Q1, and is then interrupted to execute Q2, the following could happen: the SELECT from Q1 runs and takes a shared lock. Then, the SELECT from Q2 runs and takes another shared lock. Now Q2 is interrupted and the scheduler runs Q1 again, which tries to take an exclusive lock, which is blocked by Q2's shared lock. Q1 blocks so the scheduler runs Q2, which tries to take an exclusive lock to do an UPDATE, but is blocked by Q1's shared lock. Result: deadlock - neither Q1 nor Q2 can progress because they're both waiting for the other to finish.

You could give SQL Server a lock hint to take an exclusive lock instead of a shared lock when executing the SELECT, by specifying (XLOCK) after the table name. This stops the deadlock, because both will now try to acquire the exclusive lock, which means one must wait for the other. This has the nasty side-effect of preventing anyone else who just wanted to read the data from reading until we decide to update.

For this reason, SQL Server has another lock type: an update lock. The rules for this lock are simple. If no lock is held, or only shared locks are held, the update lock can be taken. Only one process can have an update lock, but other processes can take shared locks while an update lock is held. If the process holding the update lock wants to write, it is upgraded to an exclusive lock. So if we add the update lock hint (UPDLOCK) to our SELECT, Q1 and Q2 will now perform atomically, one after another, without deadlocking, while other processes can read the selected rows (at least, until we UPDATE).

BEGIN TRAN
SELECT
   @value = Col1
FROM Tbl (UPDLOCK)
WHERE
   RowID = @rowID

UPDATE Tbl
SET Col1 = @newValue
WHERE RowID = @rowID

-- Note, should check @@ERROR and ROLLBACK TRAN
-- if the update failed
COMMIT TRAN

6 comments:

Anonymous said...

A clear explanation of the problem and a solution. Tnx!

Anonymous said...

Excelent, very useful, thank you. I used this approach to generate a custom unique id, using XLOCK.
Thanks.

Anonymous said...

The article states: "With our query above, the SELECT takes a shared lock and holds it; the UPDATE escalates the shared lock to an exclusive lock."

I believe this would only be true if the transaction isolation level was set to REPEATABLE READ or SERIALIZABLE.

At the default isolation level of READ COMMITTED, the shared lock would be released after the SELECT.

Marek Grzenkowicz said...

I agree with EyeRag.

I just checked this - the transaction isolation level has to be at least REPEATABLE READ for this to happen.

ramin said...

Hi dear Mike
I have locking problem...
"I want a user can lock a record of a table and at the same time another user can update another record of that table"

my program is in c#2008 and SQL2000

the architecture of my program is like this:

BeginTransaction(IsolationLevel.ReadCommitted);

"SET LOCK_TIMEOUT -1";
"select id from dbo.BranchTestRamin with (UPDLOCK,readpast) where id = " + pn_id

"SET LOCK_TIMEOUT 50"
..
..
...
"update dbo.branchTestRamin set name='" + po_branc......
...
Transaction.Commit();

user A Lock Record (for example id=12) from station A and user B try to lock id=12 from he couldn't and everything ok but when user B lock id =14(it's example) when he done and wanted to update that record ....I take timeout expired error... you know it wait for the other lock....
what's wrong here????
would you help me please?
really appreciated

yours truly Ramin

Chris said...

I suspect this just saved me days of frustration - Awesome post, thanks!