UPDATE Statements and Indexes

May 29, 2009 · 2 comments

I ran into a database earlier this week with a lot of stored procedures that look like this:

CREATE PROCEDURE updateUser
  @id int, @FirstName varchar(50), @LastName varchar(50), @EmailAddress varchar(50)
AS
SET NOCOUNT ON
 
UPDATE Users
SET FirstName = @FirstName,
    LastName  = @LastName,
    EmailAddress = @EmailAddress
Where id = @id
GO

Every table had  similar CRUD procedures that had been generated. If any data in the Users table changed, say a user updated their email address, this procedure would handle the update. Is this a good idea?

Aaron Alton recently posted about UPDATE statements. In his post, he explains why it is a good idea to use the WHERE clause to filter out rows that don’t need to be updated. I want to expound on Aaron’s point, and say that you should also avoid needlessly updating columns that don’t need to be updated. The reason? Nonclustered Indexes.

Let’s take a look at the following table:

CREATE TABLE Users (
  id int,
  FirstName varchar(50),
  LastName varchar(50),
  EmailAddress varchar(50)
)
 
CREATE UNIQUE CLUSTERED INDEX cix_Users ON Users(id)
CREATE NONCLUSTERED INDEX ix_Users_FirstName ON Users(FirstName)
CREATE NONCLUSTERED INDEX ix_Users_LastName ON Users(LastName)
CREATE NONCLUSTERED INDEX ix_Users_EmailAddress ON Users(EmailAddress)
 
INSERT INTO Users
SELECT 1, 'John', 'Smith', 'jsmith@gmail.com'

If we were to call the updateUser procedure:

EXEC updateUser 1, 'John', 'Smith', 'jsmith@hotmail.com'

Even though the FirstName and LastName values aren’t changing, SQL Server will still update all of the columns resulting in the nonclustered indexes on the FirstName and LastName columns being locked and updated.

If we were to run the following UPDATE statement instead, the nonclustered indexes on FirstName and LastName would not need to be updated.

UPDATE Users
SET EmailAddress = 'jsmith@hotmail.com'
WHERE id = 1

So, how do we fix the problem with our update procedure listed above? If we know that updating an Email address is a common occurance, we might create a seperate procedure that only updates the EmailAddress column. Alternatively, we can use dynamic SQL to build the correct UPDATE statement for us.

Here is an example of how you could do this in a stored procedure:

CREATE PROCEDURE updateUser
  @id int, @FirstName varchar(50), @LastName varchar(50), @EmailAddress varchar(50)
AS
SET NOCOUNT ON
 
--Variables to hold the updated status
DECLARE @u_FirstName bit, @u_LastName bit, @u_EmailAddress bit
 
--Check to see which values were updated
SELECT
  @u_FirstName = CASE FirstName WHEN @FirstName THEN 0 ELSE 1 END
 ,@u_LastName = CASE LastName WHEN @LastName THEN 0 ELSE 1 END
 ,@u_EmailAddress = CASE EmailAddress WHEN @EmailAddress THEN 0 ELSE 1 END
FROM Users
WHERE @id = id
 
--If none of the values were updated return
IF (@u_FirstName = 0 AND @u_LastName = 0 AND @u_EmailAddress = 0) RETURN
 
DECLARE @sql nvarchar(4000)
 
SET @sql = '
DECLARE @first bit --for the first value
UPDATE Users SET
 @first = 1 '
 
IF @u_FirstName = 1 SET @sql += '
 ,FirstName = @FirstName '
 
IF @u_LastName = 1 SET @sql += '
 ,LastName = @LastName '
 
IF @u_EmailAddress = 1 SET @sql += '
 ,EmailAddress = @EmailAddress '
 
SET @sql += '
WHERE id = @id'
 
PRINT @sql
PRINT ''
EXEC sp_executesql @sql,
  N'@FirstName varchar(50), @LastName varchar(50), @EmailAddress varchar(50), @id int',
  @FirstName, @LastName, @EmailAddress, @id
 
GO

This trigger will show which columns have been updated:

CREATE TRIGGER t_Users
  ON  Users
  AFTER UPDATE
AS
       SET NOCOUNT ON
       IF UPDATE(FirstName) PRINT 'FirstName updated'
       IF UPDATE(LastName) PRINT 'LastName updated'
       IF UPDATE(EmailAddress) PRINT 'EmailAddress updated'
GO

To summarize, limiting the columns in the SET portion of the UPDATE statement will reduce locking, minimize index updates, and increase concurrency.

Let me know if I’ve left anything out.

{ 2 comments… read them below or add one }

Denis Langlais May 29, 2009 at 1:38 pm

Some people might say “dynamic sql” is worse than
multiple index update

Rob Boek May 29, 2009 at 2:59 pm

Dynamic SQL is only one option. As I said in the post, you can create multiple procedures to handle the most frequently updated value combinations.

Leave a Comment