I ran into a database earlier this week with a lot of stored procedures that look like this:
CREATE PROCEDURE updateUser @id INT, @FirstName VARCHAR(50), @LastName VARCHAR(50), @EmailAddress VARCHAR(50) AS SET NOCOUNT ON UPDATE Users SET FirstName = @FirstName, LastName = @LastName, EmailAddress = @EmailAddress WHERE id = @id GO
Every table had similar CRUD procedures that had been generated. If any data in the Users table changed, say a user updated their email address, this procedure would handle the update. Is this a good idea?
Aaron Alton recently posted about UPDATE statements. In his post, he explains why it is a good idea to use the WHERE clause to filter out rows that don’t need to be updated. I want to expound on Aaron’s point, and say that you should also avoid needlessly updating columns that don’t need to be updated. The reason? Nonclustered Indexes.
Let’s take a look at the following table:
CREATE TABLE Users ( id INT, FirstName VARCHAR(50), LastName VARCHAR(50), EmailAddress VARCHAR(50) ) CREATE UNIQUE CLUSTERED INDEX cix_Users ON Users(id) CREATE NONCLUSTERED INDEX ix_Users_FirstName ON Users(FirstName) CREATE NONCLUSTERED INDEX ix_Users_LastName ON Users(LastName) CREATE NONCLUSTERED INDEX ix_Users_EmailAddress ON Users(EmailAddress) INSERT INTO Users SELECT 1, 'John', 'Smith', 'email@example.com'
If we were to call the updateUser procedure:
EXEC updateUser 1, 'John', 'Smith', 'firstname.lastname@example.org'
Even though the FirstName and LastName values aren’t changing, SQL Server will still update all of the columns resulting in the nonclustered indexes on the FirstName and LastName columns being locked and updated.
If we were to run the following UPDATE statement instead, the nonclustered indexes on FirstName and LastName would not need to be updated.
UPDATE Users SET EmailAddress = 'email@example.com' WHERE id = 1
So, how do we fix the problem with our update procedure listed above? If we know that updating an Email address is a common occurance, we might create a seperate procedure that only updates the EmailAddress column. Alternatively, we can use dynamic SQL to build the correct UPDATE statement for us.
Here is an example of how you could do this in a stored procedure:
CREATE PROCEDURE updateUser @id INT, @FirstName VARCHAR(50), @LastName VARCHAR(50), @EmailAddress VARCHAR(50) AS SET NOCOUNT ON --Variables to hold the updated status DECLARE @u_FirstName BIT, @u_LastName BIT, @u_EmailAddress BIT --Check to see which values were updated SELECT @u_FirstName = CASE FirstName WHEN @FirstName THEN 0 ELSE 1 END ,@u_LastName = CASE LastName WHEN @LastName THEN 0 ELSE 1 END ,@u_EmailAddress = CASE EmailAddress WHEN @EmailAddress THEN 0 ELSE 1 END FROM Users WHERE @id = id --If none of the values were updated return IF (@u_FirstName = 0 AND @u_LastName = 0 AND @u_EmailAddress = 0) RETURN DECLARE @SQL NVARCHAR(4000) SET @SQL = ' DECLARE @first bit --for the first value UPDATE Users SET @first = 1 ' IF @u_FirstName = 1 SET @SQL += ' ,FirstName = @FirstName ' IF @u_LastName = 1 SET @SQL += ' ,LastName = @LastName ' IF @u_EmailAddress = 1 SET @SQL += ' ,EmailAddress = @EmailAddress ' SET @SQL += ' WHERE id = @id' PRINT @SQL PRINT '' EXEC SP_EXECUTESQL @SQL, N'@FirstName varchar(50), @LastName varchar(50), @EmailAddress varchar(50), @id int', @FirstName, @LastName, @EmailAddress, @id GO
This trigger will show which columns have been updated:
CREATE TRIGGER t_Users ON Users AFTER UPDATE AS SET NOCOUNT ON IF UPDATE(FirstName) PRINT 'FirstName updated' IF UPDATE(LastName) PRINT 'LastName updated' IF UPDATE(EmailAddress) PRINT 'EmailAddress updated' GO
To summarize, limiting the columns in the SET portion of the UPDATE statement will reduce locking, minimize index updates, and increase concurrency.
Let me know if I’ve left anything out.