We expect developers to be able to create stored procedures, write moderately complex SQL statements, and even the occasional trigger where needed. One question I like to ask goes something like this:
"Let's take a scenario using SQL Server 2000 where I'll be passing in two variables (firstname and lastname) to a stored procedure. That procedure should insert them into a table called TEST that has columns also called firstname and lastname. Table TEST has a primary key column named ContactID which is an integer and is also marked as an identity column. How would I obtain and return the primary key for the row just inserted?"
Stop for a moment and see if you know the answer. Do you know how to create the stored procedure? Obtain the value? Return it to the calling application?
A fair question to ask me is - why is this important? For me, it's a fundamental test to see if someone has worked with data in anything beyond a trivial way. Take the standard order/order detail scenario - how do you insert the details if you don't have the primary key of the order? And while you may have had the luck (good?) to work on a system with natural keys, not every system uses them and identities are the most common way of solving key generation in SQL. More importantly, if you ever do work on a system where identities are used and you rely on @@Identity, you could get some very unusual results at some point in the future when someone adds an auditing trigger. It's not a deal breaker question, but it's an interesting one to lead them into a conversation about dealing with related tables.
I get a variety of answers and most of them are shall we say less than optimum. Almost everyone figures how to insert the values and knows to use either an output or return value, but almost everyone trips on the identity portion.
Wrong Answer #1 - Select max(contactid) from Test. This is wrong because it assumes that no one else will be inserting a row. I suppose you could make it work if you used the right isolation level, but doing that will most likely reduce your concurrency. It's also doing more than you need to.
Wrong Answer #2 - Select top 1 contactid from test order by contactid desc. This is wrong for the same reasons described above.
Wrong Answer #3 - Select the row back by querying on other data you inserted into the table, essentially saying that you inserted an alternative primary key made of one or more columns. This would work if your data supported it and guaranteed that those values were indeed unique. Still not a good idea.
Wrong Answer #4 - In this one they almost get it right. They suggest using @@Identity which will work of course (with caveats), but when I ask them if they are any concerns with this technique, I usually get one of the following:
- No, there are no concerns
- You have to query it quickly because it is a database wide setting and you have to get the value before someone else inserts a row into any table in the database.
- Yes, it retrieves the last identity value for the session which is usually the value you want, but could be incorrect if you had a trigger on TEST which inserted rows into another table that also had an identity column. In that case you'd get the identity value from that table instead of TEST (Note: this correctly describes the behavior @@identity exhibits).
Right Answer - Use Scope_Identity() because it's SQL 2000, use @@Identity in SQL 7, and return the result as an output parameter (return value typically reserved for error conditions). Using @@Identity represents a possible bug in the future if auditing were deployed and it used an identity column as well.
Now let's run a couple tests to prove that the right answer is really correct:
create database IdentityTest
use identitytest
create table TEST (ContactID int not null identity (1, 1), firstname varchar(100) null, lastname varchar(100) null)
insert into TEST Default Values
select @@Identity
This will return the value 1. Repeating it will return 2.
insert into TEST Default Values
select Scope_Identity()
This will return a value of 3.
Now let's start by proving that @@Identity can cause strange behavior. We'll create a history table first that has a new identity column, then we'll add an insert trigger to TEST.
create table TESTHISTORY (HistoryID int not null identity (1, 1), ContactID int not null, firstname varchar(100) null, lastname varchar(100) null)
create trigger i_TEST on dbo.TEST for insert as
set nocount on
insert into TESTHISTORY (ContactID, FirstName, LastName) select ContactID, FirstName, LastName from Inserted
Now let's test what happens:
insert into TEST Default Values
select @@Identity
Returns a value of 1. Inspecting TEST shows that the last row we inserted had a value of 4, the only row in TESTHISTORY has a historyid = 1.
insert into TEST Default Values
select Scope_Identity()
( Corrected as correctly pointed out by adk in the comments)
Returns a value of 5.
Inspecting TEST confirms this, and confirms we inserted a second row into TESTHISTORY. Now let's start testing what happens if someone else inserts a row into TEST while we're busily working away in our stored procedure. Using the existing connection, we execute the first part:
insert into TEST Default Values
If we check the table we see that we just inserted row 6. Now open a second connection and execute the same statement:
insert into TEST Default Values
Check the table reveals we just inserted row 7. Now go back to the original connection. We start with someone we know should return the "wrong" result and it does, the value 3.
select @@Identity
Now let's try scope_identity(). If all went well, it should return 6, not 7!
select Scope_Identity()
And it does, supporting the Right Answer detailed above. I know this is SQL trivia, the kind of stuff I think you shouldn't have to delve into, but if you're going to use the platform you have to know how it works. Take this back and quiz your developers, you'll be treating them to some professional development and you may save yourself a large headache one day too.
"Let's take a scenario using SQL Server 2000 where I'll be passing in two variables (firstname and lastname) to a stored procedure. That procedure should insert them into a table called TEST that has columns also called firstname and lastname. Table TEST has a primary key column named ContactID which is an integer and is also marked as an identity column. How would I obtain and return the primary key for the row just inserted?"
Stop for a moment and see if you know the answer. Do you know how to create the stored procedure? Obtain the value? Return it to the calling application?
A fair question to ask me is - why is this important? For me, it's a fundamental test to see if someone has worked with data in anything beyond a trivial way. Take the standard order/order detail scenario - how do you insert the details if you don't have the primary key of the order? And while you may have had the luck (good?) to work on a system with natural keys, not every system uses them and identities are the most common way of solving key generation in SQL. More importantly, if you ever do work on a system where identities are used and you rely on @@Identity, you could get some very unusual results at some point in the future when someone adds an auditing trigger. It's not a deal breaker question, but it's an interesting one to lead them into a conversation about dealing with related tables.
I get a variety of answers and most of them are shall we say less than optimum. Almost everyone figures how to insert the values and knows to use either an output or return value, but almost everyone trips on the identity portion.
Wrong Answer #1 - Select max(contactid) from Test. This is wrong because it assumes that no one else will be inserting a row. I suppose you could make it work if you used the right isolation level, but doing that will most likely reduce your concurrency. It's also doing more than you need to.
Wrong Answer #2 - Select top 1 contactid from test order by contactid desc. This is wrong for the same reasons described above.
Wrong Answer #3 - Select the row back by querying on other data you inserted into the table, essentially saying that you inserted an alternative primary key made of one or more columns. This would work if your data supported it and guaranteed that those values were indeed unique. Still not a good idea.
Wrong Answer #4 - In this one they almost get it right. They suggest using @@Identity which will work of course (with caveats), but when I ask them if they are any concerns with this technique, I usually get one of the following:
- No, there are no concerns
- You have to query it quickly because it is a database wide setting and you have to get the value before someone else inserts a row into any table in the database.
- Yes, it retrieves the last identity value for the session which is usually the value you want, but could be incorrect if you had a trigger on TEST which inserted rows into another table that also had an identity column. In that case you'd get the identity value from that table instead of TEST (Note: this correctly describes the behavior @@identity exhibits).
Right Answer - Use Scope_Identity() because it's SQL 2000, use @@Identity in SQL 7, and return the result as an output parameter (return value typically reserved for error conditions). Using @@Identity represents a possible bug in the future if auditing were deployed and it used an identity column as well.
Now let's run a couple tests to prove that the right answer is really correct:
create database IdentityTest
use identitytest
create table TEST (ContactID int not null identity (1, 1), firstname varchar(100) null, lastname varchar(100) null)
insert into TEST Default Values
select @@Identity
This will return the value 1. Repeating it will return 2.
insert into TEST Default Values
select Scope_Identity()
This will return a value of 3.
Now let's start by proving that @@Identity can cause strange behavior. We'll create a history table first that has a new identity column, then we'll add an insert trigger to TEST.
create table TESTHISTORY (HistoryID int not null identity (1, 1), ContactID int not null, firstname varchar(100) null, lastname varchar(100) null)
create trigger i_TEST on dbo.TEST for insert as
set nocount on
insert into TESTHISTORY (ContactID, FirstName, LastName) select ContactID, FirstName, LastName from Inserted
Now let's test what happens:
insert into TEST Default Values
select @@Identity
Returns a value of 1. Inspecting TEST shows that the last row we inserted had a value of 4, the only row in TESTHISTORY has a historyid = 1.
insert into TEST Default Values
select Scope_Identity()
( Corrected as correctly pointed out by adk in the comments)
Returns a value of 5.
Inspecting TEST confirms this, and confirms we inserted a second row into TESTHISTORY. Now let's start testing what happens if someone else inserts a row into TEST while we're busily working away in our stored procedure. Using the existing connection, we execute the first part:
insert into TEST Default Values
If we check the table we see that we just inserted row 6. Now open a second connection and execute the same statement:
insert into TEST Default Values
Check the table reveals we just inserted row 7. Now go back to the original connection. We start with someone we know should return the "wrong" result and it does, the value 3.
select @@Identity
Now let's try scope_identity(). If all went well, it should return 6, not 7!
select Scope_Identity()
And it does, supporting the Right Answer detailed above. I know this is SQL trivia, the kind of stuff I think you shouldn't have to delve into, but if you're going to use the platform you have to know how it works. Take this back and quiz your developers, you'll be treating them to some professional development and you may save yourself a large headache one day too.
You could read the original article in this link:
http://www.sqlservercentral.com/columnists/awarren/checkyoursqlserveridentity.asp
Thanks to Andy Warren for this article
2 comments:
"[...]the only row in TESTHISTORY has a historyid = 1.
insert into TEST Default Values
select @@Identity
Returns a value of 5. [...]"
That should be "select SCOPE_IDENTITY()" instead of "select @@Identity" above.
Thanks for the great step-by-step test of the differences between the two.
Post a Comment