SQL Server Unicode Madness: N'' is Null, but not Quite, Null is not Null, and N'' is N'Ǹ' ... What?!?!
Recently a customer asked me funny things. He came across with the following queries:
declare @tmp table (charvalue nvarchar(10), othervalue int);
insert into @tmp values (N'', 4326);
So far, so good. But querying this tiny table variable in a certain way can be very surprising:
select * from @tmp where charvalue = N'Ǹ';
Note the N has an "accent grave".
We expect an empty result set, because an N with whatever accent on its top is very much different from an empty string, right?
This came out:
charvalue othervalue
4326
So, I thought this could need some further investigation. I wanted to find out what the unicode value for "charvalue" would be:
select unicode(charvalue),othervalue from @tmp
Surprising result:
othervalue
NULL 4326
So, according to this result, N'' is null. My professor always told me it's not.
Ok, let's check that:
select * from @tmp where charvalue is null;
This returns
charvalue othervalue
Empty record set... Oops... this means that NULL is not NULL!
Confused? So am I.
Anyways, my customer asked me for a solution as he did not want the above record in his result set when he was filtering for N``'Ǹ'
.
I proposed this type of query to him:
select * from @tmp where charvalue = N'Ǹ' and charvalue !=N'';
This works.
Well, if you don't like it try this which also works fine:
select * from @tmp where charvalue = N'Ǹ' collate SQL_Latin1_General_CP850_BIN2
Both, SQL Server 2008R2 and 2012 have the same opinion on this matter.
Now, is it a bug or a feature?
======================== Update
I filed a connect item and received the following answer:
Quote:
"Thanks for submitting this feedback.
The behavior you are seeing is in fact by design if you are using the default collation for most locales (SQL_Latin1_general_CP1_CI_AS).
In general, I would recommend using a Windows collation at the instance or database level, for example Latin1_general_100_CI_AS_KS_WS.
You can also specify the collation in the query. For example, the following will return an empty result set.
declare @tmp table (charvalue nvarchar(10), othervalue int);
insert into @tmp values (N'', 4326);
select * from @tmp where charvalue collate Latin1_General_100_CI_AS_KS_WS = N'Ǹ';
--
Jos de Bruijn - SQL Server PM "
:Endquote
see: Unicode string comparison in WHERE clause fails with certain characters