In this post I will discuss few interesting facts and problems with NULL in sql server .
# temp contains below data.
Now ,
#temp1's data
Null signifies ‘absence
of any data value’ .In other words value is missing or value is unknown.
It is important to understand that NULL is not zero or empty value .
NULL propagation
NULL propagation
An expression involving NULL will result in NULL .(with few
exception which I will discuss in part 2).This is called NULL propagation.
As I said NULL means value is unknown, so if we try to do
some operation on unknown data it should give me another unknown data, isn't it ?
Eg. Try below in sql server
NULL + 100 -> add 100 to an unknown value. Clearly I ll get another unknown value i.e. NULL
NULL + 100 -> add 100 to an unknown value. Clearly I ll get another unknown value i.e. NULL
Similarly NULL * 10 -> NULL
NULL * 0 -> NULL (Interesting)
NULL / 0 -> NULL (Amazing...no divide by zero exception!!!)
On a lighter note, it seems for MS SQL server creators NULL is bigger than 0 .
Three valued logic:
We know about Boolean algebra. The typical TRUE/FALSE thing.
Here we compare two values which are known. It results in either TRUE or FALSE.
Now if we do comparison with a NULL , then !!!We can’t get a Boolean value . Rather it results in UNKNOWN.So we now have 3 values possible from a comparison
TRUE/FALSE/UNKNOWN . Its not surely bollean logic . Its three valued logic.
Lets see some example .
DECLARE @var as VARCHAR(10)=NULL
IF @var='satya'
SELECT 1
ELSE
SELECT 0
Here
@var is NULL . I am trying to compare NULL with a known value. So as I said it
will result as UNOWN. Sql server passes anything other than TRUE to else clause
. So output is 0 .
Similarly for case statement any comparison not TRUE will go to else part.
Similarly for case statement any comparison not TRUE will go to else part.
SELECT CASE WHEN @var='satya' THEN 1 ELSE 0 END
Same is for while
loop too. It will not enter the loop if comparison involves a NULL .
As 2 valued logic
has a truth table , our three valued logic too have a truth table.
AND
|
TRUE
|
UNKNOWN
|
FALSE
|
TRUE
|
TRUE
|
UNKNOWN
|
FALSE
|
UNKNOWN
|
UNKNOWN
|
UNKNOWN
|
FALSE*
|
FALSE
|
FALSE
|
FALSE*
|
FALSE
|
*Simple rule of AND operator, if one expression results false whole
operation is false
OR
|
TRUE
|
UNKNOWN
|
FALSE
|
TRUE
|
TRUE
|
TRUE*
|
TRUE
|
UNKNOWN
|
TRUE*
|
UNKNOWN
|
UNKNOWN
|
FALSE
|
TRUE
|
UNKNOWN
|
FALSE
|
*Simple rule of OR operator, if one expression results TRUE whole
operation is true
NOT
TRUE
|
FALSE
|
FALSE
|
TRUE
|
UNKNOWN
|
UNKNOWN
|
Let’s have an
example to demonstrate .
CREATE TABLE #temp
(
name varchar(20),
age int
)
INSERT INTO #temp VALUES('satya',26)
INSERT INTO #temp VALUES('biswa',29)
INSERT INTO #temp VALUES('panu',15)
INSERT INTO #temp VALUES('rahul',NULL)
INSERT INTO #temp VALUES('devi',35)
# temp contains below data.
Satya
|
26
|
Biswa
|
29
|
Panu
|
15
|
Rahul
|
NULL
|
Devi
|
35
|
Now ,
SELECT * FROM #temp WHERE age>10 AND name='rahul'
Here age>10
evaluates to UNKNOWN and name='rahul' evaluates to TRUE at 4th row.
So comparison is done as UNKNOWN AND TRUE. So from truth table it is UNKNOWN. So it will not be included in result set.
So comparison is done as UNKNOWN AND TRUE. So from truth table it is UNKNOWN. So it will not be included in result set.
Now let’s change a
bit
SELECT * FROM #temp WHERE age>10 OR name='rahul'
Here for 4th
row comparison is done as 'UNKNOWN OR TRUE' .From truth table it is TRUE.So 4th
row will be included in resultset.
Similar test can
be done while checking a query against a table with NULL data..
The NOT IN problem.
Lets create
another temporary table in addition to #temp.
CREATE TABLE #temp1
(
newage int
)
INSERT INTO #temp1 VALUES(26)
INSERT INTO #temp1 VALUES(15)
INSERT INTO #temp1 VALUES(NULL)
#temp1's data
26
|
15
|
NULL
|
Now,
SELECT * FROM #temp WHERE age IN(SELECT newage from
#temp1)
Will
translate to
SELECT * FROM #temp WHERE age = 26 OR age =15 OR
age = NULL
Here
where age=26 , comparison is done as TRUE OR FALSE OR UNKNOWN.->TRUE
For
age = 15 , its FALSE OR TRUE OR UNKNOWN->TRUE
For
age = NULL , its UNKNOWN OR UNKNOWN OR UNKNOWN -> UNKNOWN , so excluded
So
RESULT :
satya
|
26
|
panu
|
15
|
Now
see following
SELECT * FROM #temp WHERE age NOT IN(SELECT newage from #temp1)
Translates to
SELECT * FROM #temp WHERE age <> 26 AND age <>15 AND age
<> NULL
When
age = 26 , it is doing FALSE AND TRUE AND UNKNOWN -> FALSE
When
age = 15 , its TRUE AND FALSE AND UNKNOWN -> FALSE
WHEN
age=NULL , its UNKNOWN OR UNKNOWN OR UNKNOWN -> UNKNOWN
But
when age = 29 , its TRUE AND TRUE AND UNKNOWN -> UNKNOWN , so excluded(!!!)
So we ll have empty result set (which is wrong). So we need to be careful when using NOT IN.
One
solution is
SELECT * FROM
#temp WHERE age NOT
IN(SELECT
newage from #temp1 where
newage IS NOT NULL)
(We will see IS NOT NULL in detail in part2)
O/P
biswa
|
29
|
devi
|
35
|
But this will miss NULL rows . So
preferable way is to use NOT EXISTS
SELECT * FROM
#temp WHERE
NOT EXISTS(SELECT 1 from #temp1 WHERE age=newage)
Here any 'age' in table #temp for which the subquery doesnot return any row will be included in resultset .
SELECT 1 from #temp1 WHERE age=newage Translates to
SELECT 1 from #temp1 WHERE age=26 OR
age=15 OR age=NULL
For
age=NULL , it is doing UNKNOWN OR UNKNOWN OR UNKNOWN -> UNKNOWN
So
the row with a null will be included in final output of
SELECT * FROM
#temp WHERE
NOT EXISTS(SELECT 1 from #temp1 WHERE age=newage)
O/P
biswa
|
29
|
rahul
|
NULL
|
devi
|
35
|
Clearly
this approach preferable as it will include null row .If requirement needs rows
without null we can easily modify it to our needs.
So always prefer to use NOT EXISTS ( I never use NOT IN in my stored procedures due to this reason) .
So always prefer to use NOT EXISTS ( I never use NOT IN in my stored procedures due to this reason) .
interesting and very analytical post..gud work buddy..
ReplyDelete@Anonymous
ReplyDeletethanks mate