Monday, September 19, 2011

NVarchar(MAX)/Varchar(Max) in sql server

Quite often we come across datatypes in sql server with a 'Max' as a size !!!
Lets first understand what it stand for.In sql server varchar can have 8000 characters(=8000 bytes), Nvarchar 4000 characters(1 unicode char=2 bytes).In sql server 2005 a new keyword 'MAX' is added aiming to replace older datatypes like TEXT,NTEXT,IMAGE etc.The size of data it can hold is 2 GB , the same size that can be hold by TEXT,NTEXT etc.

Problem with TEXT,NTEXT,IMAGE:
In sql server these 3 datatypes are used as BLOBs(Binary Large Objects).What it means data belonging to these datatypes stored in collection of 8 kb datapages that are separate from datapages that store other data in the same table.A pointer is stored in the original column .So when we try to read the data from table , a LOB table scan(LOB datapages are read) is performed which is expensive.

With Varchar(MAX) data is stored in the same datapages as used by other columns.(of course upto 8kb it is fine , above it sqlserver uses some datapages called as overflow page.But accessing these pages are handled well by sqlserver and no lob datapages needed.).So simple table scan is performed when select statement is used.(faster than a LOB scan)

So when programming sql server we should avoid the use of these BLOB types as these will be removed in future editions of sql server.

An Example:
Take two tables Table_1 and Table_2

Table_1: LobColumn ntext
Table_2:MaxColumn nvarchar(max)

now see the below analisys after inserting say 26 rows

SET STATISTICS IO ON
Select * from Table_1
(26 rows affected)
Table 'Table_1'.Scan count 1 ,logical read 1,physical reads 0, read-ahead reads 0,lob logical reads 52,lob physical reads 0,lob read ahead reads 0

SET STATISTICS IO ON
Select * from Table_2
(26 rows affected)
Table 'Table_2'.Scan count 1 ,logical read 1,physical reads 0, read-ahead reads 0,lob logical reads 0,lob physical reads 0,lob read ahead reads 0

Clearly for Table_2 scan is faster as there is no LOB data read.

Apart from it sql server does not allow BLOB types to be used in stored procedure/user defined functions.Also string library functions like REPLACE etc are not allowd to be used with BLOB types.
NOTE:Always use Varchar unless you need to store multilinguial characters, in which case use Nvarchar.Remember Nvarchar takes more memory than varchar.

How to use MAX datatypes in Stored Procedures:
Consider following statements in a stored procedure :

Declare @maxData nvarchar(max)
SET @maxData='-----some statements--------'

When we use statement in this way , after 4000 chars this will throw exception saying nvarchar can store maximum 4000 characters.
But in principle it should have stored 2GB chars right !!!!
Here LHS is of type nvarchar(max) , RHS is of type simple nvarchar(which can store 4000 bytes).The workaround of the problem is break @maxData and use as below:

@maxData='-----some statements----'
@maxData=@maxData+'-----some statements----'
@maxData=@maxData+'-----some statements----'

In each statement RHS should have a new string with maximum 4000 chars.

(This is a pratical problem I faced in one of my project and found it helpful)
Happy Sql Programming :-)
Please leave an honest comment so that I can improve on my blogging aswell as technical skill :)