Write differences between varchar and nvarchar
Currently in our SQL Server 2012 database we're using varchar, and we'd like to change that to nvarchar. I've generated a script to do that.

My question is: are there any differences in how SQL Server writes to varchar columns vs. nvarchar columns? We have a number of backend procedures that I'm concerned about.

Edit:
Not sure if this helps, but the columns don't have indexes, foreign keys, or constraints on them.

You need to be sure that you prefix Unicode string literals with N. For example, these will work differently if the underlying data type is NVARCHAR:

    CREATE TABLE dbo.t(c NVARCHAR(32));

    -- No N prefix: the literal is varchar before it reaches the column
    INSERT dbo.t(c) SELECT 'រៀន';
    INSERT dbo.t(c) SELECT 'នរៀ';
    -- N prefix: the literal is nvarchar
    INSERT dbo.t(c) SELECT N'រៀន';

    SELECT c FROM dbo.t;
    SELECT c FROM dbo.t WHERE c = 'រៀន';
    SELECT c FROM dbo.t WHERE c = N'រៀន';
Results:
    c
    ----
    ???   -- not stored correctly
    ???   -- not stored correctly
    រៀន   -- stored correctly!

    c
    ----
    ???
    ???   -- probably not expected, however all Unicode characters have been changed to ?

    c
    ----
    រៀន
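One way to see why the unprefixed inserts go wrong is to inspect the literal itself, before it ever reaches the column. This is a minimal sketch, assuming a database whose default collation uses a code page with no Khmer characters (the usual case on SQL Server 2012); it only looks at literals and does not touch dbo.t.

```sql
-- The type of a string literal is fixed by its prefix, not by the target column.
SELECT SQL_VARIANT_PROPERTY('abc',  'BaseType') AS no_prefix,  -- varchar
       SQL_VARIANT_PROPERTY(N'abc', 'BaseType') AS n_prefix;   -- nvarchar

-- Without N, the character is converted to the database code page first.
-- Under the assumption above, the first column is 63, the code for '?',
-- while the second is the real Khmer code point.
SELECT UNICODE('រ')  AS no_prefix_code_point,
       UNICODE(N'រ') AS n_prefix_code_point;
```

If both columns of the second query return the same value on your server, the database collation can represent the character and the assumption above does not hold for you.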
Thanks for this. How will this work with parameters? For example: select * from MyTable where emailAddress = @emailAddress, where I set the parameter with Parameters.AddWithValue("@emailAddress", emailAddress).
The biggest concern is that nvarchar uses 2 bytes per character, whereas varchar uses 1. Thus, nvarchar(4000) uses the same amount of storage space as varchar(8000)*.

In addition to all of your character data needing twice as much storage space, this also means:
- You may have to use shorter nvarchar columns to keep rows within the 8060-byte row limit/8000-byte character column limit.
- If you're using nvarchar(max) columns, they will be pushed off-row sooner than varchar(max) would.
- You may have to use shorter nvarchar columns to stay within the 900-byte index key limit (I don't know why you would want to use such a large index key, but you never know).
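The doubling described above is easy to confirm with DATALENGTH, which reports bytes, next to LEN, which reports characters. A minimal sketch using two local variables (the variable names are only for illustration):

```sql
DECLARE @v varchar(20)  = 'hello';
DECLARE @n nvarchar(20) = N'hello';

SELECT LEN(@v)        AS varchar_chars,   -- 5
       DATALENGTH(@v) AS varchar_bytes,   -- 5
       LEN(@n)        AS nvarchar_chars,  -- 5
       DATALENGTH(@n) AS nvarchar_bytes;  -- 10
```

Running the same comparison with SUM(DATALENGTH(column)) on a real column gives a rough idea of how much its data would grow after the conversion.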
Besides that, working with nvarchar isn't much different, assuming your client software is built to handle Unicode. SQL Server will transparently upconvert a varchar to nvarchar, so you don't strictly need the N prefix for string literals unless you're using 2-byte (i.e. Unicode) characters in the literal. Be aware that casting nvarchar to varbinary yields different results than doing the same with varchar. The important point is that you won't have to immediately change every varchar literal to an nvarchar literal to keep the application working, which helps ease the process.

* If you use data compression (the lightweight row compression is enough, Enterprise Edition required before SQL Server 2016 SP1) you will usually find nchar and nvarchar take no more space than char and varchar, due to Unicode compression (using the SCSU algorithm).
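Two of the points above can be tried out directly: the varbinary difference and row compression. The sketch below assumes a hypothetical table, dbo.CompressionDemo, created only for the illustration; the compression statements need an edition that supports data compression, as noted in the footnote.

```sql
-- Same text, different bytes: one byte per character for varchar,
-- UTF-16 little-endian for nvarchar.
SELECT CAST('AB'  AS varbinary(8)) AS from_varchar,   -- 0x4142
       CAST(N'AB' AS varbinary(8)) AS from_nvarchar;  -- 0x41004200

-- Hypothetical table, used only to illustrate row compression.
CREATE TABLE dbo.CompressionDemo (c nvarchar(100) NOT NULL);

-- Estimate the effect of row compression before changing anything.
EXEC sys.sp_estimate_data_compression_savings
     @schema_name      = N'dbo',
     @object_name      = N'CompressionDemo',
     @index_id         = NULL,
     @partition_number = NULL,
     @data_compression = N'ROW';

-- Rebuild the table with row compression, which brings Unicode compression with it.
ALTER TABLE dbo.CompressionDemo REBUILD WITH (DATA_COMPRESSION = ROW);
```

Any code that hashes or compares the varbinary form of a string is worth reviewing before the conversion, since its output will change along with the column type.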
I think the following are the major differences:
- Nvarchar stores Unicode data. If you have requirements to store Unicode or multilingual data, nvarchar is the choice. Varchar stores non-Unicode data (ASCII plus whatever the column's code page covers) and should be your data type of choice for normal use.
- Regarding storage, nvarchar uses 2 bytes per character, whereas varchar uses 1.
- Joining a VARCHAR column to an NVARCHAR column can carry a considerable performance hit, because one side has to be implicitly converted.
- You might need an N prefix when inserting data: INSERT dbo.t(c) SELECT N'ʤ ʥ ʦ ʧ ʨ';
- Some experts recommend always using nvarchar: since all modern operating systems and development platforms use Unicode internally, using nvarchar rather than varchar avoids encoding conversions every time you read from or write to the database.
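The join cost mentioned in the list comes from data type precedence: nvarchar outranks varchar, so the varchar side is the one that gets converted. A minimal sketch, using two hypothetical temp tables (#a and #b) that exist only for this illustration:

```sql
DECLARE @v varchar(20)  = 'abc';
DECLARE @n nvarchar(20) = N'abc';

-- A mixed expression takes the higher-precedence type.
SELECT SQL_VARIANT_PROPERTY(@v + @n, 'BaseType') AS mixed_type;  -- nvarchar

CREATE TABLE #a (code varchar(20)  NOT NULL PRIMARY KEY);
CREATE TABLE #b (code nvarchar(20) NOT NULL PRIMARY KEY);
INSERT #a (code) VALUES ('abc');
INSERT #b (code) VALUES (N'abc');

-- Implicit conversion: a.code is converted to nvarchar for the comparison.
SELECT a.code
FROM #a AS a
JOIN #b AS b ON a.code = b.code;

-- Explicit conversion of the nvarchar side instead. Only safe if b.code
-- holds no characters outside the varchar code page.
SELECT a.code
FROM #a AS a
JOIN #b AS b ON a.code = CAST(b.code AS varchar(20));

DROP TABLE #a;
DROP TABLE #b;
```

Comparing the execution plans of the two joins on realistic data is the way to judge how much the conversion actually costs in a given case; converting both columns to the same type removes the question entirely.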
nvarchar was required for RDP Merge Replication from a Mobile DB to SQL Server 2005. Also, LTrim(), RTrim() & Trim() were used a lot because nvarchar didn't automatically trim spaces from data entry, whereas varchar did.
I am not aware whether that has changed in recent years, but nvarchar is now the standard in the database generated for .NET Simple Membership website logins in VS Pro 2017.
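Whether trailing spaces survive an insert can be checked directly rather than remembered. This is a minimal sketch with a hypothetical temp table, #pad. For the varchar column the outcome depends on the ANSI_PADDING setting in effect when the table is created, so treat it as a test to run on your own server.

```sql
CREATE TABLE #pad (v varchar(10), n nvarchar(10));
INSERT #pad (v, n) VALUES ('abc  ', N'abc  ');

-- LEN ignores trailing spaces; DATALENGTH counts the bytes actually stored.
-- If the two trailing spaces were kept, v_bytes is 5 and n_bytes is 10.
SELECT LEN(v)        AS v_chars,
       DATALENGTH(v) AS v_bytes,
       LEN(n)        AS n_chars,
       DATALENGTH(n) AS n_bytes
FROM #pad;

DROP TABLE #pad;
```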
If you use nvarchar over varchar and you have no requirement to support multilingual data, you increase storage for the database and its backups (local and offsite). Modern databases should support both, and any conversion hits should be considered in the design.
Licensed under CC-BY-SA with attribution
Content dated before 6/26/2020 9:53 AM
Aaron Bertrand 2 years ago
Also see https://dba.stackexchange.com/questions/162113/would-using-varchar5000-be-bad-compared-to-varchar255/