Monday, 19 November 2012

Storing and Retrieving Non-English Unicode Characters (Hindi, Czech, Arabic etc.) in SQL Server

If you have a requirement to store and retrieve any other language characters in SQL Server besides English, you must do the following -
  1. Use a Unicode compatible data-type for the table column. NVACHAR, NCHAR, NTEXT are the data-types in SQL Server that can be used for storing non-English characters.
  2. Precede the Unicode data values with a N (capital letter) to let the SQL Server know that the following data is from Unicode character set. More details in this MSDN article. The N should be used even in the WHERE clause.
If the correct data-type is not used or the data is not preceded with a N, the table will save it as ‘?’ or other garbled character.
I have left out far-east languages like Japanese and Chinese from the following example on purpose because those languages have a few other considerations that I’ll save for another blog post.
I have used Google Translate to get the characters of other languages.
01DROP TABLE dbo.unicodeData
02GO
03CREATE TABLE dbo.unicodeData
04( languageUsed VARCHAR(50)
05, unicodeData NVARCHAR(200)
06, nonUnicodeData VARCHAR(200) -- same data in a normal VARCHAR column for comparison
07, comments VARCHAR(100)
08)
09GO
10INSERT INTO dbo.unicodeData (languageUsed, unicodeData, nonUnicodeData, comments)
11VALUES
12 ('English', N'This is an example', N'This is an example', NULL)
13, ('Hindi', N'यह एक उदाहरण है.', N'यह एक उदाहरण है.', 'Using the preceding N in both strings but VARCHAR is still a ?')
14, ('Hindi', 'यह एक उदाहरण है.', 'यह एक उदाहरण है.', 'Not using the preceding N in both strings so both are a ?')
15, ('Kannada', N'ಈ ಒಂದು ಉದಾಹರಣೆಯಾಗಿದೆ.', N'ಈ ಒಂದು ಉದಾಹರಣೆಯಾಗಿದೆ.', NULL)
16, ('Arabic', N'هذا مثال على ذلك.', N'هذا مثال على ذلك.', NULL)
17, ('Czech', N'To je příklad.', N'To je příklad.', NULL);
18GO
19SELECT * FROM dbo.unicodeData;
20GO
21-- Example of using N' in the WHERE clause
22SELECT * FROM dbo.unicodeData
23WHERE unicodeData like N'%एक%';
Unicode Results

Unicode Results
Further reading:
Reference: http://aalamrangi.wordpress.com/2012/05/13/storing-and-retrieving-non-english-unicode-characters-hindi-czech-arabic-etc-in-sql-server/

No comments:

Post a Comment