To support Unicode, two things need to be done. First, in VC++, open the Project menu and choose Settings. On the C/C++ tab, choose the category General and add _UNICODE (don't forget the underscore) and UNICODE to the Preprocessor definitions, and remove the _MBCS (multi-byte character set) definition. Second, on the Link tab, choose the category Output and set the Entry-point symbol to wWinMainCRTStartup.
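When a Unicode version of the application is to be built, both the Win32 compile-time flag UNICODE and the C run-time compile-time flag _UNICODE must be defined. Defining only one of them leaves the Win32 headers and the C run-time headers disagreeing about the character type.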
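One way to catch a half-configured project is a compile-time guard placed in a common header. The sketch below is only an illustration under that assumption: the helper names is_unicode_build and build_flavor are hypothetical and do not come from any Microsoft header.

```cpp
#include <cassert>  // lets callers assert on the reported flavor at startup

// Both flags must be defined together, or neither of them.
// defined(X) evaluates to 1 or 0, so the two results can be compared directly.
#if defined(UNICODE) != defined(_UNICODE)
#error "Define both UNICODE and _UNICODE, or neither"
#endif

// Hypothetical helper: true when this translation unit is a Unicode build.
constexpr bool is_unicode_build() {
#if defined(UNICODE) && defined(_UNICODE)
    return true;
#else
    return false;
#endif
}

// Hypothetical helper: a printable name for the build flavor.
inline const char* build_flavor() {
    return is_unicode_build() ? "Unicode" : "ANSI/MBCS";
}
```

With only one of the two flags defined, compilation stops at the #error line instead of producing a mixed build.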
GENERAL GUIDELINES TO BE FOLLOWED.
Once _UNICODE and UNICODE have been defined for the project, a few steps need to be taken to ensure string handling is done properly.
The following steps (digested from the <> from MS Press) should be taken:
1. The code should be modified to use generic data types: change char and char* to TCHAR and TCHAR*, which are defined in the Win32 file WINDOWS.H, or to _TCHAR as defined in the Visual C++ file TCHAR.H. Replace instances of LPSTR and LPCH with LPTSTR and LPTCH.
2. The code should be modified to use generic function prototypes. For example, use the C run-time call _tcslen instead of strlen, and use the Win32 API SetWindowText instead of SetWindowTextA.
3. Any character or string literal should be surrounded with the TEXT or _T macro. The macro places an "L" in front of a character literal or a string literal when the Unicode flags are defined, and leaves the literal unchanged otherwise.
4. Pointer arithmetic should be adjusted. Subtracting char* values yields an answer in terms of bytes; subtracting wchar_t* values yields an answer in terms of 16-bit chunks. When determining the number of bytes (for example, when allocating memory for a string), multiply the length of the string in characters by sizeof (TCHAR). When determining the number of characters from the number of bytes, divide by sizeof (TCHAR).
5. Character != byte.
A character is not necessarily one byte. In Asian "multibyte" character encodings, some characters take up 2 bytes or more, while others are one byte each. Do not jump directly into the middle of a byte array. Do not increment a char * pointer by one to move to the next character.
Check for any code that assumes a character is always 1 byte long. Code that assumes a character's value is always less than 256 (for example, code that uses a character value as an index into a table of size 256) must be changed. Make sure the terminating null character is 16 bits long in a Unicode build, not a single zero byte.
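Steps 1 to 4 can be sketched in portable C++ as follows. So that it compiles without the Windows headers, the sketch defines its own stand-ins (my_tchar, MY_T, my_tcslen) that mimic what TCHAR.H provides. These names, and the helpers bytes_for, chars_in and dup_string, are hypothetical; in a real Visual C++ project you would include TCHAR.H and use _TCHAR, _T and _tcslen instead.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <cwchar>

// Hypothetical stand-ins for the TCHAR.H mappings (steps 1 to 3).
#if defined(_UNICODE)
typedef wchar_t my_tchar;
#define MY_T(x) L##x
inline std::size_t my_tcslen(const my_tchar* s) { return std::wcslen(s); }
#else
typedef char my_tchar;
#define MY_T(x) x
inline std::size_t my_tcslen(const my_tchar* s) { return std::strlen(s); }
#endif

// Step 4: bytes needed to store a string, including its terminating null.
inline std::size_t bytes_for(const my_tchar* s) {
    assert(s != nullptr);
    return (my_tcslen(s) + 1) * sizeof(my_tchar);
}

// Step 4, other direction: how many characters fit in a byte count.
inline std::size_t chars_in(std::size_t bytes) {
    return bytes / sizeof(my_tchar);
}

// Allocate by bytes, not by characters. The caller releases with std::free.
inline my_tchar* dup_string(const my_tchar* s) {
    const std::size_t bytes = bytes_for(s);
    my_tchar* copy = static_cast<my_tchar*>(std::malloc(bytes));
    if (copy != nullptr) {
        std::memcpy(copy, s, bytes);  // copies the terminating null as well
    }
    return copy;
}
```

A call such as dup_string(MY_T("hello")) allocates six characters' worth of bytes in either build, because the size is always derived from sizeof(my_tchar) rather than assumed to be one byte per character.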
1. Data types in ANSI and the Unicode equivalent:
S.No. | ANSI | Unicode |
1 | "text" (plain literal) | _T("text") |
2 | LPCSTR (const char *) | LPCTSTR (const _TCHAR *) |
3 | char | _TCHAR |
4 | unsigned char | _TUCHAR |
5 | LPSTR (char *) | LPTSTR (_TCHAR *) |
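To show how the aliases in this table fit together, here is a small sketch. It uses hypothetical stand-ins (my_tchar, my_tuchar, my_lptstr, my_lpctstr, MY_T) in place of the real TCHAR.H and WINDOWS.H names so that it compiles without the Windows headers, and the two helper functions are invented for illustration. It also assumes a single-byte ANSI build or a wide build; in a multibyte (_MBCS) build, stepping through a string one element at a time is not safe.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for the generic types in the table above.
#if defined(_UNICODE)
typedef wchar_t       my_tchar;
typedef wchar_t       my_tuchar;
#define MY_T(x) L##x
#else
typedef char          my_tchar;
typedef unsigned char my_tuchar;
#define MY_T(x) x
#endif
typedef my_tchar*       my_lptstr;   // writable string, like LPTSTR
typedef const my_tchar* my_lpctstr;  // read-only string, like LPCTSTR

// Read-only access: count the ASCII capital letters in a string.
inline std::size_t count_ascii_upper(my_lpctstr s) {
    assert(s != nullptr);
    std::size_t n = 0;
    for (; *s != 0; ++s) {
        // Compare through the unsigned type to avoid sign-extension surprises.
        const my_tuchar c = static_cast<my_tuchar>(*s);
        if (c >= 'A' && c <= 'Z') ++n;
    }
    return n;
}

// Writable access: convert ASCII capital letters to lower case in place.
inline void lower_ascii_in_place(my_lptstr s) {
    assert(s != nullptr);
    for (; *s != 0; ++s) {
        const my_tuchar c = static_cast<my_tuchar>(*s);
        if (c >= 'A' && c <= 'Z') {
            *s = static_cast<my_tchar>(c - 'A' + 'a');
        }
    }
}
```

The same source compiles unchanged in both build flavors; only the typedefs and the literal prefix differ.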
2. Functions in ANSI, and the Unicode equivalent:
S.No. | ANSI | Unicode |
1 | sprintf | _stprintf |
2 | atoi | _ttoi |
3 | _atoi64 | _ttoi64 |
4 | strcpy | _tcscpy |
5 | strcat | _tcscat |
6 | strlen | _tcslen |
7 | fopen | _tfopen |
8 | fprintf | _ftprintf |
9 | atol | _ttol |
10 | strstr | _tcsstr |
11 | ltoa | _ltot |
12 | atof | _tcstod (strtod-style: takes an extra end-pointer argument) |
13 | itoa | _itot |
14 | strncpy | _tcsncpy |
15 | strcmp | _tcscmp |
16 | sscanf | _stscanf |
17 | strchr | _tcschr |
18 | stricmp | _tcsicmp |
19 | strcspn | _tcscspn |
20 | printf | _tprintf |
21 | fgets | _fgetts |
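The mappings in this table are resolved at compile time: each generic name expands to the narrow or the wide routine depending on _UNICODE. The sketch below imitates that mechanism with hypothetical macros (my_tcschr, my_tcsncpy, my_tcscmp, my_ttol) so that it builds without TCHAR.H; split_setting is an invented helper. Note that the wide branch uses the standard wcstol as a portable substitute for the Microsoft-specific routine that _ttol maps to.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <cwchar>

// Hypothetical stand-ins that mimic the generic-text function mappings.
#if defined(_UNICODE)
typedef wchar_t my_tchar;
#define MY_T(x)     L##x
#define my_tcschr   std::wcschr
#define my_tcsncpy  std::wcsncpy
#define my_tcscmp   std::wcscmp
#define my_ttol(s)  std::wcstol((s), nullptr, 10)
#else
typedef char my_tchar;
#define MY_T(x)     x
#define my_tcschr   std::strchr
#define my_tcsncpy  std::strncpy
#define my_tcscmp   std::strcmp
#define my_ttol(s)  std::atol(s)
#endif

// Split "key=value" into a key string and a numeric value.
// Returns false if there is no '=' or the key does not fit in key_out.
inline bool split_setting(const my_tchar* line, my_tchar* key_out,
                          std::size_t key_capacity, long* value_out) {
    assert(line != nullptr && key_out != nullptr && value_out != nullptr);
    const my_tchar* eq = my_tcschr(line, MY_T('='));
    if (eq == nullptr) return false;
    // Pointer subtraction yields a count of characters, not bytes.
    const std::size_t key_len = static_cast<std::size_t>(eq - line);
    if (key_len + 1 > key_capacity) return false;
    my_tcsncpy(key_out, line, key_len);
    key_out[key_len] = 0;  // the copy above does not add a terminator
    *value_out = my_ttol(eq + 1);
    return true;
}
```

Calling split_setting(MY_T("width=640"), key, 16, &value) fills key with "width" and value with 640 in either build flavor, without any change to the calling code.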