It is important to be able to switch the language the user sees quickly and easily. The requirement is that the user can select the preferred language while the program is running, and the entire user interface should instantly change to the new language without changing or interrupting anything that is going on. It is not satisfactory to stop the program and restart it, or run another version, to change the language displayed.
This week I added support for the German language, in addition to English and Chinese. This went very smoothly, once I had obtained the German translation. So, here is my recipe for multi-language support in a C++ program built with Microsoft Visual Studio.
Create a table that assigns a number to every character string displayed by the user interface. Each language has its own base number, and each translation of a string is given a number with the same offset from its language base. For a program that supports English and German, I might choose that the English base number is 40000 and German is 70000. So the English string "Run" might be given the number 40131 and the German string "Geführt" the number 70131.
The numbers are arbitrary, but there are a couple of things to watch out for. The numbers 1000 and upwards are used by Microsoft Visual Studio for all sorts of purposes, so it is best to stay away from this area; starting at 40000 works fine. The language base numbers must be far enough apart that there is no chance that you will run out of room between them; a separation of 10000 should be enough. One caveat: the resource compiler documentation describes string IDs as unsigned 16-bit integers, so it is safest to choose bases that keep every ID at or below 65535.
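The arithmetic behind the numbering scheme can be sketched as a small helper. This is only an illustration, assuming the base numbers used in this article; the names kEnglishBase, kGermanBase, kLanguageSpacing and stringId are hypothetical and not part of MFC or any other library.

```cpp
#include <cassert>

// Language base numbers taken from the article (names are illustrative).
constexpr int kEnglishBase = 40000;
constexpr int kGermanBase  = 70000;

// Assumed room reserved for the strings of one language.
constexpr int kLanguageSpacing = 10000;

// Returns the string table number for a language base and a string offset.
// The assert catches an offset that would spill into the next language.
inline int stringId(int languageBase, int offset)
{
    assert(offset >= 0 && offset < kLanguageSpacing);
    return languageBase + offset;
}
```

With these values, stringId(kEnglishBase, 131) gives 40131 and stringId(kGermanBase, 131) gives 70131, matching the "Run" example.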
The table of numbered strings is saved in a text file which looks like this:
STRINGTABLE
BEGIN
40131 "Run"
END
STRINGTABLE
BEGIN
70131 "Geführt"
END
The text file containing the numbered string table is a resource which is compiled by the resource compiler and linked to the rest of the program. However, it is maintained and edited with a text editor and must be protected from being changed by the Visual Studio resource editor. Do this by naming the file language.rc and storing it in the res subfolder of the project directory. The resource compiler reaches the file through an #include line added to <project folder>/res/<projectname>.rc2
The numbered string table is used by code like this:
SetDlgItemText( IDC_RUN,
CString(MAKEINTRESOURCE( myLanguage + 131 ) ) );
The global variable myLanguage contains the base number of the currently selected language. This code must be called every time the GUI is redrawn and also each time the user changes the selected language.
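The effect of changing myLanguage can be modelled outside MFC. The sketch below is not the MFC code path: a std::map stands in for the compiled string table, and textFor is a hypothetical helper that plays the role of CString(MAKEINTRESOURCE( myLanguage + offset )).

```cpp
#include <map>
#include <string>

// Stand-in for the compiled string table (an assumption for illustration;
// in the real program the strings live in language.rc).
const std::map<int, std::wstring> kStringTable = {
    { 40131, L"Run" },
    { 70131, L"Gef\u00FChrt" },
};

// Base number of the currently selected language, as in the article.
int myLanguage = 40000;

// Looks up the string at the given offset in the current language.
// Returns an empty string if the table has no entry for that number.
std::wstring textFor(int offset)
{
    const auto it = kStringTable.find(myLanguage + offset);
    return it == kStringTable.end() ? std::wstring() : it->second;
}
```

The calling code never changes: textFor(131) yields the English string while myLanguage is 40000 and the German string once myLanguage is set to 70000.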
It is convenient for the user if, when the program starts, it remembers the language that was selected last time it was run. When the user changes the language, call this line:
AfxGetApp()->WriteProfileInt(L"startup", L"language", myLanguage );
And when the program starts:
myLanguage = AfxGetApp()->GetProfileInt(L"startup", L"language", 40000 );
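A stored value could in principle be stale, for example if a language was removed between versions. One possible guard, offered as a sketch rather than something the program above does, is to check the value against the known bases before using it. The names kLanguageBases and validLanguageBase are hypothetical.

```cpp
#include <algorithm>
#include <array>

// Base numbers of the supported languages, taken from the article.
constexpr std::array<int, 3> kLanguageBases = { 40000, 60000, 70000 };

// Returns storedBase if it is a known language base,
// otherwise falls back to defaultBase (English by default).
int validLanguageBase(int storedBase, int defaultBase = 40000)
{
    const bool known = std::find(kLanguageBases.begin(), kLanguageBases.end(),
                                 storedBase) != kLanguageBases.end();
    return known ? storedBase : defaultBase;
}
```

In the MFC program, the result of GetProfileInt would be passed through this helper before being assigned to myLanguage.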
There is a temptation to use defines to replace the string number offsets ( e.g. 131 ) with symbolic constants ( STR_RUN ). I recommend against doing this. It is just another table which must be maintained and once there are more than a few dozen strings, maintenance becomes a pain. The numbered string table is self documenting and, if you are careful assigning the resource IDs ( IDC_RUN ) and use plenty of comments, the code will be self documenting, despite the sprinkling of mysterious numbers ( 131 ) throughout.
Now, we come to the support of Chinese and other East Asian languages. Out of the box, Windows will not even display East Asian characters. Here is a link to advice from Robert Y Eng on switching on this support.
The next problem is how to represent the Chinese characters. There are several alternatives here and many technical details. It is easy to get lost for many days in researching and evaluating the alternatives ( I did! ). I am simply going to describe what I do.
The Chinese character strings are represented by 16-bit Unicode numbers, using escaped hexadecimal. They look like this:
60131 L"\x8FD0\x884C"
This produces a couple of hieroglyphics which, I am assured, mean “Run” to anyone who can read them.
The advantage of this method is that you just have to add another language base number ( in my case 60000 ) for Chinese and immediately, magically the program displays Chinese characters in all the appropriate places on any computer with East Asian languages switched on. No new code is required.
The disadvantage of this method is that you probably will not receive the Chinese strings from the translator in this form. Since there are so many different ways to represent Chinese characters, this problem will probably arise no matter what scheme you choose. I have been doing this for less than a year, and already have received Chinese translations in several different formats which require some hacking about to decode. I cannot give details of all the different possibilities, but here is some general advice.
The first thing is to determine if the characters are being represented with fixed-width 16-bit numbers. If they are, then you need to convert them into escaped hexadecimal ASCII character strings.
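For the fixed-width case, the conversion to escaped hexadecimal can be sketched as follows. This is a minimal example, assuming the 16-bit values are already in memory as a std::u16string; the function name toEscapedHex is hypothetical.

```cpp
#include <cstdio>
#include <string>

// Converts 16-bit code units into the escaped form used in the string table.
// Every unit is escaped, including plain ASCII, to keep the output regular.
std::string toEscapedHex(const std::u16string& units)
{
    std::string out = "L\"";
    for (char16_t unit : units) {
        char buf[8];  // room for a backslash, 'x', four hex digits and NUL
        std::snprintf(buf, sizeof buf, "\\x%04X", static_cast<unsigned>(unit));
        out += buf;
    }
    out += '"';
    return out;
}
```

Given the two code units 0x8FD0 and 0x884C, the function returns the text L"\x8FD0\x884C", which is the form shown for string 60131.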
The other format that you will often see is variable width multibyte numbers, often called UTF-8. These need to be converted. Here is a straightforward manual procedure.
• Paste into Notepad
• Clean up so that everything is as regular as possible
• Save as Unicode big-endian
• Open in a hex editor
• Copy and paste the required code string into the string table file, escaping as you go.
Obviously, this procedure is only feasible for a small number of strings. If you need to automate this procedure, contact me.
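One way the manual procedure might be automated is sketched below: decode the UTF-8 bytes into code points and write each one as an escaped 16-bit number. This is an assumption-laden sketch, not the tool I use; the names appendEscaped and utf8ToEscapedHex are hypothetical, overlong encodings are not rejected, and code points above U+FFFF are written as a surrogate pair without checking whether the resource compiler accepts them.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <stdexcept>
#include <string>

// Appends one 16-bit code unit in \xNNNN form.
static void appendEscaped(std::string& out, std::uint32_t unit)
{
    char buf[8];
    std::snprintf(buf, sizeof buf, "\\x%04X",
                  static_cast<unsigned>(unit & 0xFFFFu));
    out += buf;
}

// Converts a UTF-8 byte string into an escaped wide string literal.
// Throws std::runtime_error on a bad lead byte, a bad continuation byte
// or a truncated sequence.
std::string utf8ToEscapedHex(const std::string& utf8)
{
    std::string out = "L\"";
    for (std::size_t i = 0; i < utf8.size(); ) {
        const unsigned char lead = static_cast<unsigned char>(utf8[i]);
        std::uint32_t cp = 0;
        std::size_t extra = 0;  // number of continuation bytes expected
        if (lead < 0x80)                { cp = lead;        extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        else throw std::runtime_error("invalid UTF-8 lead byte");

        if (i + extra >= utf8.size())
            throw std::runtime_error("truncated UTF-8 sequence");

        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(utf8[i + k]);
            if ((c & 0xC0) != 0x80)
                throw std::runtime_error("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (c & 0x3Fu);
        }
        i += extra + 1;

        if (cp >= 0x10000) {
            // Outside the 16-bit range: split into a surrogate pair.
            cp -= 0x10000;
            appendEscaped(out, 0xD800 + (cp >> 10));
            appendEscaped(out, 0xDC00 + (cp & 0x3FF));
        } else {
            appendEscaped(out, cp);
        }
    }
    out += '"';
    return out;
}
```

Fed the six UTF-8 bytes E8 BF 90 E8 A1 8C, the function returns L"\x8FD0\x884C", ready to paste into the string table after the string number.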