Software Globalization With Windows 2000/XP Houman Pournasseh Lead Program Manager.
Agenda
Definitions
Why invest in World-Ready products?
Globalization – step-by-step
  Universal encoding - Unicode
  Locale aware
  Handle different input methods
  Complex script aware
  Font independency
  Multi-lingual UI aware
  Mirroring aware
Conclusion & References
Definitions
World-Ready: Properly globalized and localizable.
Globalization: The process of designing and implementing source code so that it can accommodate any local market (locale) or script.
Localizability: Designing software code and resources such that resources can be localized for any local market (locale) without changing the source code.
Localization: The process of adapting a product (including both text and non-text elements) to meet the language, cultural, and political expectations and/or requirements of a specific local market (locale).
Users and Locales:
To define their geographical location, users set the location.
To define formatting for date, time, and so on, users set the user locale.
To run legacy (non-Unicode) applications, users set the system locale.
To enter text in different languages, users set the input locale.
To select a UI language, users set the UI language.
New to Windows XP
Nine (9) new locales added to the previous list of 126: Punjabi, Gujarati, Telugu, Kannada, Kyrgyz, Mongolian (Cyrillic), Galician, Divehi, Syriac
New Indic and Arabic scripts: Gujarati, Gurmukhi, Telugu, Kannada, Syriac, Divehi
More robust font display for East Asian languages
Improved Regional Settings options
Largely improved MUI support
New location (GEO)
Support for GB18030
Why invest in World Ready products?
Get into international market (World Wide Web era)
Create a single functionality binary to:
  Reduce development effort and cost
  Ease support and maintenance pain
  Sim-ship and avoid being your own competitor
Globalization – step-by-step: Universal encoding - Unicode
Transforms of Unicode
UTF-7: 7-bit transformation format (rare)
UTF-8: 8-bit transformation format
  For transmission over unknown lines, e.g. Web pages
  Codepage number CP_UTF8 = 65001
UTF-16 and UCS-2
  Microsoft uses UTF-16 little-endian as its standard for Unicode encoding
UTF-32 and UCS-4
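To make the 8-bit transformation format concrete, here is a minimal sketch in portable C of how a single code point becomes one to four UTF-8 bytes. The function name utf8_encode is hypothetical and is not a Win32 API; on Windows the conversion would normally go through WideCharToMultiByte with CP_UTF8.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode one Unicode code point as UTF-8 (hypothetical helper).
 * Writes up to 4 bytes into out and returns the byte count, or 0 if
 * cp is not a valid scalar value (a surrogate, or above U+10FFFF). */
size_t utf8_encode(uint32_t cp, unsigned char out[4])
{
    assert(out != NULL);
    if (cp < 0x80) {                       /* 1 byte: plain ASCII */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {                      /* 2 bytes */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp >= 0xD800 && cp <= 0xDFFF)      /* surrogates are not characters */
        return 0;
    if (cp < 0x10000) {                    /* 3 bytes: rest of the BMP */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {                  /* 4 bytes: supplementary planes */
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}
```

For example, U+0915 (DEVANAGARI LETTER KA) encodes to the three bytes E0 A4 95, while every ASCII character stays a single unchanged byte.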
Windows 2000/XP: Unicode & Single Binary
Built-in support for hundreds of languages
A (well-behaved) Win32 application in any language can run on any language version of Windows 2000/XP
Native Unicode support for new scripts
Support for supplementary characters
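Supplementary characters are the code points above U+FFFF, which UTF-16 stores as a pair of 16-bit units (a surrogate pair). The sketch below shows the arithmetic in portable C. The name utf16_encode is a hypothetical helper for illustration, not a Windows API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode one code point as UTF-16 (hypothetical helper).
 * Returns the number of 16-bit units written (1 or 2), or 0 if cp is
 * a lone surrogate or lies above U+10FFFF. */
int utf16_encode(uint32_t cp, uint16_t out[2])
{
    assert(out != NULL);
    if (cp >= 0xD800 && cp <= 0xDFFF)
        return 0;                           /* reserved for surrogates */
    if (cp < 0x10000) {
        out[0] = (uint16_t)cp;              /* BMP: one unit, unchanged */
        return 1;
    }
    if (cp > 0x10FFFF)
        return 0;
    cp -= 0x10000;                          /* 20 bits remain */
    out[0] = (uint16_t)(0xD800 | (cp >> 10));    /* high surrogate: top 10 bits */
    out[1] = (uint16_t)(0xDC00 | (cp & 0x3FF));  /* low surrogate: bottom 10 bits */
    return 2;
}
```

Code that assumes one 16-bit unit per character (for example, when truncating a string or moving a caret) can split such a pair, which is one reason to treat text as strings rather than as arrays of independent characters.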
Unicode Encoding
The behavior of non-Unicode applications depends on the user's settings, which makes data exchange between OS language versions impossible.
Legacy systems support
Few exceptions for not fully Unicode apps:
  App has to run on Win9x and NT
  Existing Internet protocols and standards require special encoding
Supporting apps that need to run on Win9x:
  Create two separate binaries: one ANSI & one Unicode
  Register as ANSI and internally convert to/from Unicode as needed
  Use MSLU (Microsoft Layer for Unicode)!
Data types
For 8 bit and double-byte characters:
typedef char CHAR; // 8 bit character
typedef char *LPSTR; // pointer to 8 bit string
For Unicode (“Wide”) characters:
typedef unsigned short WCHAR; // 16 bit character
typedef WCHAR *LPWSTR; // pointer to 16 bit string
Generic types:
TCHAR maps to wchar_t (Unicode build) or char (ANSI build)
LPTSTR maps to wchar_t * (Unicode build) or char * (ANSI build)
Win32 API prototypes
Generic function prototypes:
// winuser.h
#ifdef UNICODE
#define SetWindowText SetWindowTextW
#else
#define SetWindowText SetWindowTextA
#endif // UNICODE
A routines behavior under Windows 2000/XP
W routines behavior under Win9x
String manipulation functions and macros
Generic CRT     8 bit codepage   Unicode
_tcscpy         strcpy           wcscpy
_tcscmp         strcmp           wcscmp
Compile with -D_UNICODE to get Unicode version
Generic Win32   8 bit codepage   Unicode
lstrcpy         lstrcpyA         lstrcpyW
lstrcmp         lstrcmpA         lstrcmpW
Compile with -DUNICODE to get Unicode version
Text macro:
#ifdef UNICODE
#define TEXT(string) L##string
#else
#define TEXT(string) string
#endif // UNICODE
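The generic-text pattern can be tried out without the Windows SDK. The sketch below mimics it in standard C with deliberately hypothetical names (MY_UNICODE, my_tchar, MY_TEXT, my_tcslen) standing in for UNICODE, TCHAR, TEXT and _tcslen. It is an illustration of the idea, not a copy of the SDK headers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical stand-ins for the SDK's generic-text names. */
#ifdef MY_UNICODE
typedef wchar_t my_tchar;
#define MY_TEXT(s)  L##s        /* token-paste an L prefix onto the literal */
#define my_tcslen   wcslen
#else
typedef char my_tchar;
#define MY_TEXT(s)  s
#define my_tcslen   strlen
#endif

/* Returns the number of characters (not bytes) before the first dot,
 * or the full length if there is no dot. The same source compiles as
 * either the narrow or the wide variant. */
size_t stem_length(const my_tchar *name)
{
    size_t i, n;
    assert(name != NULL);
    n = my_tcslen(name);
    for (i = 0; i < n; ++i)
        if (name[i] == MY_TEXT('.'))
            return i;
    return n;
}
```

Compiling with -DMY_UNICODE switches every literal and string call to the wide variant without touching stem_length itself, which is the point of writing against the generic names.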
Porting an ANSI application to Unicode
Converting between ANSI and Unicode:
  MultiByteToWideChar for codepage -> Unicode
  WideCharToMultiByte for Unicode -> codepage
  CP can be any legal codepage number or a predefined value such as CP_ACP, CP_SYMBOL, CP_UTF8, etc.
Tips for writing Unicode:
  Use generic data types and function prototypes
  Replace p++/p-- with CharNext/CharPrev
  Compute buffer sizes in TCHAR
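The last tip is a frequent source of overruns during a port: sizeof gives bytes, while most string routines expect a count of characters. A minimal sketch in standard C, using wchar_t in place of TCHAR; COUNT_OF and copy_wide are hypothetical names chosen for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Number of elements in an array, as opposed to its size in bytes. */
#define COUNT_OF(a) (sizeof(a) / sizeof((a)[0]))

/* Copy src into dst, where dst_count is a count of wchar_t elements
 * (not bytes). Truncates if needed, always NUL-terminates, and returns
 * the number of characters copied, excluding the terminator. */
size_t copy_wide(wchar_t *dst, size_t dst_count, const wchar_t *src)
{
    size_t i = 0;
    assert(dst != NULL && src != NULL && dst_count > 0);
    while (i + 1 < dst_count && src[i] != L'\0') {
        dst[i] = src[i];
        ++i;
    }
    dst[i] = L'\0';
    return i;
}
```

Passing sizeof(buf) instead of COUNT_OF(buf) would tell the routine the buffer holds two or four times more characters than it really does, depending on the width of wchar_t.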
Encodings in Web pages
ANSI codepages or ISO character encodings: Mono-lingual or restricted to one script
Raw Unicode (UTF-16): OK for Windows NT networks
Numeric character references, e.g. &#2325; for क: OK for occasional use
UTF-8: Recommended encoding; supported by IE 4.0+ and Netscape 4.0+
Setting web encoding
HTML/DHTML: Tag in the head of the document
  <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=<value>">
XML:
  <?xml version="1.0" encoding=<value>?>
ASP: Specify charset using ASP directives:
  Per session: <%Session.CodePage=<charset>%>
  Per page: <%@CODEPAGE=<charset>%>
Setting encodings for .NET
Namespace: System.Text
Distinction between: File, Request, and Response encodings
  In code: Response.ContentEncoding=<value>
  In page directive: <%@Page ResponseEncoding=<value>%>
  In configuration file: <globalization requestEncoding=<value> responseEncoding=<value> fileEncoding=<value> />
Universally encoded page
Globalization – step-by-step: Locale aware
Windows 2000/XP: NLS
NLS APIs allow you to automatically adjust to the user's formatting preferences:
  Date: 07/04/01 is 平成 13 年 7 月 4 日 in Japan
  Time: 9:00PM is 21:00 in France
  Currency: $1,000.00 is 1.000,00 $ in Germany
  Large Numbers: 123,456,789.00 is 12,34,56,789.00 in Hindi
  Sort Order: In German, ä comes after a; in Swedish, ä comes after z
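The large-number case shows why digit grouping cannot be hard-coded as "every three digits". The sketch below is a portable C illustration of grouping driven by two parameters. The function group_digits and its parameters are hypothetical; a real Windows application would let the NLS formatting APIs apply the user's settings instead.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Insert sep into the digit string `digits` (hypothetical helper).
 * first: size of the rightmost group; rest: size of every further group.
 * (3,3) yields 123,456,789 and (3,2) yields 12,34,56,789.
 * Returns 0 on success, -1 if out is too small. */
int group_digits(const char *digits, int first, int rest, char sep,
                 char *out, size_t out_size)
{
    size_t len, seps = 0, total, src, dst;
    int run = 0, limit;

    assert(digits != NULL && out != NULL && first > 0 && rest > 0);
    len = strlen(digits);
    if (len > (size_t)first)
        seps = 1 + (len - (size_t)first - 1) / (size_t)rest;
    total = len + seps;
    if (total + 1 > out_size)
        return -1;

    out[total] = '\0';
    src = len;
    dst = total;
    limit = first;
    while (src > 0) {                 /* fill the output right to left */
        if (run == limit) {
            out[--dst] = sep;
            run = 0;
            limit = rest;             /* later groups may differ in size */
        }
        out[--dst] = digits[--src];
        ++run;
    }
    return 0;
}
```

With the same input digits, a group pattern of (3,3) with ',' gives the US form, (3,2) with ',' gives the Hindi form from the slide, and (3,3) with '.' gives the German thousands separator.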
Locale awareness
Eliminate implicit locale assumptions from code, such as this ASCII-only case conversion:
#define ToUpper(ch) \
((ch)<='Z' ? (ch) : (ch)+'A' - 'a')
Query system to format locale-dependent data using NLS APIs and LCIDs.
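To see why the ToUpper macro above is a hidden locale assumption, the sketch below applies it to a few characters outside a-z. The macro is copied under the hypothetical name NAIVE_TOUPPER to avoid clashing with other headers, ascii_toupper is an illustrative helper, and the example assumes an ASCII-compatible execution character set.

```c
#include <assert.h>

/* The slide's macro under a different (hypothetical) name. It subtracts
 * 32 from anything above 'Z', whether or not it is a lowercase letter. */
#define NAIVE_TOUPPER(ch) ((ch) <= 'Z' ? (ch) : (ch) + 'A' - 'a')

/* An ASCII-only conversion that at least leaves everything outside a-z
 * untouched. It suits protocol keywords and identifiers; user-visible
 * text still needs a locale-aware API. */
int ascii_toupper(int ch)
{
    assert(ch >= 0);
    return (ch >= 'a' && ch <= 'z') ? ch - ('a' - 'A') : ch;
}
```

NAIVE_TOUPPER turns '{' into '[', turns Latin-1 ß (0xDF) into ¿ (0xBF), and turns Cyrillic ё (U+0451) into б (U+0431) rather than Ё (U+0401). None of these are reported as errors; the text is simply corrupted.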
LCID layout (32 bits, most significant bits first):
  Reserved: 12 bits
  Sort ID: 4 bits
  Language ID: 16 bits
    Sub-language: 6 bits
    Primary Language: 10 bits
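The bit layout of an LCID can be exercised with plain shifts and masks. The sketch below assumes the usual layout, from the most significant bits down: 12 reserved bits, a 4-bit sort ID, a 6-bit sub-language and a 10-bit primary language. The names lcid_pack, lcid_unpack and lcid_parts are hypothetical; Win32 code would use the MAKELANGID and MAKELCID macros.

```c
#include <assert.h>
#include <stdint.h>

/* The three meaningful fields of an LCID (hypothetical struct). */
typedef struct {
    unsigned primary_lang;  /* 10 bits */
    unsigned sub_lang;      /*  6 bits */
    unsigned sort_id;       /*  4 bits */
} lcid_parts;

/* Combine the fields into a 32-bit locale identifier. */
uint32_t lcid_pack(unsigned primary_lang, unsigned sub_lang, unsigned sort_id)
{
    assert(primary_lang < 0x400 && sub_lang < 0x40 && sort_id < 0x10);
    return ((uint32_t)sort_id << 16)
         | ((uint32_t)sub_lang << 10)
         | (uint32_t)primary_lang;
}

/* Split a locale identifier back into its fields. */
lcid_parts lcid_unpack(uint32_t lcid)
{
    lcid_parts p;
    p.primary_lang = (unsigned)(lcid & 0x3FF);
    p.sub_lang     = (unsigned)((lcid >> 10) & 0x3F);
    p.sort_id      = (unsigned)((lcid >> 16) & 0xF);
    return p;
}
```

Packing primary language 0x11 (Japanese) with sub-language 0x01 and sort ID 0 gives 0x0411, which is 1041 in decimal, and 0x09 (English) with sub-language 0x01 gives 0x0409, or 1033.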
NLS APIs: Getting and setting locales
Querying locales:
  LCID GetSystemDefaultLCID()
  EnumSystemLocales
  LCID GetUserDefaultLCID()
  LCID GetThreadLocale()
Setting locales:
  BOOL SetThreadLocale(LCID dwNewLocale)
  BOOL SetLocaleInfo(LCID,…)
  // Works for standard locales only!
No APIs to set System locale, User locale, and UI language
NLS APIs: Querying locale information
To retrieve information specific to a given locale: GetLocaleInfo
  Gives information for any valid locale (takes an LCID).
  LCTYPE input tells type of info to retrieve for a given locale (e.g. currency symbol, name of months…).
  Returns info in string buffer (LPTSTR).
To retrieve information specific to a location: GetGeoInfo
  Gives information for any valid location (takes a GEOID).
  SYSGEOTYPE input tells type of info to retrieve for a given location (e.g. LCID, Time zones…).
NLS APIs: Formatting data
To enumerate formats:
  EnumCalendarInfo(Ex)
  EnumDateFormats
  EnumTimeFormats
To format data directly:
  GetCurrencyFormat
  GetDateFormat
  GetTimeFormat
String comparison
Locale-dependent comparison: lstrcmp or lstrcmpi
Locale-independent comparison:
  Win2000 & below:
    Locale = MAKELCID(MAKELANGID(LANG_ENGLISH, SUBLANG_ENGLISH_US), SORT_DEFAULT);
    CompareString(Locale, ..., ..., ..., ..., ...);
  Windows XP:
    CompareString(LOCALE_INVARIANT, …, …, …, …, …);
A locale aware application
Locales in web pages
Defaults to the user locale
Supported by IE4.x and Netscape 4.x
A server variable that can be retrieved by: Request.ServerVariables("HTTP_ACCEPT_LANGUAGE")
A property of the navigator object: navigator.userLanguage
Locale awareness in web pages
To retrieve user locale:
  A server variable: Request.ServerVariables("HTTP_ACCEPT_LANGUAGE")
  A property of the navigator object: navigator.userLanguage
To set a locale:
  In DHTML:
    SetLocale("de")
    DateData = FormatDateTime(now(), vbShortDate)
  In ASP:
    <% Session.LCID = 1041 %>
    <% Response.Write( FormatDateTime(dtNow) ) %>
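The HTTP_ACCEPT_LANGUAGE value is a comma-separated list such as "de-AT,de;q=0.8,en;q=0.5". As a rough sketch of how server-side code might pick a starting locale from it, the portable C function below extracts the first tag. The name first_language_tag is hypothetical, and the function assumes, as a simplification, that the first entry is the most preferred one and ignores the q-values.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy the first language tag of an Accept-Language value into out
 * (hypothetical helper). Returns the tag length, or 0 if the header is
 * empty or the tag does not fit in out. q-values are ignored. */
size_t first_language_tag(const char *header, char *out, size_t out_size)
{
    size_t n;
    assert(header != NULL && out != NULL && out_size > 0);
    header += strspn(header, " \t");        /* skip leading whitespace */
    n = strcspn(header, ",; \t");           /* tag ends at ',' ';' or space */
    if (n == 0 || n >= out_size) {
        out[0] = '\0';
        return 0;
    }
    memcpy(out, header, n);
    out[n] = '\0';
    return n;
}
```

A production implementation would also sort the entries by q-value and map the chosen tag to a locale the site actually supports, falling back to a default when nothing matches.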
Locale awareness in .NET
Namespace: System.Globalization
Referenced as CultureInfo: a set of preferences based on language and culture.
Pattern: xx-XX, such as fr-CA, de-AT (RFC 1766)
Setting the CultureInfo:
  Implicit: Picked up from User Locale
  Explicit:
    In code: Thread.CurrentThread.CurrentCulture = new CultureInfo("de-DE")
    In page directive: <%@Page Culture=<value>%>
    In config: <globalization culture=<value> />
Locale aware web site
Handling Input methods
Easiest: Using edit controls (recommended)
Responding directly to user input:
  Input locales (language + input method): HKL
    GetKeyboardLayout
    ActivateKeyboardLayout
    LoadKeyboardLayout
  Windows messages:
    WM_INPUTLANGCHANGEREQUEST
    WM_INPUTLANGCHANGE
    WM_IME_* (for IME support only)
    WM_CHAR and WM_IME_CHAR
Globalization – step-by-step: Complex script aware
Windows 2000/XP: Complex Scripts
Complex Scripts have one or more of the following attributes:
  Bi-directional (BiDi) reordering (Arabic, Hebrew)
  Contextual shaping (Arabic, Indic family)
  Display of combining characters (Arabic, Thai, Indic)
  Specialized word-breaking (Thai)
  Text Justification (Arabic)
Complex Scripts: BiDi reordering
Complex Scripts: Contextual Shaping
Complex Scripts: Combining Characters
Complex Scripts: Justification
Uniscribe
Clients: Windows 2000/XP, Trident, Microsoft Office 2000/XP
A collection of exported APIs (high and low level)
Hides implementation details
A shaping engine per language
[Diagram: Application, USER, GDI, LPK.DLL, USP]
Options to display text
Plain text in application: Standard edit control or Win32 API (ExtTextOut / DrawText).
Simple formatted text: In Win32 apps, use Richedit control. For Web pages, use Document Object Model (DHTML).
Advanced formatting: Use Uniscribe (see SDK and MSJ article).
Special considerations
When dealing with BiDi, set RTL reading order and alignment:
  SetTextAlign / GetTextAlign with TA_RIGHT
  ExtTextOut with ETO_RTLREADING
  DrawText with DT_RTLREADING
To measure line lengths:
  Do not sum cached character widths
  Do use a GetTextExtent function or Uniscribe
When displaying typed text:
  Do not output characters one at a time!
  Do save text in a buffer and display the whole string with Uniscribe or Win32 API
Globalization – step-by-step: Font independency
Windows 2000/XP: Font support
Introduction of OpenType fonts:
  Extended TTF with glyphs for PE, ME, Thai, Greek, Turkish, Cyrillic…
Font fallback mechanism for complex scripts (CS) and East Asian scripts, used by Uniscribe
Font linking mechanism used by GDI
Font independency: Win32 programming
Not to do:
  Hard code font face names
  Assume a given font is installed
  Assume selected font supports the desired script
To do:
  Use MS Shell Dlg face name in Dialog resources
  EnumFontFamiliesEx or ChooseFont to select fonts
Font independency: In Web pages
Avoid placing text formatting values into in-line style:
  <span style="font-size: 10pt; font-family: Arial;"> Hello </span>
Declare text style in CSS files:
  <style> .myStyle {font-size: 10pt; font-family: Arial;} </style>
  <span class="myStyle"> Hello </span>
Use WEFT to embed fonts in your web pages (IE only):
  http://www.microsoft.com/typography/web/default.htm
Globalization – step-by-step: Multi-lingual UI aware
Windows 2000/XP: Multilanguage UI
Multilanguage version of Windows 2000/XP allows you to:
Switch the language of UI without rebooting
Set the language of UI per user
Add/Remove language modules
Offer your own solution for a multilingual UI
Multilingual UI Applications: Possible options
One localized .exe per target language: Eng.exe, Ger.exe, Jpn.exe
One multilingual language resource DLL: Myapp.exe plus one DLL holding the Eng, Ger, and Jpn resources
One resource DLL per target language: Myapp.exe plus Eng.dll, Ger.dll, Jpn.dll
Satellite DLL
Initialize to current UI language.
  Windows 2000/XP: GetUserDefaultUILanguage()
  Down-level platforms: See “Writing Multilingual User Interface Applications” on Globaldev.
Allow user to select UI language.
Use naming convention, for example: res<LANGID>.dll
Find all resource DLLs using FindFirstFile and FindNextFile
Use LoadLibrary(Ex) to load DLL file
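The naming convention is easy to get subtly wrong, so here is a small sketch of building a satellite DLL name from a language ID in portable C. The slide does not say how <LANGID> is spelled in the file name. Writing it as four lowercase hex digits is an assumption of this example, and resource_dll_name and primary_language are hypothetical helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Build "res<LANGID>.dll" for a 16-bit language ID (hypothetical helper).
 * Assumption: <LANGID> is written as 4 hex digits, e.g. res0407.dll.
 * Returns 0 on success, -1 if out is too small. */
int resource_dll_name(unsigned langid, char *out, size_t out_size)
{
    int n;
    assert(out != NULL && langid <= 0xFFFF);
    n = snprintf(out, out_size, "res%04x.dll", langid);
    return (n > 0 && (size_t)n < out_size) ? 0 : -1;
}

/* Primary language (low 10 bits) of a language ID. Useful as a fallback
 * key when no DLL exists for the exact sub-language. */
unsigned primary_language(unsigned langid)
{
    return langid & 0x3FF;
}
```

For the UI language 0x0407 (German, Germany) this yields res0407.dll. If that file is missing, an application could look for any other DLL whose language ID shares primary language 0x07 before falling back to its default resources.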
Globalization – step-by-step: Mirroring aware
Windows 2000/XP: Mirroring technology
To create an automatic right-to-left layout of the user interface for localized versions of bidirectional languages (Arabic and Hebrew).
Coordinate transformation
Origin (0,0) in upper RIGHT corner of window
X scale factor = -1
X values increase from right to left
[Figure: Default (LTR) window, with the origin at the upper left and x increasing to the right, next to a Mirrored (RTL) window, with the origin at the upper right and x increasing to the left.]
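The coordinate transformation amounts to reflecting x about the vertical centre of the window. The sketch below writes it out in portable C for a point and for a rectangle. The names mirror_x, mirror_rect and rect_t are hypothetical, and the rectangle is assumed to follow the common convention that right and bottom are exclusive.

```c
#include <assert.h>

/* Rectangle with exclusive right and bottom edges (hypothetical type). */
typedef struct { int left, top, right, bottom; } rect_t;

/* Map a pixel column between LTR and mirrored coordinates in a window
 * that is `width` pixels wide. The mapping is its own inverse. */
int mirror_x(int x, int width)
{
    assert(width > 0);
    return width - 1 - x;
}

/* Mirror a rectangle horizontally. The left and right edges swap roles,
 * so the result still satisfies left <= right. */
rect_t mirror_rect(rect_t r, int width)
{
    rect_t m;
    assert(width > 0 && r.left <= r.right);
    m.left   = width - r.right;
    m.right  = width - r.left;
    m.top    = r.top;
    m.bottom = r.bottom;
    return m;
}
```

In a mirrored window the system applies this mapping automatically, which is why code that mixes client coordinates with hard-coded screen offsets tends to break: it effectively mirrors once too often or not at all.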
Controlling the mirroring style
Per process: GetProcessDefaultLayout, SetProcessDefaultLayout(LAYOUT_RTL)
Per window: CreateWindowEx(WS_EX_LAYOUTRTL | WS_EX_NOINHERITLAYOUT), SetWindowLong
Per DC: GetLayout / SetLayout, LAYOUT_BITMAPORIENTATIONPRESERVED
Controlling the mirroring style
Dialog Resources: Set WS_EX_LAYOUTRTL in dialog template
Message boxes: Use MB_RTLREADING option
BitBlt/StretchBlt: Use NOMIRRORBITMAP flag
Mirroring common issues
Mirrored bitmap!
Off screen bitblt
BiDi & mirroring in web pages
In a web context, mirroring and RTL reading order go hand-in-hand.
Using the DIR attribute would:
  Set the right alignment of the text
  Set the right-to-left reading order of the text
  Mirror the page content
  Leave the orientation of stationary elements
To set the DIR attribute:
  HTML: <html dir=RTL>
  At an element level: <span dir=RTL>
  DHTML object: document.dir = "RTL"
Tips for BiDi web pages
Directional images: <IMG style=filter:flipH SRC=arrow.jpg>
Avoid explicit alignments: Obsolete usage of “align=left” in tables and cells
Avoid absolute positioning of elements
Remember: tables get mirrored automatically, use them for robust reversibility!
Mirrored DHTML
Final Conclusions
Benefits of investing in development of World-Ready applications are real
Windows 2000/XP eases the pain and sets the standard
The biggest task in implementing World-Ready applications is setting the designers' and engineers' mindset to think GLOBAL
Resources
MSDN for latest documentation about new APIs
Developing International Software for Windows 95 and Windows NT
Windows 2000/XP Globalization: http://www.microsoft.com/globaldev
World-Ready Guide You are not World-Ready If…
E-Mail aliases: [email protected]@microsoft.com