Uni Code Systems

7/21/2019 Uni Code Systems http://slidepdf.com/reader/full/uni-code-systems 1/27 Unicode SAP Systems NW F Internationalization Unicode@sap


Unicode SAP Systems

NW F Internationalization

Unicode@sap


SupportedlanguagesinUnicode.doc 13.05.2008

SAP AG 2

© Copyright 2008 SAP AG. All rights reserved.

No part of this publication may be reproduced or transmitted in any form or for any purpose without the express permission of SAP AG. The information contained herein may be changed without prior notice.

Some software products marketed by SAP AG and its distributors contain proprietary software components of other software vendors.

Microsoft, Windows, Outlook, and PowerPoint are registered trademarks of Microsoft Corporation.

IBM, DB2, DB2 Universal Database, OS/2, Parallel Sysplex, MVS/ESA, AIX, S/390, AS/400, OS/390, OS/400, iSeries, pSeries, xSeries, zSeries, z/OS, AFP, Intelligent Miner, WebSphere, Netfinity, Tivoli, and Informix are trademarks or registered trademarks of IBM Corporation in the United States and/or other countries.

Oracle is a registered trademark of Oracle Corporation.

UNIX, X/Open, OSF/1, and Motif are registered trademarks of the Open Group.

Citrix, ICA, Program Neighborhood, MetaFrame, WinFrame, VideoFrame, and MultiWin are trademarks or registered trademarks of Citrix Systems, Inc.

HTML, XML, XHTML and W3C are trademarks or registered trademarks of W3C®, World Wide Web Consortium, Massachusetts Institute of Technology.

Java is a registered trademark of Sun Microsystems, Inc.

JavaScript is a registered trademark of Sun Microsystems, Inc., used under license for technology invented and implemented by Netscape.

MaxDB is a trademark of MySQL AB, Sweden.

SAP, R/3, mySAP, mySAP.com, xApps, xApp, and other SAP products and services mentioned herein as well as their respective logos are trademarks or registered trademarks of SAP AG in Germany and in several other countries all over the world. All other product and service names mentioned are the trademarks of their respective companies. Data contained in this document serves informational purposes only. National product specifications may vary.

These materials are subject to change without notice. These materials are provided by SAP AG and its affiliated companies ("SAP Group") for informational purposes only, without representation or warranty of any kind, and SAP Group shall not be liable for errors or omissions with respect to the materials. The only warranties for SAP Group products and services are those that are set forth in the express warranty statements accompanying such products and services, if any. Nothing herein should be construed as constituting an additional warranty.

Documentation on SAP Service Marketplace

You can find this documentation at the following address: http://service.sap.com/Unicode@sap


Icons in Body Text

Icon Meaning

Caution

Example

Note

Recommendation

Syntax

Background

Additional icons are used in SAP Library documentation to help you identify different types of information at a glance. For more information, see Help on Help → General Information Classes and Information Classes for Business Information Warehouse on the first page of any version of SAP Library.

Typographic Conventions


Introduction
What is Unicode?
Technical Language Support
Language Configurations in SAP Systems
Basic Information
Programming Languages
ABAP Programs
C and C++ Programs
XML
Java
Unicode Character Encodings and Byte Length
Database and Platform Support
Hardware Requirements: Non-Unicode - Unicode compared
Frontend Settings in the Unicode System
SAP GUI Support
Frontend Requirements
Code Pages
Font Selection
Locales
Input Methods in Unicode Systems
International Components for Unicode (ICU)
Unicode SAP Systems: After the System Installation
Language Configuration
Enter languages
Enter country
Simulate configuration
Activate configuration and perform updates
Logon Language
Translation Import
Printing in Unicode SAP Systems
Supported Device Types
Communication within Multilingual System Landscapes
Data Transfer in a Unicode/non-Unicode System Landscape
Transport between Unicode and non-Unicode SAP Systems
RFC Library
From non-Unicode SAP System to Unicode
Appendix
Documentation
New Installation of Unicode SAP Systems
Conversion of non-Unicode SAP to Unicode
Combined Upgrade & Unicode Conversion
Further Information
Contacts


Introduction

With SAP NetWeaver™, Unicode is the solution for multilingual SAP systems. This document is intended for managers and consultants who require detailed information about technical requirements, language availability and import, the integration of Unicode systems into an existing SAP system landscape, and the ways to arrive at a Unicode SAP system.

This document is constantly being revised. Please make sure that you always use the most current version, which can be downloaded from SAP Note 73606.

What is Unicode?

Unicode (and the parallel ISO 10646 standard) defines the character set necessary for efficiently processing text in any language and for maintaining text data integrity. In addition to global character coverage, the Unicode standard is unique among character set standards because it also defines data and algorithms for efficient and consistent text processing. This enables high-level processing and ensures that all conformant software produces the same results. The widespread adoption of Unicode over the last decade made text data truly portable and formed a cornerstone of the Internet.

What is the Business Value?

Globalized software, based on Unicode, maximizes market reach and minimizes cost. Globalized software is built and installed once and yet handles text for and from users worldwide and accommodates their cultural conventions. It minimizes cost by eliminating per-language builds, installations, and maintenance updates.

Who needs Unicode?

Computer users who deal with multilingual text (business people, linguists, researchers, scientists, and others) will find that the Unicode Standard greatly simplifies their work. Mathematicians and technicians, who regularly use mathematical symbols and other technical characters, will also find the Unicode Standard valuable.

Global business processes, for example a global HR system, global Master Data Management, or Web Services that let customers enter their contact data (global master data containing characters from multiple local languages), require the support of a global character set.

What are the benefits of a Unicode-based SAP System?

Internet and Web Services

• The Internet (including the World Wide Web), and therefore collaborative scenarios, are based on Unicode.

• Unicode SAP systems take full advantage of XML and Java (both of which require Unicode).

• Unicode is required for cross-application data exchange without loss of data caused by incompatible character sets. One way to present documents on the World Wide Web, for example, is XML.

• Unicode compliant ABAP paves the way for more efficient and effective integration of ABAP and Java applications.

Business Value

• Unicode SAP systems can be more tightly integrated with non-SAP products and offer a superior platform for collaborative, cross-system business applications.

• Unicode enables SAP customers to install global systems that cover their business processes worldwide.

• Companies using different distributed systems frequently want to aggregate their worldwide corporate data. Without Unicode, their ability to do this is limited.

Language Support


• Unicode SAP systems provide unrestricted use of any language or language combination in the world.

• With Unicode, you can use multiple languages simultaneously on a single frontend.

• In Unicode SAP systems you can display and maintain data which has been entered in ANY language with ANY logon language, provided the language installation has been performed correctly as described in chap. Language Configuration. The logon language only determines the display language for menus, dynpros, system messages etc., i.e. the language the user is working in.

Technical Language Support

Fig. 1: Example collection of languages which are supported in Unicode SAP systems

The languages are sorted by their 2-letter ISO 639 codes.

Language support

As of release R/3 1I non-Unicode: 41 languages which have a 2-letter language key according to ISO 639-1 (= Default Set of Languages: see Fig. 2)

As of Web Application Server 6.20 Unicode:

a) Default Set of Languages

41 language codes (see Table below)

b) New Unicode Languages:

433 additional languages which have a 3-letter language key according to ISO 639-2 (see example 1. below)

86 languages which have no separate ISO 639 language key but are assigned to countries or scripts (see examples 2a) and b) below)


All languages from the ISO 639-1 and ISO 639-2 standards and 86 languages with no ISO language key (560 languages in total) can be entered and displayed in Unicode SAP systems with the following releases:

• Web AS 6.20 Unicode Support Package 54 and higher
  Note: Currently a maximum of 50 different languages can be activated and used per system.

• Web AS 6.40 Unicode Support Package 14 and higher
  Note: Currently a maximum of 50 different languages can be activated and used per system.

• Web AS 7.00 Unicode and higher
  Note: Currently a maximum of 200 different languages can be activated and used per system.

SAP Note 895560 provides an overview of printing and display restrictions.

For requirements regarding R3trans versions refer to SAP Note 80727.

Note:

Language support in non-Unicode systems as of Web AS 6.20 onwards is limited to the Default Set of 41 languages!

Fig. 2: Default Set

SAP/ISO Lang. Key   Language
AF                  Afrikaans
AR*                 Arabic*
BG                  Bulgarian
CA                  Catalan
ZH                  Chinese
ZF                  Chinese trad.
HR                  Croatian
CS                  Czech
DA                  Danish
NL                  Dutch
EN                  English
ET                  Estonian
FI                  Finnish
FR                  French
DE                  German
EL                  Greek
HE                  Hebrew
HU                  Hungarian
IS                  Icelandic
ID                  Indonesian
IT                  Italian
JA                  Japanese
KO                  Korean
LV                  Latvian
LT                  Lithuanian
MS                  Malayan
NO                  Norwegian
PL                  Polish
PT                  Portuguese
Z1                  Reserved - cust.
RO                  Romanian
RU                  Russian
SR                  Serbian
SH                  Serbian (Latin)
SK                  Slovakian
SL                  Slovene
ES                  Spanish
SV                  Swedish
TH                  Thai
TR                  Turkish
UK                  Ukrainian

*in Unicode systems only

Examples: New Unicode Languages

1. Languages with 3-letter language key according to ISO 639-2:

Hindi = ISO 'HIN' = SAP 'HI'

Iranian = ISO 'IRA' = SAP 'IR'

Sanskrit = ISO 'SAN' = SAP 'SA'

2. Languages with no separate language key according to ISO 639

a) Assigned to countries:

English Australia = ISO 'ENG' = SAP '1E'

English Canada = ISO 'ENG' = SAP '3E'

English Ireland = ISO 'ENG' = SAP '8E'

English New Zealand = ISO 'ENG' = SAP '1N'

b) Assigned to scripts:

Azerbaijani (Cyrillic) = ISO 'AZE' = SAP '5R'

Azerbaijani (Latin) = ISO 'AZE' = SAP 'AZ'
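
The relationship between ISO keys and SAP keys in these examples can be sketched as a small lookup table. The names `SAP_TO_ISO` and `sap_keys_for` are hypothetical and chosen for illustration only; the entries are copied from the examples above, not read from an SAP system. The sketch shows that several 2-character SAP keys can share a single ISO 639-2 code.

```python
# Illustrative only: SAP language keys and their ISO 639-2 codes,
# copied from the examples in this section (not read from an SAP system).
SAP_TO_ISO = {
    "HI": "HIN",  # Hindi
    "IR": "IRA",  # Iranian
    "SA": "SAN",  # Sanskrit
    "1E": "ENG",  # English Australia
    "3E": "ENG",  # English Canada
    "8E": "ENG",  # English Ireland
    "1N": "ENG",  # English New Zealand
    "5R": "AZE",  # Azerbaijani (Cyrillic)
    "AZ": "AZE",  # Azerbaijani (Latin)
}


def sap_keys_for(iso_code):
    """Return all SAP keys (sorted) that map to the given ISO 639-2 code."""
    return sorted(key for key, iso in SAP_TO_ISO.items() if iso == iso_code)


print(sap_keys_for("ENG"))
print(sap_keys_for("AZE"))
```

Because the ISO code alone is ambiguous for country or script variants, the SAP key is the identifier to rely on in such a lookup.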

For an overview of all supported languages (including technical details, used scripts, and countries) see the Excel sheet "Supported Languages and Code Pages.xls". You can download this document from SAP Note 73606 and from www.service.sap.com/i18n → i18n Media Library.

Translation Status

SAP delivers the mySAP ERP editions in 30 of the languages listed in Fig. 2. For an overview see www.service.sap.com/languages. Here you will find availability figures per release, translation level and delivery status of each language. However, if you need additional languages with 2- or 3-letter language codes, you can install them in your Unicode SAP system as described in chap. Language Installation, but remember that translation is not delivered by SAP.

Language Configurations in SAP Systems

One of the important considerations when preparing to install an SAP System is the choice of the system language(s). A multinational company operating in different countries all over the world usually needs several languages consisting of many different characters.

Characters are encoded on a per script basis. So, for example, there is only one set of Latin characters defined, despite the fact that the Latin script is used for the alphabets of thousands of different languages. The same principle applies for any other script (Cyrillic, Arabic, Ethiopic, Devanagari, etc.) which is used for writing many different languages.


Range of the Unicode Standard

The Unicode standard and ISO/IEC 10646 support three encoding forms (UTF-8, UTF-16 and UTF-32) that use a common repertoire of characters and allow for encoding as many as 1.1 million characters. This is sufficient for all known character encoding requirements, including full coverage of all historic scripts of the world, as well as common notational systems.

The Unicode Standard defines codes for characters used in all the major languages written today. Scripts include the European alphabetic scripts, Middle Eastern right-to-left scripts, and many scripts of Asia. It covers the GB 18030-2000 Standard as well.

The Unicode Standard further includes punctuation marks, diacritics, mathematical symbols, technical symbols, arrows, dingbats, etc. It provides codes for diacritics, which are modifying character marks such as the tilde (~), that are used in conjunction with base characters to represent accented letters (ñ, for example). In all, the Unicode Standard, Version 4.0 provides codes for 95,221 characters from the world's alphabets, ideograph sets, and symbol collections.
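
Two of the statements above can be checked with a few lines of Python. This is a minimal sketch using only the standard library; the variable names are illustrative, and the choice of U+0303 (the combining tilde) as the diacritic is an assumption made for the example. It derives the "1.1 million" figure from the layout of the code space and shows a base character plus a diacritic forming an accented letter.

```python
import unicodedata

# The code space runs from U+0000 to U+10FFFF: 17 planes of 65,536 code points.
PLANES = 17
CODE_POINTS_PER_PLANE = 0x10000
code_space = PLANES * CODE_POINTS_PER_PLANE
assert code_space == 0x10FFFF + 1

# A base character followed by a combining diacritic ...
decomposed = "n\u0303"  # 'n' + COMBINING TILDE
# ... is canonically equivalent to the precomposed accented letter.
composed = unicodedata.normalize("NFC", decomposed)

print(code_space)
print(len(decomposed), len(composed), composed == "\u00f1")
```

The code space size is an upper bound on what can be encoded; the number of characters actually assigned in a given version of the standard is much smaller.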

Fig. 3: Unicode code space: 65,000 characters (ASCII, General Scripts, Symbols, CJK Ideographs, Hangul, Compatibility, Surrogate Area) plus 1,000,000 additional characters


Basic Information

Programming Languages

Prior to Web AS 6.20, SAP used different encodings to process characters from different scripts, such as ASCII, EBCDIC, or double-byte code pages. These character sets covered every language used in SAP. However, problems occurred when you tried to mix texts from different incompatible character sets in a central system. Exchanging data between systems with incompatible character sets also led to problems.

The solution to this problem is to use a code consisting of all the characters used throughout the world, i.e. Unicode (ISO/IEC 10646), which uses at least 16 bits (2 bytes), alternatively 32 bits (4 bytes), per character.

ABAP Programs

Most programs should work without any modification, but you need to ensure that all programs comply with the stricter Unicode 6.10 syntax and semantics, which improve program efficiency and enable Unicode support. Note that all programs must be 6.10 compliant to run in a Unicode system, and 6.10 compliant programs also run in a non-Unicode system. In a non-Unicode system, programs do not have to be 6.10 compliant.

To check whether your programs are ABAP 6.10 compliant, use transaction UCCHECK. In addition, programs should be tested to catch non-static errors that appear at runtime. Use transaction SCOV to monitor the testing.

The language adjustments made as part of the conversion to Unicode provide an excellent opportunity for all ABAP developers to clean up their source code. The new programming statements also work in non-Unicode programs. SAP strongly recommends using them there in order to improve readability and minimize ambiguity in source code.

UCCHECK

Run UCCHECK and enter the programs you want to check.

After you have completed the check and modified any code that was not ABAP 6.10 compliant, you should check the runtime behavior of your programs. UCCHECK issues errors for statically detectable syntax errors, and warnings where runtime errors are possible that cannot be detected by the static syntax check.

Coverage Analyzer

With the Coverage Analyzer you can monitor the code coverage of program executions in your SAP systems. The Coverage Analyzer enables you to check the completeness of runtime tests and to display the results. It shows the collected data in several different customizable hierarchies. You can drill down to the programs and to their respective modularization units. The information that can be displayed includes for example:

• Degree of utilization

• Percentage of program units that have been tested

• Percentage of programs whose Unicode check flag is active

• Number of processing blocks

Unicode Enhanced Syntax Check

The system profile parameter abap/unicode_check=on can be used to enforce the enhanced syntax check for all objects in non-Unicode systems. When setting this parameter, only Unicode-enabled objects (objects with the Unicode flag) are executable. Note that after setting the Unicode flag, automatically generated programs might need to be regenerated. The mentioned parameter should be set to the value "on" only if all customer programs have been enabled according to transaction UCCHECK.


The same ABAP source code is used for both Unicode and non-Unicode installations (see Fig. 3). Therefore you can freely combine Unicode and non-Unicode mySAP components and you take advantage of new releases with or without Unicode. Enhancements have also been made to RFC to guarantee smooth data communication between Unicode and non-Unicode systems and with non-SAP products as well. Because most mySAP solutions consist of cross-component integration scenarios, thorough integration testing is also planned, in particular within a combined Unicode / non-Unicode system environment.

There are separate Unicode and non-Unicode versions of R/3:

Fig. 3: Character expansion model

• No explicit Unicode data type in ABAP

• Single ABAP source for Unicode and non-Unicode systems

• Automatic conversion of character data for communication between Unicode and non-Unicode systems

C and C++ Programs

Both the non-Unicode (single-byte or multibyte) SAP kernel and the Unicode SAP kernel are compiled from a single set of C program sources. When the SAP system was ported to Unicode, a huge amount of C source code was enhanced, but this does not affect the non-Unicode SAP kernel. If you use your own C/C++ programs, they must also be Unicode-enabled to run in a Unicode system (see the RFC documentation on SAP Service Marketplace: go to service.sap.com/rfc-library and select Media Library → RFC Library Guide).

XML

Unicode SAP systems take full advantage of XML.

SAP Exchange Infrastructure (SAP XI), SAP's platform for process integration based on the exchange of XML messages, is only available as of Web AS 6.20 Unicode. It provides a technical infrastructure for XML-based message exchange in order to connect SAP components with each other, as well as with non-SAP components.

For detailed information about SAP XI, see www.service.sap.com/xi.

For information about SAP Web Services see http://service.sap.com/uddi.

For information about SAP Internet Business Framework see http://www.service.sap.com/netweaver.

For more information about XML development at SAP, see www.service.sap.com/xml.


Java

Unicode SAP systems take full advantage of Java.

For information about Java development at SAP see www.service.sap.com/j2ee.

Unicode Character Encodings and Byte Length

Each Unicode character has a unique Unicode scalar value, and there are three "Unicode Transformation Formats" (UTF), which are algorithmic mappings of the Unicode scalar value. There are currently 8-, 16- and 32-bit encodings: UTF-8, UTF-16, UTF-32, as well as other encoding schemes for Unicode characters, such as CESU-8.

Although there are multiple encoding schemes, this is not the same as the problem of multiple code pages. All UTF formats, as well as CESU-8, contain exactly the same character set. The transformations between Unicode encodings are done algorithmically, and therefore no conversion tables are needed; this improves performance considerably.

To provide the most efficient balance between memory requirements, performance, and compatibility with existing non-Unicode systems, SAP uses different encoding schemes.
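
To illustrate what "done algorithmically" means, the following sketch derives the UTF-8 bytes of a character from its scalar value with bit arithmetic only, without any lookup table. The function name `utf8_encode` is hypothetical; the result is compared against Python's built-in encoder as a cross-check. It is a teaching sketch under these assumptions, not the conversion routine used by any SAP component.

```python
def utf8_encode(scalar):
    """Encode one Unicode scalar value as UTF-8 using bit arithmetic only."""
    if 0xD800 <= scalar <= 0xDFFF or not 0 <= scalar <= 0x10FFFF:
        raise ValueError("not a Unicode scalar value")
    if scalar < 0x80:        # 1 byte:  0xxxxxxx
        return bytes([scalar])
    if scalar < 0x800:       # 2 bytes: 110xxxxx 10xxxxxx
        return bytes([0xC0 | (scalar >> 6), 0x80 | (scalar & 0x3F)])
    if scalar < 0x10000:     # 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | (scalar >> 12),
                      0x80 | ((scalar >> 6) & 0x3F),
                      0x80 | (scalar & 0x3F)])
    return bytes([0xF0 | (scalar >> 18),  # 4 bytes: 11110xxx + 3 continuation bytes
                  0x80 | ((scalar >> 12) & 0x3F),
                  0x80 | ((scalar >> 6) & 0x3F),
                  0x80 | (scalar & 0x3F)])


for ch in "A\u00c4\u6653":
    # Cross-check the hand-written mapping against the built-in encoder.
    assert utf8_encode(ord(ch)) == ch.encode("utf-8")
    print(f"U+{ord(ch):04X}", utf8_encode(ord(ch)).hex(" ").upper())
```

A code page conversion, by contrast, needs a mapping table per code page pair, because the byte values of two code pages are not related by any formula.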

Fig. 4: Encoding schemes used in SAP systems

UTF-8 (8 bit encoding), used for external communication (frontend; GUI):

• variable length; 1 character = 1-4 bytes

• platform independent, byte order independent

• no alignment restriction

• UTF-8 is the character encoding used for XML

• all 7 bit ASCII characters have the same code points and byte length; this ensures compatibility with non-Unicode systems

UTF-16 (16 bit encoding), used for internal communication (application server):

• fixed length; 1 character = 2 bytes (surrogate pairs = 2+2 bytes)

• platform-dependent byte order (Little/Big Endian)

• 2 bytes alignment restriction

• Best compromise between memory usage and algorithmic complexity

• fits to Java and Microsoft environment

• best way to migrate existing ABAP and C programs
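
The "surrogate pairs = 2+2 bytes" entry in Fig. 4 and the relationship between UTF-8 and CESU-8 can be made concrete with a short sketch. The helper names `surrogate_pair` and `cesu8_encode` are hypothetical, and the CESU-8 routine is a simplified illustration that assumes well-formed input (no lone surrogates); it is not a production codec.

```python
def surrogate_pair(scalar):
    """Split a supplementary code point (above U+FFFF) into UTF-16 surrogates."""
    if not 0x10000 <= scalar <= 0x10FFFF:
        raise ValueError("only code points above U+FFFF need surrogates")
    offset = scalar - 0x10000
    return 0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)


def cesu8_encode(text):
    """Sketch of CESU-8: plain UTF-8 up to U+FFFF, 3+3 bytes per surrogate pair."""
    out = bytearray()
    for ch in text:
        code_point = ord(ch)
        if code_point < 0x10000:
            out += ch.encode("utf-8")
        else:
            for half in surrogate_pair(code_point):
                # 'surrogatepass' makes Python emit the 3-byte form of a surrogate.
                out += chr(half).encode("utf-8", "surrogatepass")
    return bytes(out)


ch = "\U00010400"  # a character outside the first 65,536 code points
high, low = surrogate_pair(ord(ch))
print(f"{high:04X} {low:04X}")
print(len(ch.encode("utf-16-be")), len(ch.encode("utf-8")), len(cesu8_encode(ch)))
```

For characters up to U+FFFF the two byte sequences are identical, which is why the encodings can share a column in Fig. 5; they differ only for supplementary characters, where CESU-8 needs six bytes instead of four.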

Fig. 5: Example: Representation of Unicode Characters in SAP systems

Character   Unicode Scalar Value   UTF-8 / CESU-8   UTF-16 [Little Endian]   UTF-16 [Big Endian]
A           U+0041                 41               41 00                    00 41
Ä           U+00C4                 C3 84            C4 00                    00 C4
α           U+03B1                 CE B1            B1 03                    03 B1
א           U+05D0                 D7 90            D0 05                    05 D0
晓          U+6653                 E6 99 93         53 66                    66 53
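
The byte values in Fig. 5 can be reproduced with Python's built-in codecs. The sketch below is illustrative only; the function name `encodings` is hypothetical, and the code points are the five listed in the figure.

```python
def encodings(code_point):
    """Return the UTF-8, UTF-16LE and UTF-16BE bytes of a code point as hex strings."""
    ch = chr(code_point)
    return tuple(ch.encode(name).hex(" ").upper()
                 for name in ("utf-8", "utf-16-le", "utf-16-be"))


# Code points taken from Fig. 5
for code_point in (0x0041, 0x00C4, 0x03B1, 0x05D0, 0x6653):
    utf8, utf16_le, utf16_be = encodings(code_point)
    print(f"U+{code_point:04X} | {utf8} | {utf16_le} | {utf16_be}")
```

All five characters lie below U+10000, so their UTF-8 and CESU-8 representations are the same.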

Fig. 6: Database Format: varies depending on the manufacturer

Encodings: UTF-8, CESU-8, UTF-16

Databases: DB/2 (AIX), Oracle, Max DB (8.0), SQL Server, DB/2 (AS400), SAP DB (7.0)

What is Little Endian/Big Endian?

The byte order of UTF-16 depends on the processor architecture (i.e. it is byte order/"endian"-dependent).

Little Endian (LE)

The least significant byte of the number is stored in memory at the lowest address, and the most significant byte at the highest address. (The little end comes first.) As an analogy, we say "fourteen" in English; the less significant number, four, comes first. This is also called least significant byte (LSB) ordering.

Big Endian (BE)

The most significant byte is stored in memory at the lowest address, and the least significant byte at the highest address. (The big end comes first.) As an analogy, we say "twenty-four" in English; the more significant number, twenty, comes first. This is also called most significant byte (MSB) ordering.

When converting to Unicode, the export code page should correspond to the endianness of the target system.
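
The two byte orders can be demonstrated with a single 16-bit value. This is a small standard-library sketch; the value 0x6653 is the code unit of U+6653 from Fig. 5, and the variable names are illustrative.

```python
import sys

value = 0x6653  # a 16-bit code unit (U+6653 from Fig. 5)

little = value.to_bytes(2, "little")  # least significant byte first
big = value.to_bytes(2, "big")        # most significant byte first

print(little.hex(" "), "|", big.hex(" "))
print("this machine is", sys.byteorder, "endian")

# Reading the bytes with the wrong byte order silently yields another character.
assert little.decode("utf-16-le") == "\u6653"
assert little.decode("utf-16-be") == "\u5366"
```

The last two lines show why the byte order has to match on both sides: the mismatch raises no error, it just produces a different, valid-looking character.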

Database and Platform Support

The following operating system/database combinations are supported.

Informix and Reliant Unix support is not planned.

Fig. 10: Operating system/database combinations

Operating systems: W2K, Linux³, Solaris¹, HP¹, Tru64¹, AIX¹, OS/400, OS/390

Databases: Oracle, MS SQL Server, SAPDB/MaxDB, DB2

¹ 64bit versions only.


Hardware Requirements: Non-Unicode - Unicode compared

Overview

The Unicode encoding determines the length of a character. A character in one of the Unicode encodings can be more than 1 byte, and therefore Unicode characters can be longer than characters defined in other standard code pages. This leads to larger hardware demands.

The CPU/RAM figures below are measured average numbers on SAP Application Servers. They will be different for different transactions. Additional CPU/RAM hardware resource requirements on standalone servers must be provided by DB vendors.

Fig. 7

*The ABAP statement SET_LOCALE, which is required for the processing of language-dependent data in MDMP systems, is CPU expensive. In a single code page/Unicode system it is exclusively used for sorting and therefore not CPU expensive. Wide character handling is expensive for double-byte code pages.

Database Size

The expected size of additional hardware which is required for a Unicode database depends on:

• Database Unicode encoding scheme (e.g. CESU-8 vs. UTF-16)

• Database settings (page size, extent size)

• z/OS: Hardware compression (reduces size by approx. 40%)

• Languages in use: e.g. for double-byte characters UTF-16 requires less storage than UTF-8, and processing Japanese text data demands more space than processing English text data (see Fig. 8)

• Application modules in use (ratios: tables/indices, text/binary data)

• Reorganization frequency:

  o Unicode conversion includes a DB reorganization

  o DB growth is often compensated by shrinking due to reorganization (especially the indices)
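
The effect of the languages in use on storage can be estimated per string. The following sketch compares byte counts in UTF-8 and UTF-16 for three sample texts; the function name `encoded_sizes` and the sample strings are assumptions made for illustration, and real database sizes additionally depend on the other factors listed above.

```python
def encoded_sizes(text):
    """Return (characters, UTF-8 bytes, UTF-16 bytes) for a string."""
    return len(text), len(text.encode("utf-8")), len(text.encode("utf-16-le"))


samples = {
    "English": "Unicode",
    "German": "Z\u00fcrich",
    "Japanese": "\u65e5\u672c\u8a9e",
}
for label, text in samples.items():
    chars, utf8_bytes, utf16_bytes = encoded_sizes(text)
    print(f"{label}: {chars} characters, "
          f"UTF-8 {utf8_bytes} bytes, UTF-16 {utf16_bytes} bytes")
```

ASCII-heavy text is smallest in UTF-8, while text in scripts such as Japanese takes three bytes per character in UTF-8 but only two in UTF-16.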


Fig. 8: Comparison of 1100, 8000, CESU-8 and UTF-16 (example characters: A, Ä)

Hardware Requirements – Database

Fig. 11

Database (Platform)                   Encoding Scheme   Addit. Storage Requirements
DB2 for AS/400                        UTF-16            10…20%*
DB2 for z/OS                          UTF-16            -20…10%**
DB2/Universal Database for Unix/NT    UTF-8             -10%
MaxDB                                 UTF-16            40…60%
MS SQL                                UTF-16            40…60%
Oracle                                CESU-8            -10%

*Small growth as biggest part of the ASCII based database is already Unicode

**SAP Unicode installations on z/OS always use hardware compression which overcompensates the growth due to Unicode

Average database growth measured in customer systems (sum of all sizes):

• UTF-8 and CESU-8: -13% (more than 90% of the databases have shrunk)

• UTF-16: +30...40%

Fig. 9: Examples

DB Size before Conversion (in GB)    DB Size after Conversion (in GB)    Change    DB Manufacturer    Unicode Encoding
54                                   48                                  -11.1%    Oracle             CESU-8
528                                  461                                 -12.7%    Oracle             CESU-8
772                                  666                                 -13.7%    Oracle             CESU-8
112                                  93                                  -17%      Oracle             CESU-8
880                                  674                                 -23.4%    Oracle             CESU-8
240                                  270                                 +12.5%    Oracle             CESU-8
460                                  650                                 +41.3%    SAP DB 7.3         UTF-16
22                                   36                                  +63.6%    SQL                UTF-16
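The Change column of Fig. 9 is the relative size difference before and after the conversion. A minimal sketch of that arithmetic, using some of the sizes from Fig. 9 (the function name is illustrative):

```python
def percent_change(before_gb: float, after_gb: float) -> float:
    """Relative database size change in percent, rounded to one decimal place."""
    return round((after_gb - before_gb) / before_gb * 100, 1)

# (size before, size after) in GB, taken from Fig. 9
EXAMPLES = [(54, 48), (528, 461), (240, 270), (460, 650), (22, 36)]

if __name__ == "__main__":
    for before, after in EXAMPLES:
        print(f"{before} GB -> {after} GB: {percent_change(before, after):+.1f}%")
```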

If you run a database which is not supported by Unicode, you can perform a database change with simultaneous Unicode conversion. Choose the heterogeneous system copy procedure for the database export/import instead of the homogeneous system copy method. For more information about system copy methods see SAP Service Marketplace Quick Link /systemcopy.

Frontend Settings in the Unicode System

SAP GUI Support

All SAP GUIs (HTML, Java, Windows) support Unicode alongside all the non-Unicode code pages already supported. Because SAP GUI is backward compatible, a single SAP GUI can be used to access both Unicode and non-Unicode systems, and therefore only one GUI is needed per frontend.

Frontend Requirements

Requirements: SAP recommends installation of the newest SAP GUI Patch Level.

Strongly Recommended: SAP GUI for Windows 6.40

If you use SAP GUI for Windows 6.20, the minimum Patch Level is 56.

Documentation: “SAPGUI for Windows: I18N User Guide”: You can download this documentation from SAP Service Marketplace at www.service.sap.com/i18n → I18N Media Library or from SAP Note 508854.

SAP Note 710720 (SAPGUI for Windows 6.40)

For full support of languages with multi-byte system locales (Japanese, Traditional Chinese, Simplified Chinese and Korean) SAPGUI 6.40 is required!

Application-specific restrictions

As of SAP NetWeaver 2004s all BEx tools are Unicode-enabled. For details have a look at www.service.sap.com/bi → SAP NetWeaver 2004s BI → BI Capabilities.

Prerequisites

1. Unicode version of SAP Web Application Server (at least kernel patch level 1078)

2. SAPGUI 620 (at least patch level 33)

3. Windows 2000, XP and the succeeding versions of Windows

4. I18N mode of SAP Frontend must be ON


Code Pages

Unicode SAP systems use one system code page and one frontend code page (GUI code page). The system code page depends on the platform byte order:

4103 (UTF-16 LE)

4102 (UTF-16 BE)

The frontend code page is 4110 (UTF-8; Unicode Character Set). In Unicode systems the application server sends the information about the frontend code page, and the frontend code page is automatically set to 4110 (UTF-8).

Do not change these settings!
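The relation between platform byte order and system code page can be illustrated as follows. This is a sketch for explanation only: the code page numbers are those given above, while the function names are hypothetical and the lookup does not reflect how an SAP kernel determines its code page.

```python
import sys

# Code page numbers as given in the text above.
SYSTEM_CODE_PAGES = {"little": 4103, "big": 4102}  # UTF-16 LE / UTF-16 BE
FRONTEND_CODE_PAGE = 4110                          # UTF-8

def system_code_page(byteorder: str = sys.byteorder) -> int:
    """Return the system code page number for a platform byte order."""
    return SYSTEM_CODE_PAGES[byteorder]

def utf16_bytes(text: str, byteorder: str) -> bytes:
    """Encode text as UTF-16 in the given byte order, without a byte order mark."""
    codec = "utf-16-le" if byteorder == "little" else "utf-16-be"
    return text.encode(codec)

if __name__ == "__main__":
    print("This platform:", sys.byteorder, "->", system_code_page())
    print("A as UTF-16 LE:", utf16_bytes("A", "little").hex())
    print("A as UTF-16 BE:", utf16_bytes("A", "big").hex())
    print("Ä as UTF-8:", "Ä".encode("utf-8").hex())
```

The same character is stored with its two bytes swapped on little-endian and big-endian platforms, which is why two system code pages exist while a single UTF-8 frontend code page is sufficient.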

Font Selection

It is recommended to choose a Microsoft TrueType font, such as "Courier New" or "Andale Mono" (included in the Internet Explorer) as fixed font. In this case, "Tahoma" is automatically selected as proportional font. This combination covers a large area of the Unicode Character Set.

Note: "Arial Monospaced for SAP" is not suitable as it covers only Latin-1 characters.

To display and change the frontend settings in the Unicode System, select from the System Function Bar and choose:

1. Font (I18N)…: On the first screen the fixed font is displayed (for example "Courier New"). If you choose the OK button, a second, identical, screen is opened in which the proportional font is displayed (for example "Tahoma").

2. Options (I18N)…

Read the “SAPGUI for Windows: I18N User Guide” before maintaining the frontend settings!

Locales

ABAP programs are written to be language-neutral and all language-specific data is derived from the system locales. Unlike in non-Unicode SAP systems, the Unicode system locales are platform independent. To provide Unicode Locales, SAP uses the International Components for Unicode (ICU). The ICU is a C/C++ and Java library that provides many internationalization functions including locales, transliteration and language-sensitive collation.

Input Methods in Unicode Systems

SAP does not support input methods which make use of characters in the Unicode Private Use Area (PUA), for example Hong Kong Chinese (HKSCS) characters. For more information, see SAP Note 845233.

If you want to use HKSCS characters in a Unicode system, read SAP Note 1146910 for information about supported input methods and device types.

International Components for Unicode (ICU)

Background and History of ICU

ICU is a set of C/C++ and Java libraries for Unicode support, software internationalization and globalization (i18n/g11n). It grew out of the JDK 1.1 internationalization APIs, which the ICU team contributed, and the project continues to be developed for the most advanced Unicode/i18n support.


ICU is widely portable and gives applications the same results on all platforms and between C/C++ and Java software.

ICU in the Unicode Development Project

SAP Unicode development uses ICU for those functions that were defined by the platform-dependent locales in non-Unicode systems:

1. CTYPE functions: toupperU(), tolowerU(), isupperU(), islowerU(), isspaceU(), isprintU() etc. In the Unicode system, these functions are no longer platform-dependent or locale-dependent. Every Unicode character has well defined properties, and these properties are accessible via ICU.

2. Collation: The ABAP statements SORT ... AS TEXT and CONVERT ... INTO SORTABLE CODE provide the programmer with locale-dependent, cultural sorting. ICU has collations for all languages that are supported by SAP.

The bidirectional layout of Hebrew and Arabic texts will be implemented with ICU both in the Unicode and non-Unicode system.
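The idea that every Unicode character carries well defined, platform-independent properties can be tried out with Python's standard library. This sketch does not use ICU or the SAP kernel functions named above. It relies on Python's built-in string methods and the unicodedata module, which are based on the Unicode Character Database, and the helper name describe is illustrative. Locale-dependent collation is not covered here.

```python
import unicodedata

def describe(char: str) -> dict:
    """Collect case mappings and classification properties of one character."""
    return {
        "name": unicodedata.name(char),
        "category": unicodedata.category(char),  # e.g. "Lu" = uppercase letter
        "upper": char.upper(),
        "lower": char.lower(),
        "is_upper": char.isupper(),
        "is_space": char.isspace(),
        "is_printable": char.isprintable(),
    }

if __name__ == "__main__":
    for ch in ("a", "Ä", "я", " "):
        print(repr(ch), describe(ch))
```

The results are the same on every platform because they are derived from character properties and not from an operating system locale.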

Unicode SAP Systems: After the System Installation

Requirements: SAP_BASIS 6.20 and higher

Programs: transaction I18N; transaction SMLT

Further Information: SAP Notes 73606, 544623, 551344; Excel list "Supported Languages and Code Pages.xls".

In a Unicode SAP system you can use almost every script and therefore almost every language in the world. An overview of the major languages, their scripts and the countries in which they are spoken is available in the Excel file "Supported Languages and Code Pages" which can be downloaded from SAP Note 73606 and from SAP Service Marketplace Quick Link /i18n → I18N Media Library.

Before installing a Unicode system you must consider the following topics:

1. Will you need additional hardware? See chapter Hardware requirements.

2. Do your ABAP programs comply with ABAP 6.10 syntax? Use transaction UCCHECK to find the lines that must be modified. Modify any program that is not compliant. For all programs set the attribute "Unicode enabled". See chapter Programming Languages.

3. If you have any C/C++ programs, are they Unicode-compliant?

4. Which languages will be required?

Consider all of the users who will be working in the system and determine which users need to work in their respective languages. Also consider languages that are necessary for you to conduct business.

Language Configuration

After determining which languages are required, open transaction I18N and choose I18N Customizing → I18N System Configuration.

This application is used for configuring languages in SAP systems. The I18N System Configuration automatically determines the settings required for a consistent i18n configuration, based on the set of languages selected. You can check all important i18n configuration tables and all important i18n application server profile parameters, and you can update the necessary database tables using the I18N System Configuration. Modifications to the application profile parameters, however, must be carried out manually.

You will get a message that this is a Unicode system. Confirm the message, then select and follow the instructions described there.


Web AS 6.20 and Web AS 6.40: You can activate up to 50 languages per system.

Web AS 7.00 and higher: You can activate up to 200 languages per system.

You can always check your current language configuration with Current NLS config.

Use Simulate to check the configuration and possible inconsistencies. You can simulate the settings as often as required. Do not activate before all inconsistencies have been checked and adjusted!

Fig. 12: I18N System Configuration

Enter languages

The tool suggests the set of languages which is entered in database table TCP0I. If TCP0I is empty, it suggests language EN (English) only. If you want to add languages, use Add. You can add all languages from the Default Set delivered by SAP (see Fig. 2).

If you want to add more languages or the F4 help in the Key field does not include the language key you require, you can add the language key(s) to the F4 help list. Select Extend Language List. This function extends the Default Language Set (as described in chapter Technical Language Support, Fig. 2) with any of the new Unicode Languages (chapter Technical Language Support, examples 1 and 2). Continue with the extended list as usual.


Enter country

If there is an existing entry, the tool displays the country in the fields next to Choose and also all possible countries in the list. In case of a new system installation or if no previous entry is found, it just displays 'Unicode' in both the field and the list. If you need to select the country, you can make the country list available by selecting Goto → Select Country (Unicode) from the menu bar.

Simulate configuration

Select Simulate from the toolbar.

Now the consistency of the new configuration is checked and a list of new setting information and problematic areas is generated, such as obsolete parameters, which must be adjusted manually before the configuration can be activated. No database updates are carried out in this mode. Review the output and make the necessary changes to the profile parameters if any are indicated.

Activate configuration and perform updates

Select Activate to complete the language configuration.

Manually update the profile parameter zcsa/installed_languages so that it will include all of the newly added languages. In the simulation mode (Fig. 13) you see an example of new parameter values after the configuration of two additional languages. Copy the new values and go to transaction RZ10 or RZ11. Replace the values of zcsa/installed_languages by pasting the copied values onto the current values of this parameter.

Fig. 13: Language Configuration Simulation
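The manual update of zcsa/installed_languages amounts to extending a list of language keys. The following sketch assumes that the parameter value is a string of one-character language keys (for example "DE" for German and English). The helper name and the sample keys are hypothetical, and the values shown by the simulation in transaction I18N remain the ones to copy.

```python
def merge_installed_languages(current: str, added: str) -> str:
    """Append new one-character language keys, keeping order and dropping duplicates.

    Assumption: each character of the parameter value is one language key.
    """
    merged = list(current)
    for key in added:
        if key not in merged:
            merged.append(key)
    return "".join(merged)

if __name__ == "__main__":
    # Hypothetical example: German and English installed, Japanese added.
    print(merge_installed_languages("DE", "J"))
```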


Logon Language

In Unicode SAP systems you can display and maintain data from ANY language with ANY logon language, provided the languages and locales have been installed and profile parameters have been maintained correctly as described in chapter Language Configuration.

The logon language determines the display language for menus, dynpros, system messages etc., i.e. the language the user is working in.

Translation Import

After having finished all code page related configurations, run transaction SMLT to configure, import or supplement the translations for all required languages. Call transaction SMLT and read the documentation which is available via pushbutton Documentation in the toolbar.

Printing in Unicode SAP Systems

Local printing with Single Code Page printers is still possible. To print multilingual data, Unicode printers are required.

Fig. 14

Supported Device Types

LEXUTF8

HPUTF8

Communication within Multilingual System Landscapes

In this chapter you will see how Unicode SAP systems integrate with other Unicode and with non-Unicode systems. You will also learn about the advantages of a homogeneous Unicode system landscape. The example shows that not every non-Unicode system can correctly deal with the text data it receives, whereas every Unicode system can. So, to be truly able to communicate without any code page incompatibilities, the only solution is a system landscape which is completely Unicode-based.

Important SAP Notes


SAP Note    Description

745030      MDMP - Unicode Interfaces: Solution Overview
            Provides a detailed overview of data exchange solutions between MDMP systems and Unicode.

656350      Master Data Transfer UNICODE <==> MDMP Systems with ALE
            Details the description and solution of a special development project which allows transferring language dependent master data between a UNICODE and MDMP system using ALE Technology.

820419      IDoc adapter: Incorrect field values with MDMP systems
            Information about using the Exchange Infrastructure (XI 3.0) for sending IDocs with the IDOC adapter to an MDMP system.

A multinational company has a Unicode SAP system at its headquarters in the US, a single code page SAP system in Japan (Shift-JIS or SJIS) and an MDMP SAP system in Australia (Latin-1/SJIS/Thai).

Japanese and English (7Bit ASCII) data can be sent and received by all offices, but the Japanese office cannot receive all data with Thai characters from the Australian office, because SJIS does not contain those characters.

Data Transfer in a Unicode/non-Unicode System Landscape

Data transfer between two Unicode systems is always unproblematic, no matter whether text data contain language information or not.

Communication between two systems will be problematic in the following cases:

1. Sender and receiver system deploy different code pages. If you log on to a Unicode system with language EN and maintain Japanese data which are then transferred (for example via RFC) or transported into a non-Unicode system which has no Japanese code page, the Japanese data will be corrupted in the non-Unicode system.

2. JAVA applications communicate with non-Unicode SAP components: as JAVA uses Unicode for text processing and Unicode is a superset of all old non-Unicode code pages, there is always a danger of data loss in the communication between JAVA and non-Unicode software. This applies to the communication between the ABAP stack and the Java stack within a NetWeaver application as well as to the communication of the ABAP stack with external JAVA applications.

3. A Unicode system communicates with an Asian non-Unicode system (double-byte code page).
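The data loss described in these cases can be reproduced with standard Python codecs. This is a simplified sketch and not an RFC implementation: the codecs latin-1 and shift_jis stand in for a Latin-1 and a Japanese single code page system, the function names are illustrative, and the replacement character used by a real system may differ from the "?" shown here.

```python
def transfer(text: str, receiver_codec: str) -> str:
    """Simulate sending Unicode text to a receiver that stores it in `receiver_codec`.

    Characters missing from the receiver's code page are replaced by '?'.
    """
    return text.encode(receiver_codec, errors="replace").decode(receiver_codec)

def is_lossless(text: str, receiver_codec: str) -> bool:
    """True if the receiver can represent every character of `text`."""
    return transfer(text, receiver_codec) == text

if __name__ == "__main__":
    samples = {"English": "Order 4711", "Japanese": "日本語", "Thai": "ไทย"}
    for receiver in ("latin-1", "shift_jis", "utf-16-le"):
        for label, text in samples.items():
            print(f"{label:8} -> {receiver:10} lossless: {is_lossless(text, receiver)}")
```

Only the Unicode receiver accepts all three samples without loss. The Japanese code page cannot hold the Thai text, and the Latin-1 code page can hold neither the Japanese nor the Thai text.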

Fig. 15

100% data transfer


7Bit ASCII data transfer; solution for 100% data transfer being implemented in some applications

7Bit ASCII and some additional data transfer

100% 7Bit ASCII data transfer only

No solution for data transfer yet - under investigation!

Smooth data transfer can only be guaranteed between Unicode systems and between a Unicode system and a JAVA application (see Fig. 15). As MDMP is completely unknown in the JAVA world, there is no communication possible between an MDMP system and JAVA applications.

Fig. 16 System Communication (matrix of sender against receiver for Single Code Page, MDMP, JAVA Application and Unicode; figure not reproduced)

Fig. 17 Example: Communication between Single Code Page SAP Systems (matrix of sender* against receiver for ISO-1, ISO-2, ISO-3, ISO-5, ISO-6, ISO-7, ISO-9, ISO-11, SJIS, Big 5, KSC 5601 and GB 1324; figure not reproduced)

*ISO-X = ISO 8859-X

Fig. 18 Common errors:

communication of Unicode system with non-Unicode system (reason: wrong language key)

file upload/download (reason: wrong code page)
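The upload/download error can be illustrated by reading the same bytes with the right and with the wrong code page. This is a sketch with standard Python codecs; the sample word and the function name are illustrative and not taken from an SAP interface.

```python
def read_with_code_page(raw: bytes, assumed_codec: str) -> str:
    """Decode file content using the code page the reader assumes."""
    return raw.decode(assumed_codec, errors="replace")

if __name__ == "__main__":
    utf8_file = "Käse".encode("utf-8")       # file written by a Unicode system
    latin1_file = "Käse".encode("latin-1")   # file written by a Latin-1 system

    print(read_with_code_page(utf8_file, "utf-8"))     # correct code page
    print(read_with_code_page(utf8_file, "latin-1"))   # wrong code page: garbled
    print(read_with_code_page(latin1_file, "utf-8"))   # wrong code page: replacement character
```

In both wrong cases the bytes are unchanged and only the interpretation differs, so the data can be recovered by reading the file again with the correct code page.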


Transport between Unicode and non-Unicode SAP Systems

Transporting objects between UC and non-UC systems is technically supported. There are some restrictions, however, which are described in SAP Notes 638357 and 80727.

RFC Library

As of Web AS 6.10, RFC enhancements guarantee smooth data communication between Unicode and non-Unicode systems and with non-SAP products as well. All necessary data conversions occur in the Unicode system; therefore you do not need to make any changes to your existing non-Unicode systems. When configuring destinations in the Unicode systems, you simply have to declare the RFC destination as a non-Unicode system, and then the data will be converted with the appropriate code pages and language keys of the non-Unicode destination system.

The RFC Library exists in a Unicode and a non-Unicode version. Thus, the Unicode RFC Library is forward and backward compatible, i.e. a current Unicode RFC application can communicate with any non-Unicode RFC application independently of its release and vice versa.

The Unicode Library is able to communicate with any RFC partner, regardless of whether the partner is Unicode or non-Unicode. The RFC Library distinguishes two cases:

1. Both RFC partner and RFC client use a Unicode (or non-Unicode) system. The data will be sent to the RFC server as it is, i.e. there is no conversion for character-like data at sender side. The receiver converts the data into its own internal format. Note: This RFC connection only works 100% if sender and receiver system do not deploy different code pages, i.e. if they are both Unicode systems (see Fig. 15) or they use the same non-Unicode code page (see Fig. 16)!

2. Only one RFC partner uses a Unicode system. In this case the Unicode system must convert the data into a suitable ASCII data format before sending it. When the RFC converts text data between Unicode and MDMP systems it converts from/to the code page in which the text data are encoded in the MDMP system. The encoding code page depends on the text language which is taken from the language field of the table. If the table has no language field, the matching code page will be determined according to the logon language. For example, if the logon language is Japanese, the Unicode partner will convert the character-like data into code page 8000 before sending it. This code page is called communication code page. For more information about RFC connections between Unicode and non-Unicode systems read the following SAP Notes:

547444 (RFC Enhancement for Unicode ./. MDMP Connections)

480671 (The Text Language Flag of LANG Fields)

722193 (RFC legacy MDMP callers and Unicode callees)


647495 (RFC for Unicode ./. MDMP Connections)

790485 (RFC Problem Single Code Page, non-Unicode to Unicode System)
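The rule for choosing the communication code page (language field of the table first, logon language as fallback) can be sketched as follows. The sketch is illustrative and not the RFC Library's implementation. Code page 8000 for Japanese is taken from the text above, while the entry 1100 for English, the two-letter language keys, the field name "language" and the function name are assumptions.

```python
from typing import Optional

# Illustrative mapping of language key -> code page number.
# 8000 for Japanese comes from the text; 1100 for English is an assumption.
CODE_PAGE_BY_LANGUAGE = {"JA": 8000, "EN": 1100}

def communication_code_page(row: dict, logon_language: str,
                            language_field: Optional[str] = "language") -> int:
    """Pick the code page from the row's language field, else from the logon language."""
    language = row.get(language_field) if language_field else None
    if not language:
        language = logon_language  # table has no (filled) language field
    return CODE_PAGE_BY_LANGUAGE[language]

if __name__ == "__main__":
    # Row with a language field: the text language decides.
    print(communication_code_page({"language": "JA", "text": "日本語"}, logon_language="EN"))
    # Row without a language field: the logon language decides.
    print(communication_code_page({"text": "日本語"}, logon_language="JA"))
```

The second call shows why a wrong logon language is a common source of errors: without a language field, the same Japanese text would be converted with code page 1100 if the user had logged on in English.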

For information about how to use the RFC Library see the RFC documentation on SAP Service Marketplace. Go to service.sap.com/rfc-library. Select Media Library → RFC Library Guide.

From non-Unicode SAP System to Unicode

There are four ways to your Unicode SAP system:

1. New installation of a Unicode SAP system (as of mySAP ERP 2005/SAP NetWeaver 2004s all new installations are Unicode systems)

2. Conversion of non-Unicode SAP system to Unicode (supported as of Web AS 6.20)

3. Combined Upgrade & Unicode Conversion to target release SAP ERP 2005
SAP_BASIS 4.6C 6.20 and 6.40: general availability (read SAP Note 928729 for restrictions)

4. Twin Upgrade & Unicode Conversion
Release independent: general availability (read SAP Note 959698 for restrictions)

A system conversion from MDMP to Unicode requires some additional consideration and more preparation and conversion steps than a Single Code Page system conversion. Make sure you deploy Unicode-based mySAP components (listed in SAP Note 79991) and a Unicode-enabled database (for current information on databases supported by Unicode, see section Database and Platform Support in this document and SAP Note 379940). Also make sure you read the applicable documentation listed in the Appendix.

Appendix

Documentation

New Installation of Unicode SAP Systems

If you have decided to install a new Unicode SAP system, you need the Installation Guide for your database/platform combination, and SAP Note 544623.

Conversion of non-Unicode SAP to Unicode

If you have decided to convert existing SAP systems to Unicode, the following documentation isrequired:

1. "Unicode Conversion Guide". Go to SAP Service Marketplace Quick Link /unicode@sap → Unicode Library → Unicode Conversion Library.

2. "Homogeneous or Heterogeneous System Copy for SAP Systems Based on SAP Web AS 6.xx". Go to SAP Service Marketplace Quick Link /instguides.

Combined Upgrade & Unicode Conversion

Download the Combined Upgrade & Unicode Conversion Guide 4.6C from SAP Note 928729.

Download the Component Upgrade Guide and the Installation Guide for your database/platform combination from SAP Service Marketplace Quick Link /instguides.


Download the System Copy Guide from SAP Service Marketplace Quick Link /systemcopy.

Further Information

Visit the websites of the SAP NW Internationalization & Printing team for detailed technical information:
http://service.sap.com/unicode@sap
http://service.sap.com/output
https://www.sdn.sap.com/irj/sdn/i18n

Visit the website of the SAP Globalization team for detailed customer and project information:
www.service.sap.com/unicode

For general information about Unicode visit the public website of the Unicode Consortium:
http://www.unicode.org

Overview of pre-Unicode code pages:
http://czyborra.com/charsets

Contacts

For information about Unicode Conversion project status: mailto:[email protected]