Round Trip Safety Configuration of Shift-JIS Characters

Shift-JIS is a character encoding system for Japanese characters. It is equivalent to ASCII, a character encoding system for English characters.

Native Encoding and Unicode

Because Shift-JIS and ASCII both define characters for one language, they are native encoding systems. Unicode is a character encoding system that defines characters for all languages. Because software is used in a global, multilingual environment, characters for processing by computers must often be converted between native encoding systems and Unicode.

Round Trip Safety

Issues associated with conversions between native encoding systems and Unicode are referred to as round trip safety issues.

Using Unicode, applications are developed that can handle input from different languages at the same time. Input data, which is entered by users or retrieved from databases, may contain characters encoded in a native encoding system. For example, in Microsoft Windows operating system, English characters input by a user are encoded using Windows-1252.

When an application receives characters in a native encoding system, it converts the characters into Unicode for processing. After the processing is finished, the characters may be converted back into the native encoding system.

In most cases, the characters are converted without ambiguity because each native character is mapped to a single Unicode character. If the conversion of a native language character to and from Unicode results in the original character, the character is considered round trip safe.

For example, the character "A" is round trip safe in Windows-1252, as follows:

  • The Windows-1252 character for "A" is 0x41.
  • It converts to Unicode U+0041.
  • No other Windows-1252 character converts to the same Unicode character, so it always converts back to 0x41.

Issues Specific to Shift-JIS

Although the characters from most native character encoding systems are round trip safe, the Shift-JIS encoding system is an exception. Approximately 400 characters in Shift-JIS are not round trip safe because multiple characters in this group can be mapped to the same Unicode character. For example, the Shift-JIS characters 0x8790 and 0x81e0 both convert to the Unicode character U+2252.

IBM Cognos BI and Shift-JIS

IBM® Cognos® Business Intelligence uses Unicode. The round trip safety of characters is essential to ensure the accuracy of data in generated reports.

The Round Trip Safety Configuration utility ensures the round trip safety of Shift-JIS characters only when it is used both to convert characters:

  • from Shift-JIS to Unicode
  • from Unicode to Shift-JIS

If data is requested from a database that has its own automatic mechanism for Shift-JIS to Unicode conversion, IBM Cognos BI does not call the Round Trip Safety Configuration utility to convert the characters from Unicode to Shift-JIS. The round trip safety of characters in the data cannot be ensured.

For more information on The Round Trip Safety Configuration utility, see The Round Trip Safety Configuration Utility.