Eclipse Globalization Guidelines

From Eclipsepedia

Revision as of 13:58, 7 January 2008 by Kitlo.us.ibm.com (Talk | contribs)

This document is a working draft.
Kit Lo
Last Updated: January 2008
Note: Please use Babel committers mailing list to add comments instead of embedding them in this document.

Introduction

Eclipse is an open source project with contributors around the world, and every developer’s programming style is a little different. Developers are often unaware that small choices in their programs can have a large impact on the globalization of Eclipse. This document defines a set of guidelines for Eclipse globalization. If all Eclipse developers and translators adopt these guidelines, Eclipse will be globalized more consistently and more successfully.

The following sections discuss the rationale behind each globalization guideline. A summary of the Eclipse Globalization Guidelines is listed at the end of the document.

Eclipse Resource Bundle Process

Standard Java resource bundles are not very memory efficient. The keys for a resource bundle are stored in the class files of the plug-in, which are in turn held in the JVM’s data structures. Each resource bundle is loaded in its entirety, even if no string is ever looked up from it.

Eclipse came up with an alternate resource bundle process in release 3.1 to improve memory usage. A conversion tool was provided to migrate to the new resource bundle process. See Message Bundle Conversion Tool for more information.

All ResourceBundle file contents should be encoded in UTF-8

Different operating systems running with different language settings may save files in different encodings. To avoid confusion, we recommend that all Eclipse developers and translators encode both ResourceBundle source files and translated ResourceBundle files in UTF-8.

Guideline 1.1

All ResourceBundle source file contents should be encoded in UTF-8.

Guideline 2.1

All translated ResourceBundle file contents should be encoded in UTF-8.
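
The same recommendation applies when a file is read back. The sketch below is a minimal illustration and not part of the Eclipse resource bundle process: it assumes a properties-style file saved as UTF-8 and decodes it through an explicit Reader, because Properties.load(InputStream) decodes bytes as ISO-8859-1. The class name, method name, and the sample key and value are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

class Utf8BundleDemo {

    // Loads key/value pairs from UTF-8 encoded bytes by decoding them
    // through a Reader with an explicit charset.
    static Properties loadUtf8(byte[] utf8Bytes) {
        Properties props = new Properties();
        try (Reader reader = new InputStreamReader(
                new ByteArrayInputStream(utf8Bytes), StandardCharsets.UTF_8)) {
            props.load(reader);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        // Hypothetical translated entry; the byte array stands in for a file saved as UTF-8.
        // \u00fc is u-umlaut and \u00df is sharp s.
        byte[] fileContents = "greeting = Gr\u00fc\u00dfe\n".getBytes(StandardCharsets.UTF_8);
        Properties props = loadUtf8(fileContents);
        System.out.println(props.getProperty("greeting"));
    }
}
```

Reading the same bytes with an ISO-8859-1 decoder would turn each non-ASCII character into two wrong characters, which is the kind of confusion the guideline is meant to avoid.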

Single Quote Handling

Depending on whether the code will process the message string through the Java MessageFormat class, single quotes in ResourceBundle files should be written differently. If the string will be processed by MessageFormat, then any single quote in the string must be doubled. Otherwise, a single quote should not be doubled. Translators who do not have access to the code cannot tell if the message string would be processed by MessageFormat.

Most Java programmers process a message string with MessageFormat only if the message string contains replacement variables. We recommend that all Eclipse developers follow these guidelines when creating message strings and writing the code that processes them:

  • Strings which contain replacement variables are processed by the MessageFormat class (single quote must be coded as 2 consecutive single quotes).
  • Strings which do NOT contain replacement variables are NOT processed by the MessageFormat class (single quote must be coded as 1 single quote).

Guideline 1.2

Process message string with Java MessageFormat class only if the message string contains replacement variables.

Here are examples of message strings in a ResourceBundle file and the text displayed to the end user:

  • ResourceBundle file:
    String_1 = No variable. 1 quote (') 2 (''), 3 ('''), 4 ('''').
    String_2 = Variable {1}. 2 quotes(''), 4 ('''').
  • Text displayed to end user:
    String_1 => No variable. 1 quote (') 2 (''), 3 ('''), 4 ('''').
    String_2 => Variable xxx. 2 quotes('), 4 ('').
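
The example above can be reproduced with a short sketch. It assumes the two strings have already been read from the bundle into constants; the class and method names are hypothetical, and the first argument passed for String_2 is a placeholder because the message refers only to {1}.

```java
import java.text.MessageFormat;

class QuoteHandlingDemo {

    // The two message strings exactly as they appear in the ResourceBundle file.
    static final String STRING_1 = "No variable. 1 quote (') 2 (''), 3 ('''), 4 ('''').";
    static final String STRING_2 = "Variable {1}. 2 quotes(''), 4 ('''').";

    // No replacement variable: the string is displayed as is, without MessageFormat.
    static String showString1() {
        return STRING_1;
    }

    // Replacement variable present: the string goes through MessageFormat,
    // which collapses each doubled single quote into one.
    static String showString2(Object... args) {
        return MessageFormat.format(STRING_2, args);
    }

    public static void main(String[] args) {
        System.out.println(showString1());
        // prints: No variable. 1 quote (') 2 (''), 3 ('''), 4 ('''').
        System.out.println(showString2("unused", "xxx"));
        // prints: Variable xxx. 2 quotes('), 4 ('').
    }
}
```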

Non-translatable Message Strings

Special comments may be added to mark all text between the start and end comments as non-translatable. Each comment must start in column 1 and must match this format exactly:

  • ListResourceBundles:
    Start comment: // START NON-TRANSLATABLE
    End comment: // END NON-TRANSLATABLE
  • PropertyResourceBundles:
    Start comment: # START NON-TRANSLATABLE
    End comment: # END NON-TRANSLATABLE

Guideline 1.3

Enclose non-translatable message strings with special comments.
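
To illustrate how the markers partition a file, the sketch below shows one way a tool might honor them in a PropertyResourceBundle. This is an assumption made for illustration, not a description of any actual translation tool: the class, the method, and the sample keys are hypothetical. A marker counts only when the whole line matches exactly, which also enforces the column 1 rule.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class NonTranslatableDemo {

    static final String START = "# START NON-TRANSLATABLE";
    static final String END = "# END NON-TRANSLATABLE";

    // Returns the entries that lie outside the marker comments.
    // An indented or reworded marker is not recognized.
    static List<String> translatableLines(List<String> lines) {
        List<String> result = new ArrayList<>();
        boolean insideMarkedBlock = false;
        for (String line : lines) {
            if (line.equals(START)) {
                insideMarkedBlock = true;
            } else if (line.equals(END)) {
                insideMarkedBlock = false;
            } else if (!insideMarkedBlock && !line.startsWith("#") && !line.trim().isEmpty()) {
                result.add(line);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical file contents.
        List<String> file = Arrays.asList(
                "# START NON-TRANSLATABLE",
                "product.id = org.example.sample",
                "# END NON-TRANSLATABLE",
                "dialog.title = Open File");
        System.out.println(translatableLines(file));
        // prints: [dialog.title = Open File]
    }
}
```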

Prevent Fragmentation of Messages

Messages tend to occupy a lot of storage space, so developers naturally try to economize with inventive techniques, such as using English phrases or parts of words as building blocks for complete messages. However, if the final form of a message relies on the composition of several parts, that message may not be translatable at all. Translation can change the order of the parts of speech, and words can take different forms depending on the context.

Guideline 1.4

Do not construct sentences from parts of sentences. Do not construct words from parts of words.

A phrase or word that can be inserted into several English messages may require a different case, gender, or plural ending when translated into other languages. The small amount of space saved cannot compensate for the confusion and trouble caused in translation.

Example 1: To report the operational status of the peripheral devices, a programmer creates the following two columns of terms:

Device          Status
Display         operational
Control unit    off line
Printer         busy
Serial port     defective

By selecting one item from each column, the product can generate 16 messages while storing only 8 terms. Two of these messages would be:

  • Display operational
  • Control unit operational

In French, however, the nouns for display and control unit have different genders and therefore require different forms of the adjective operational:
  • Poste opérationnel
  • Unité de contrôle opérationnelle
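
One way to follow Guideline 1.4 in this example is to store each complete message under its own key, so that every message is translated as a whole. The sketch below assumes this approach; plain maps stand in for the English and French resource bundles, and the class, method, and key names are hypothetical. Only the lookup key is assembled from parts, and the key is never shown to the user.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

class WholeMessageDemo {

    // One complete message per device and status combination.
    static final Map<String, String> ENGLISH = new HashMap<>();
    static final Map<String, String> FRENCH = new HashMap<>();

    static {
        ENGLISH.put("display.operational", "Display operational");
        ENGLISH.put("controlUnit.operational", "Control unit operational");
        // Unicode escapes keep this source file ASCII-only:
        // \u00e9 is e-acute and \u00f4 is o-circumflex.
        FRENCH.put("display.operational", "Poste op\u00e9rationnel");
        FRENCH.put("controlUnit.operational", "Unit\u00e9 de contr\u00f4le op\u00e9rationnelle");
    }

    // Looks up the complete message; the translator has already chosen
    // the correct form of the adjective for each noun.
    static String statusMessage(Locale locale, String device, String status) {
        boolean french = Locale.FRENCH.getLanguage().equals(locale.getLanguage());
        Map<String, String> bundle = french ? FRENCH : ENGLISH;
        return bundle.get(device + "." + status);
    }

    public static void main(String[] args) {
        System.out.println(statusMessage(Locale.ENGLISH, "controlUnit", "operational"));
        System.out.println(statusMessage(Locale.FRENCH, "display", "operational"));
        System.out.println(statusMessage(Locale.FRENCH, "controlUnit", "operational"));
    }
}
```

This stores 16 messages instead of 8 terms for the full table, which is the storage cost the guideline accepts in exchange for translatable text.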

Example 2: In English, the word day can be prefixed with the following word fragments to form the seven days of the week:
  • Sun
  • Mon
  • Tues
  • Wednes
  • Thurs
  • Fri
  • Satur

This technique, however, does not work in most other languages. In the table below, most French names end in "di" and most German names end in "tag", but the pattern breaks for Sunday in French (Dimanche) and Wednesday in German (Mittwoch).

English      French      German
Monday       Lundi       Montag
Tuesday      Mardi       Dienstag
Wednesday    Mercredi    Mittwoch
Thursday     Jeudi       Donnerstag
Friday       Vendredi    Freitag
Saturday     Samedi      Samstag
Sunday       Dimanche    Sonntag
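
For day names in particular, a program does not need to assemble anything. As a minimal sketch, and as an alternative technique that this document does not itself prescribe, the standard java.time API returns each complete day name for a given locale from the locale data shipped with the JDK. The class and method names below are hypothetical, and capitalization follows the JDK locale data, so it may differ from the table above.

```java
import java.time.DayOfWeek;
import java.time.format.TextStyle;
import java.util.Locale;

class DayNameDemo {

    // Returns the complete, locale-specific name of a day of the week.
    static String dayName(DayOfWeek day, Locale locale) {
        return day.getDisplayName(TextStyle.FULL, locale);
    }

    public static void main(String[] args) {
        // Print one row per day: English, French, German.
        for (DayOfWeek day : DayOfWeek.values()) {
            System.out.println(dayName(day, Locale.ENGLISH) + "  "
                    + dayName(day, Locale.FRENCH) + "  "
                    + dayName(day, Locale.GERMAN));
        }
    }
}
```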


Avoid using a slash to mean "and" or "or"

Avoid using a slash to mean "and" or "or" because it is ambiguous in English and does not exist in other languages.

Guideline 1.5

Do not use a slash to mean "and" or "or".

Avoid forming plurals by adding "(s)" to indicate either singular or plural form

Avoid forming plurals by adding "(s)" to indicate either singular or plural form, even when space is a problem (such as in messages, headings, captions, tables, or art callouts). Other languages form the plural of a noun in many different ways. Include in your sentence either the singular form, the plural form, or both forms.

Guideline 1.6

Avoid forming plurals by adding "(s)" to indicate either singular or plural form.
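
When the count is known at run time, one possible alternative to "(s)" is a choice pattern in which every branch is a complete sentence. The sketch below is an assumption for illustration: this document does not prescribe the choice format, and the message text and names are hypothetical. The choice format selects a branch by numeric range only, so translators may need to supply different branches for languages whose plural rules differ from English.

```java
import java.text.MessageFormat;
import java.util.Locale;

class PluralDemo {

    // Hypothetical English pattern: one complete sentence per branch
    // instead of a single "file(s)" message.
    static final String FILES_DELETED =
            "{0,choice,0#No files were deleted.|1#One file was deleted.|1<{0} files were deleted.}";

    // Formats the message for the given count using the English pattern.
    static String filesDeleted(int count) {
        MessageFormat format = new MessageFormat(FILES_DELETED, Locale.ENGLISH);
        return format.format(new Object[] { count });
    }

    public static void main(String[] args) {
        System.out.println(filesDeleted(0)); // prints: No files were deleted.
        System.out.println(filesDeleted(1)); // prints: One file was deleted.
        System.out.println(filesDeleted(3)); // prints: 3 files were deleted.
    }
}
```

Because this string contains a replacement variable, it is processed by MessageFormat, so any single quote in it would have to be doubled as described under Single Quote Handling.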

Avoid abbreviations, acronyms, and special symbols

Abbreviations can be misunderstood by translators and by readers. Rules for abbreviation vary from language to language. Never assume that translators can understand the meaning of your abbreviations or can abbreviate their translations similarly.

Guideline 1.7

Avoid abbreviations, acronyms, and special symbols.

If you must use an abbreviation or acronym, ensure that your translator knows its exact meaning, and allow enough space for the expression to be spelled out fully in other languages.

Example: North Americans use # for number, ' for feet or minutes, " for inches or seconds, and c/o for care of. Other languages may not have such short-form equivalents.

Eclipse Globalization Guidelines

Globalization Guidelines for Developers

Guideline 1.1

All ResourceBundle source file contents should be encoded in UTF-8.

Guideline 1.2

Process message string with Java MessageFormat class only if the message string contains replacement variables.

Guideline 1.3

Enclose non-translatable message strings with special comments.

Guideline 1.4

Do not construct sentences from parts of sentences. Do not construct words from parts of words.

Guideline 1.5

Do not use a slash to mean "and" or "or".

Guideline 1.6

Avoid forming plurals by adding "(s)" to indicate either singular or plural form.

Guideline 1.7

Avoid abbreviations, acronyms, and special symbols.

Globalization Guidelines for Translators

Guideline 2.1

All translated ResourceBundle file contents should be encoded in UTF-8.
