Internationalization in your projects

Question

How have you implemented Internationalization (i18n) in actual projects you've worked on?

I took an interest in making software cross-cultural after I read the famous post by Joel, "The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)". However, I have yet to be able to take advantage of this in a real project, beyond making sure I used Unicode strings where possible. Making all your strings Unicode and knowing the encoding of everything you work with is just the tip of the i18n iceberg.

Everything I have worked on to date has either been for a controlled set of US English speakers, or i18n just wasn't something we had time to work on before pushing the project live. So I am looking for any tips or war stories people have about making software more localized in real-world projects.

Solution

It has been a while, so this is not comprehensive.

Character Sets

Unicode is great, but you can't get away with ignoring other character sets. The default character set on Windows XP (English) is Cp1252. On the web, you don't know what a browser will send you (though hopefully your container will handle most of this). And don't be surprised when there are bugs in whatever implementation you are using. Character sets can have interesting interactions with filenames when they move between machines.
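To make the character set point concrete, here is a minimal Java sketch that decodes the same bytes with two different charsets. The class and method names are illustrative only, and it assumes the JVM provides the `windows-1252` charset (mainstream JDKs do, but the Java specification does not strictly guarantee it).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Illustrative helper: shows what happens when bytes are decoded
// with a charset other than the one used to encode them.
final class CharsetDemo {

    // Decode with an explicitly named charset instead of the platform default.
    static String decode(byte[] bytes, String charsetName) {
        return new String(bytes, Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        // The accented letter is written as a Unicode escape so that the
        // encoding of this source file does not matter.
        byte[] utf8Bytes = "caf\u00e9".getBytes(StandardCharsets.UTF_8);

        // Decoded with the charset that was used to encode: text is intact.
        System.out.println(decode(utf8Bytes, "UTF-8"));
        // Decoded as Cp1252: the single accented letter turns into two characters.
        System.out.println(decode(utf8Bytes, "windows-1252"));
    }
}
```

The design point is simply that the charset is named at every boundary where bytes become text, rather than left to the platform default.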

Translating Strings

Translators are, generally speaking, not coders. If you send a source file to a translator, they will break it. Strings should be extracted to resource files (e.g. properties files in Java or resource DLLs in Visual C++). Translators should be given files that are difficult to break and tools that don't let them break them.
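As a rough illustration of extracting strings, the following Java sketch looks up UI text by key through a `ResourceBundle`. The keys and the `MESSAGES_EN` text are hypothetical; the properties content is inlined as a string only to keep the example self-contained, whereas in a real project it would be a separate file handed to translators.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.PropertyResourceBundle;
import java.util.ResourceBundle;

final class ResourceLookupDemo {

    // Stand-in for the contents of a hypothetical messages.properties file.
    static final String MESSAGES_EN =
            "dialog.save.title=Save document\n"
          + "dialog.save.confirm=Save\n"
          + "dialog.save.cancel=Cancel\n";

    // Parse properties-format text into a bundle.
    static ResourceBundle load(String propertiesText) {
        try {
            return new PropertyResourceBundle(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        ResourceBundle messages = load(MESSAGES_EN);
        // The code refers to keys only; the displayed text lives outside the source.
        System.out.println(messages.getString("dialog.save.title"));
    }
}
```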

Translators do not know where strings come from in a product. It is difficult to translate a string without context. If you do not provide guidance, the quality of the translation will suffer.

While on the subject of context, you may see the same string "foo" crop up multiple times and think it would be more efficient to have all instances in the UI point to the same resource. This is a bad idea. Words may be very context-sensitive in some languages.
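One way to act on both points about context is sketched below in Java: each use of a string gets its own key, and a comment line tells the translator where it appears. The keys and file content are hypothetical. The English text for both entries would be "Open", yet German needs a verb for the button and an adjective for the status.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

final class ContextKeysDemo {

    // Hypothetical German resource file. Lines starting with '#' are comments
    // that give the translator context; the loader ignores them.
    static final String MESSAGES_DE =
            "# Button label: opens the selected file (verb)\n"
          + "file.open.button=\u00d6ffnen\n"
          + "# Status column: the ticket has not been closed yet (adjective)\n"
          + "ticket.status.open=Offen\n";

    static Properties load(String propertiesText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        Properties de = load(MESSAGES_DE);
        // Two keys, one English word, two different German translations.
        System.out.println(de.getProperty("file.open.button"));
        System.out.println(de.getProperty("ticket.status.open"));
    }
}
```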

Translating strings costs money. When you release a new version of a product, it makes sense to reuse the translations from the old version. Have tools to recover strings from your old resource files.
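A recovery tool can be quite small. The Java sketch below is one possible approach, not a description of any existing tool: it carries a translation over only when the key still exists and its source-language text is unchanged, so everything else can be sent out for translation. All names and sample strings are hypothetical, and `Map.of` assumes Java 9 or later.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class TranslationReuse {

    // Returns the translations that can be carried over to the new release:
    // the key still exists and its source-language text has not changed.
    static Map<String, String> reusable(Map<String, String> oldSource,
                                        Map<String, String> oldTranslated,
                                        Map<String, String> newSource) {
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : newSource.entrySet()) {
            String key = entry.getKey();
            boolean unchanged = entry.getValue().equals(oldSource.get(key));
            if (unchanged && oldTranslated.containsKey(key)) {
                result.put(key, oldTranslated.get(key));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> oldEnglish = Map.of("print", "Print", "save", "Save");
        Map<String, String> oldGerman = Map.of("print", "Drucken", "save", "Speichern");
        // In the new release "save" was reworded and "export" is new.
        Map<String, String> newEnglish =
                Map.of("print", "Print", "save", "Save file", "export", "Export");

        // Only "print" is reusable; "save" and "export" need a translator.
        System.out.println(reusable(oldEnglish, oldGerman, newEnglish));
    }
}
```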

String concatenation and manual manipulation of strings should be minimized. Use the format functions where applicable.
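In Java, `java.text.MessageFormat` is one such format function. The sketch below uses hypothetical patterns; the second one merely stands in for a translation whose grammar needs the arguments in a different order, which concatenation such as `"Page " + n + " of " + total` cannot accommodate.

```java
import java.text.MessageFormat;
import java.util.Locale;

final class MessageFormatDemo {

    // Fill the numbered placeholders of a pattern using the given locale.
    static String render(String pattern, Locale locale, Object... args) {
        return new MessageFormat(pattern, locale).format(args);
    }

    public static void main(String[] args) {
        String english = "Page {0} of {1}";
        // Stand-in for a translated pattern that puts the arguments in another order.
        String reordered = "{1} pages in total, showing page {0}";

        // The calling code is identical; only the pattern from the resource file differs.
        System.out.println(render(english, Locale.ENGLISH, 2, 10));
        System.out.println(render(reordered, Locale.ENGLISH, 2, 10));
    }
}
```

Because the placeholders are numbered, the translator can move them freely without anyone touching the code.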

Translators need to be able to modify hotkeys. Ctrl+P is print in English; the Germans use Ctrl+D.

If you have a translation process that requires someone to manually cut and paste strings at any time, you are asking for trouble.

Dates, Times, Calendars, Currency, Number Formats, Time Zones

These can all vary from country to country. A comma may be used as the decimal separator. Times may be in 24-hour notation. Not everyone uses the Gregorian calendar. You need to be unambiguous, too. If you take care to display dates as MM/DD/YYYY for the USA and DD/MM/YYYY for the UK on your website, the dates are ambiguous unless the user knows you've done it.
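The following Java sketch illustrates both halves of that paragraph: locale-dependent number formatting, and the ambiguity of all-numeric dates compared with a spelled-out month. The helper names are made up for this example.

```java
import java.text.NumberFormat;
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

final class LocaleFormatDemo {

    // Format a number using the separators of the given locale.
    static String number(double value, Locale locale) {
        return NumberFormat.getNumberInstance(locale).format(value);
    }

    // All-numeric date: the reader has to guess the field order.
    static String numericDate(LocalDate date, String pattern) {
        return DateTimeFormatter.ofPattern(pattern).format(date);
    }

    // Month spelled out in the given locale: no guessing required.
    static String spelledOutDate(LocalDate date, Locale locale) {
        return DateTimeFormatter.ofPattern("d MMMM yyyy", locale).format(date);
    }

    public static void main(String[] args) {
        System.out.println(number(1234.5, Locale.US));      // comma groups, point decimal
        System.out.println(number(1234.5, Locale.GERMANY)); // point groups, comma decimal

        LocalDate date = LocalDate.of(2024, 3, 4); // 4 March 2024
        System.out.println(numericDate(date, "MM/dd/yyyy")); // US order
        System.out.println(numericDate(date, "dd/MM/yyyy")); // UK order, could be read as 3 April
        System.out.println(spelledOutDate(date, Locale.UK));
        System.out.println(date); // ISO 8601 form, year first
    }
}
```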

Especially Currency

The Locale functions provided in the class libraries will give you the local currency symbol, but you can't just stick a pound (sterling) or euro symbol in front of a value that gives a price in dollars.
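A short Java sketch of the trap, with hypothetical helper names: the first method lets the display locale pick the currency symbol, the second keeps the currency attached to the amount and uses the locale for presentation only. Neither converts the value; conversion is a separate business decision.

```java
import java.math.BigDecimal;
import java.text.NumberFormat;
import java.util.Currency;
import java.util.Locale;

final class CurrencyDemo {

    // Misleading: formats the bare number with whatever currency the locale implies.
    static String naive(BigDecimal amount, Locale displayLocale) {
        return NumberFormat.getCurrencyInstance(displayLocale).format(amount);
    }

    // Safer: the currency travels with the amount; the locale only controls presentation.
    static String explicit(BigDecimal amount, Currency currency, Locale displayLocale) {
        NumberFormat format = NumberFormat.getCurrencyInstance(displayLocale);
        format.setCurrency(currency);
        return format.format(amount);
    }

    public static void main(String[] args) {
        BigDecimal priceInDollars = new BigDecimal("10.00");

        // Shows a pound sign although the price is in dollars.
        System.out.println(naive(priceInDollars, Locale.UK));
        // Labelled as US dollars; the exact symbol depends on the JDK's locale data.
        System.out.println(explicit(priceInDollars, Currency.getInstance("USD"), Locale.UK));
    }
}
```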

User Interfaces

Layout should be dynamic. Not only are strings likely to double in length on translation, the entire UI may need to be inverted (Hebrew; Arabic) so that the controls run from right to left. And that is before we get to Asia.
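Deciding when to mirror a layout is normally the UI toolkit's job and is driven by the locale. As a small related check, the Java sketch below uses `java.text.Bidi` to ask whether a translated string has a right-to-left base direction. The class name is illustrative, and this is only a text-level test, not a layout engine.

```java
import java.text.Bidi;

final class TextDirectionDemo {

    // True when the Unicode bidirectional algorithm gives the text a
    // right-to-left base direction (decided by its first strong character).
    static boolean isRightToLeft(String text) {
        Bidi bidi = new Bidi(text, Bidi.DIRECTION_DEFAULT_LEFT_TO_RIGHT);
        return !bidi.baseIsLeftToRight();
    }

    public static void main(String[] args) {
        String english = "Settings";
        // Hebrew word for "settings", written with Unicode escapes.
        String hebrew = "\u05d4\u05d2\u05d3\u05e8\u05d5\u05ea";

        System.out.println(isRightToLeft(english)); // false
        System.out.println(isRightToLeft(hebrew));  // true
    }
}
```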

Testing Prior To Translation

  • Use static analysis of your code to locate problems. At a bare minimum, leverage the tools built into your IDE. (Eclipse users can go to Window > Preferences > Java > Compiler > Errors/Warnings and check for non-externalised strings.)
  • Smoke test by simulating translation. It isn't difficult to parse a resource file and replace strings with a pseudo-translated version that doubles the length and inserts funky characters. You don't have to speak a language to use a foreign operating system. Modern systems should let you log in as a foreign user with translated strings and foreign locale. If you are familiar with your OS, you can figure out what does what without knowing a single word of the language.
  • Keyboard maps and character set references are very useful.
  • Virtualisation would be very useful here.
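The pseudo-translation step mentioned in the list above can be sketched in a few lines of Java. This is one possible scheme, not a standard one: vowels are swapped for accented letters, `{...}` placeholders are left alone so format calls keep working, the text is padded to roughly double length, and brackets mark the ends so truncation is easy to spot. All names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class PseudoTranslator {

    private static final String PLAIN = "aeiouAEIOU";
    // Accented counterparts of the vowels above, in the same order.
    private static final String ACCENTED =
            "\u00e1\u00e9\u00ed\u00f3\u00fa\u00c1\u00c9\u00cd\u00d3\u00da";

    static String pseudo(String source) {
        StringBuilder out = new StringBuilder("[");
        int depth = 0; // greater than 0 while inside a {placeholder}
        for (int i = 0; i < source.length(); i++) {
            char c = source.charAt(i);
            if (c == '{') depth++;
            int index = (depth == 0) ? PLAIN.indexOf(c) : -1;
            out.append(index >= 0 ? ACCENTED.charAt(index) : c);
            if (c == '}' && depth > 0) depth--;
        }
        // Pad with one filler character per source character to roughly double the length.
        for (int i = 0; i < source.length(); i++) out.append('~');
        return out.append(']').toString();
    }

    // Apply the transformation to every value of a key/value resource map.
    static Map<String, String> pseudoAll(Map<String, String> resources) {
        Map<String, String> result = new LinkedHashMap<>();
        resources.forEach((key, value) -> result.put(key, pseudo(value)));
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> resources = new LinkedHashMap<>();
        resources.put("file.save", "Save {0}");
        resources.put("file.print", "Print");
        System.out.println(pseudoAll(resources));
    }
}
```

Running the UI against such a resource set tends to expose hard-coded strings (they stay unaccented), clipped labels, and code paths that mangle non-ASCII characters, all before a translator is paid.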

Non-technical Issues

Sometimes you have to be sensitive to cultural differences (offence or incomprehension may result). A mistake you often see is the use of flags as a visual cue for choosing a website language or geography. Unless you want your software to declare sides in global politics, this is a bad idea. If you were French and offered the option for English with St. George's flag (the flag of England is a red cross on a white field), this might result in confusion for many English speakers - assume similar issues will arise with foreign languages and countries. Icons need to be vetted for cultural relevance. What does a thumbs-up or a green tick mean? Language should be relatively neutral - addressing users in a particular manner may be acceptable in one region, but considered rude in another.

Resources

C++ and Java programmers may find the ICU website useful: http://www.icu-project.org/
