Java web application i18n


Question

I've been given the (rather daunting) task of introducing i18n to a J2EE web application using the 2.3 servlet specification. The application is very large and has been in active development for over 8 years.

As such, I want to get things right the first time so I can limit the amount of time I need to trawl through JSPs, JavaScript files, servlets and wherever else, replacing hard-coded strings with values from message bundles.

There is no framework being used here. How can I approach supporting i18n? Note that I want to have a single JSP per view that loads text from (a) properties file(s) and not a different JSP for each supported locale.

I guess my main question is whether I can set the locale somewhere in the 'backend' (i.e. read the locale from the user profile on login and store the value in the session) and then expect that the JSP pages will be able to correctly load the specified string from the correct properties file (i.e. from messages_fr.properties when the locale is French) as opposed to adding logic to find the correct locale in each JSP.

Any ideas how I can approach this?

Solution

There are a lot of things that need to be taken care of when internationalizing an application:

Locale detection

The very first thing you need to think about is detecting the end user's Locale. Depending on what you want to support, it might be easy or a bit complicated.

  1. As you surely know, web browsers tend to send the end user's preferred language via the HTTP Accept-Language header. Accessing this information in a Servlet might be as simple as calling request.getLocale(). If you are not planning to support any fancy locale detection workflow, you might just stick to this method.
  2. If you have user profiles in your application, you might want to add a Preferred Language and a Preferred Formatting Locale to them. In that case you would need to switch the Locale after the user logs in.
  3. You might want to support URL-based language switching (for example: http://deutsch.example.com/ or http://example.com?lang=de). You would need to set a valid Locale based on the URL information - this could be done in various ways (e.g. a URL Filter).
  4. You might want to support language switching (selecting it from a drop-down menu, or something), however I would not recommend it (unless it is combined with point 3).
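
As a rough sketch of how the options above could be combined in plain Java (no framework): the helper below tries a URL parameter first, then a profile setting, then the raw Accept-Language value, and finally falls back to English. The class name, the list of supported locales and the priority order are my own assumptions, not anything the servlet API prescribes, and it needs Java 8 or later for Locale.lookup.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

final class LocaleResolverSketch {
    // Hypothetical list of locales the application ships translations for.
    static final List<Locale> SUPPORTED =
            Arrays.asList(Locale.ENGLISH, Locale.FRENCH, Locale.GERMAN);
    static final Locale FALLBACK = Locale.ENGLISH;

    /** Best supported locale for a language-priority string, or null if none matches. */
    static Locale match(String ranges) {
        if (ranges == null || ranges.trim().isEmpty()) {
            return null;
        }
        try {
            // Accepts a single tag ("de") as well as a full Accept-Language value.
            return Locale.lookup(Locale.LanguageRange.parse(ranges), SUPPORTED);
        } catch (IllegalArgumentException malformed) {
            return null; // e.g. a hand-edited ?lang= value
        }
    }

    /** Assumed priority: URL parameter, then user profile, then browser header. */
    static Locale resolve(String urlLang, String profileLang, String acceptLanguage) {
        for (String candidate : new String[] { urlLang, profileLang, acceptLanguage }) {
            Locale locale = match(candidate);
            if (locale != null) {
                return locale;
            }
        }
        return FALLBACK;
    }

    public static void main(String[] args) {
        System.out.println(resolve(null, null, "fr-CA,fr;q=0.8,en;q=0.5"));
        System.out.println(resolve("de", "fr", "en-US"));
    }
}
```

In a servlet filter you would feed it request.getParameter("lang"), the value from the user profile and request.getHeader("Accept-Language"), and store the result in the session.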

The JSTL approach could be sufficient if you just want to support the first method or if you are not planning to add any additional dependencies (like Spring Framework).

While we are on Spring Framework: it has quite a few nice features that you can use both to detect the Locale (like CookieLocaleResolver, AcceptHeaderLocaleResolver, SessionLocaleResolver and LocaleChangeInterceptor) and to externalize strings and format messages (see the spring:message tag).
Spring Framework would allow you to implement all the scenarios above quite easily, and that is why I prefer it.
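
Whichever way the locale ends up being chosen, the properties file that gets picked follows a fallback chain. The sketch below only lists the candidate file names that java.util.ResourceBundle would consider for a given locale; it does not load anything. The base name "messages" comes from the question, the class and method names are made up, and note that ResourceBundle.getBundle may additionally try the JVM default locale before the base file (JSTL applies its own, similar lookup).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.ResourceBundle;

final class BundleLookupSketch {
    /** Candidate properties files, most specific first. */
    static List<String> candidateFiles(String baseName, Locale locale) {
        ResourceBundle.Control control =
                ResourceBundle.Control.getControl(ResourceBundle.Control.FORMAT_PROPERTIES);
        List<String> files = new ArrayList<String>();
        for (Locale candidate : control.getCandidateLocales(baseName, locale)) {
            files.add(control.toBundleName(baseName, candidate) + ".properties");
        }
        return files;
    }

    public static void main(String[] args) {
        System.out.println(candidateFiles("messages", Locale.CANADA_FRENCH));
        System.out.println(candidateFiles("messages", Locale.FRENCH));
    }
}
```

So a user with a French (Canada) locale still gets messages_fr.properties when no Canada-specific file exists, which is why one JSP per view is enough.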

String externalization

This is something that should be easy, right? Well, mostly it is - just use the appropriate tag. The only problem you might face is externalizing client-side (JavaScript) texts. There are several possible approaches, but let me mention these two:

  1. Have each JSP write an array of translated strings (with the message tag) and simply access that array in client code. This is the easier approach but less maintainable - you would need to actually write the right strings from the right pages (the ones that actually reference your client-side scripts). I have done that before and believe me, this is not something you want to do in a large application (but it is probably the best solution for a small one).
  2. Another approach may sound hard in principle but it is actually way easier to handle in the future. The idea is to centralize strings on the client side (move them to some common JavaScript file). After that you would need to implement your own Servlet that returns this script upon request - with its contents translated. You won't be able to use JSTL here; you would need to get the strings from Resource Bundles directly.
    It is much easier to maintain, because you would have one central point to add translatable strings.
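
A minimal sketch of the core of that second approach, assuming you are happy to expose the strings as one global JavaScript object: it turns a ResourceBundle into a script body that a servlet could write to the response. The MESSAGES variable name, the class name and the bundleOf helper (which only exists so the example runs without properties files on the classpath) are my inventions.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Locale;
import java.util.PropertyResourceBundle;
import java.util.ResourceBundle;
import java.util.TreeSet;

final class JsMessagesSketch {
    /** Escapes a value so it is safe inside a double-quoted JavaScript string. */
    static String escapeJs(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '"' || c == '\\') {
                out.append('\\').append(c);
            } else if (c < 0x20 || c > 0x7e || c == '<' || c == '>' || c == '&') {
                // Unicode-escape control, non-ASCII and markup characters.
                out.append(String.format(Locale.ROOT, "\\u%04x", (int) c));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    /** Renders every key of the bundle as a property of one JavaScript object. */
    static String toJavaScript(ResourceBundle bundle) {
        StringBuilder js = new StringBuilder("var MESSAGES = {");
        String separator = "";
        for (String key : new TreeSet<String>(bundle.keySet())) { // sorted for stable output
            js.append(separator)
              .append('"').append(escapeJs(key)).append("\": \"")
              .append(escapeJs(bundle.getString(key))).append('"');
            separator = ", ";
        }
        return js.append("};").toString();
    }

    /** Demo helper: builds a bundle from properties-file text held in memory. */
    static ResourceBundle bundleOf(String propertiesText) {
        try {
            return new PropertyResourceBundle(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toJavaScript(bundleOf("ok=OK\ncancel=Anuluj\n")));
    }
}
```

In the real servlet you would load the bundle with ResourceBundle.getBundle for the user's locale, set the response content type to JavaScript with a UTF-8 charset, and write the returned string.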

Concatenations

I hate to say it, but concatenations are really painful from a localizability perspective. They are very common and most people don't realize it.

So what is concatenation then?

In principle, each English sentence needs to be translated into the target language. The problem is that a correctly translated message often uses a different word order than its English counterpart (so English "Security policy" is translated to Polish "Polityka bezpieczeństwa" - "policy" is "polityka" - the order is different).

OK, but how is it related to software?

In a web application you could concatenate Strings like this:

String securityPolicy = "Security " + "policy";

or like this:

<p><span style="font-weight:bold">Security</span> policy</p>

Both would be problematic. In the first case you would need to use the MessageFormat.format() method and externalize the strings as (for example) "Security {0}" and "policy"; in the latter you would externalize the contents of the whole paragraph (p tag), including the span tag. I know that this is painful for translators, but there is really no better way.
Sometimes you have to use dynamic content in your paragraph - the JSTL fmt:message tag with nested fmt:param tags will help you here as well (it works like MessageFormat on the backend side).
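
To make the word-order point concrete, here is a small sketch with java.text.MessageFormat. The two patterns stand in for entries in messages_en.properties and messages_pl.properties; the message itself, the map and the class name are made up for illustration, and the Polish wording should be checked by a translator.

```java
import java.text.MessageFormat;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

final class MessageOrderSketch {
    // Stand-ins for one key in two properties files.
    static final Map<String, String> PATTERNS = new HashMap<String, String>();
    static {
        PATTERNS.put("en", "File {0} was deleted by {1}.");
        // The translator is free to swap the placeholders.
        PATTERNS.put("pl", "Użytkownik {1} usunął plik {0}.");
    }

    static String fileDeleted(Locale locale, String file, String user) {
        String pattern = PATTERNS.get(locale.getLanguage());
        if (pattern == null) {
            pattern = PATTERNS.get("en"); // assumed fallback language
        }
        return new MessageFormat(pattern, locale).format(new Object[] { file, user });
    }

    public static void main(String[] args) {
        System.out.println(fileDeleted(Locale.ENGLISH, "report.txt", "anna"));
        System.out.println(fileDeleted(Locale.forLanguageTag("pl"), "report.txt", "anna"));
    }
}
```

The calling code passes the arguments in one fixed order; only the pattern decides where they land, which is exactly what concatenation cannot do.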

Layouts

In a localized application, it often happens that translated strings are way longer than the English ones. The result could look very ugly. Somehow, you would need to fix the styles. There are again two approaches:

  1. Fix issues as they happen by adjusting common styles (and pray that it won't break other languages). This is very painful to maintain.
  2. Implement a CSS localization mechanism. The mechanism I am talking about should serve a default, language-independent CSS file plus per-language overrides. The idea is to have an override CSS file for each language, so that you can adjust layouts on demand (just for one language). In order to do that, the default CSS file, as well as the JSP pages, must not contain the !important keyword next to any style definitions. If you really have to use it, move those definitions to a language-based en.css - this would allow other languages to modify them.

Culture specific issues

Avoid using graphics, colors and sounds that might be specific to Western culture. If you really need them, please provide a means of localizing them. Avoid direction-sensitive graphics (as this would be a problem when you try to localize to, say, Arabic or Hebrew). Also, do not assume that the whole world uses the same digits (this is not true for Arabic, for example).

Dates and time zones

Handling dates and times in Java is, to say the least, not easy. If you are not going to support anything other than the Gregorian calendar, you could stick to the built-in Date and Calendar classes. You can use JSTL fmt:timeZone, fmt:formatDate and fmt:parseDate to correctly set the time zone, format and parse dates in JSP.

I strongly suggest using fmt:formatDate like this:

<fmt:formatDate value="${someController.somedate}"
    type="both"
    timeZone="${someController.detectedTimeZone}"
    dateStyle="default"
    timeStyle="default" />

It is important to convert the date and time to the valid (end user's) time zone. It is also quite important to convert it to an easily understandable format - that is why I recommend the default formatting style.
BTW, time zone detection is not easy, as web browsers are not nice enough to send anything. Instead, you can either add a preferred time zone field to the user preferences (if you have them) or get the current time zone offset from the web browser via a client-side script (see the Date object's methods).
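
For the server side of that idea, here is a hedged sketch in plain Java: it turns the value JavaScript's Date.getTimezoneOffset() reports (minutes, positive when the browser is behind UTC) into a TimeZone and formats a date with the default styles. The class and method names are mine, and a bare offset knows nothing about daylight saving rules, so a zone stored in the user profile is the more reliable option.

```java
import java.text.DateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.SimpleTimeZone;
import java.util.TimeZone;

final class UserTimeSketch {
    /** JavaScript reports UTC minus local time in minutes, so the sign is flipped. */
    static TimeZone fromBrowserOffset(int jsOffsetMinutes) {
        return new SimpleTimeZone(-jsOffsetMinutes * 60 * 1000, "browser-offset");
    }

    /** Date and time in the default style for the given locale and zone. */
    static String format(Date when, Locale locale, TimeZone zone) {
        DateFormat format =
                DateFormat.getDateTimeInstance(DateFormat.DEFAULT, DateFormat.DEFAULT, locale);
        format.setTimeZone(zone);
        return format.format(when);
    }

    public static void main(String[] args) {
        Date now = new Date();
        // -540 is what a browser nine hours ahead of UTC would report.
        System.out.println(format(now, Locale.US, fromBrowserOffset(-540)));
        System.out.println(format(now, Locale.FRANCE, TimeZone.getTimeZone("UTC")));
    }
}
```

The exact text produced depends on the locale data shipped with your JDK, so do not rely on it in tests or parse it back by hand.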

Numbers and currencies

Numbers as well as currencies should be converted to the local format. This is done in a similar way to formatting dates (parsing is also done similarly):

<fmt:formatNumber value="1.21" type="currency"/> 
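
Outside of JSP (for example in the JavaScript-serving servlet mentioned earlier) the same job can be done with java.text.NumberFormat. This is only a sketch with made-up class and method names. One thing to be aware of: getCurrencyInstance uses the currency of the locale you pass in, so it changes the presentation and does not convert amounts between currencies.

```java
import java.text.NumberFormat;
import java.text.ParseException;
import java.util.Locale;

final class NumberFormatSketch {
    static String number(double value, Locale locale) {
        return NumberFormat.getNumberInstance(locale).format(value);
    }

    static String currency(double amount, Locale locale) {
        return NumberFormat.getCurrencyInstance(locale).format(amount);
    }

    /** Parses user input written with the locale's separators. */
    static double parse(String text, Locale locale) {
        try {
            return NumberFormat.getNumberInstance(locale).parse(text).doubleValue();
        } catch (ParseException e) {
            throw new IllegalArgumentException("Not a number for " + locale + ": " + text, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(number(1234.5, Locale.US));
        System.out.println(number(1234.5, Locale.GERMANY));
        System.out.println(currency(1.21, Locale.US));
        System.out.println(parse("1.234,5", Locale.GERMANY));
    }
}
```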

Compound messages

You have already been warned not to concatenate strings. Instead you would probably use MessageFormat. However, I must state that you should minimize the use of compound messages. That is because the grammar rules of the target language are quite commonly different, so translators might need not only to re-order the sentence (this would be resolved by using placeholders and MessageFormat.format()), but also to translate the whole sentence in a different way depending on what will be substituted. Let me give you some examples:

// Multiple plural forms
English: 4 viruses found.
Polish: Znaleziono 4 wirusy. **OR** Znaleziono 5 wirusów.

// Conjugation
English: Program encountered incorrect character | Application encountered incorrect character.
Polish: Program napotkał nieznaną literę | Aplikacja napotkała nieznaną literę.
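
To show what the plural example means for code, here is a sketch that picks a whole sentence per plural category instead of gluing a number onto one fixed string. The rule is the usual one for Polish whole numbers (one / few / many); the class and method names are invented, and the singular sentence is my own addition that a translator should confirm. A library with built-in plural rules (ICU4J, for instance) would be the more robust choice in a real application.

```java
import java.text.MessageFormat;
import java.util.Locale;

final class PolishPluralSketch {
    static final Locale POLISH = Locale.forLanguageTag("pl");

    /** Plural category for a non-negative whole number in Polish. */
    static String category(long n) {
        long mod10 = n % 10;
        long mod100 = n % 100;
        if (n == 1) {
            return "one";
        }
        if (mod10 >= 2 && mod10 <= 4 && (mod100 < 12 || mod100 > 14)) {
            return "few";
        }
        return "many";
    }

    /** Chooses a complete sentence per category; each would be its own bundle key. */
    static String virusesFound(long n) {
        String pattern;
        switch (category(n)) {
            case "one":
                pattern = "Znaleziono {0} wirusa."; // assumed singular form
                break;
            case "few":
                pattern = "Znaleziono {0} wirusy.";
                break;
            default:
                pattern = "Znaleziono {0} wirusów.";
                break;
        }
        return new MessageFormat(pattern, POLISH).format(new Object[] { n });
    }

    public static void main(String[] args) {
        for (long n : new long[] { 1, 4, 5, 12, 22 }) {
            System.out.println(virusesFound(n));
        }
    }
}
```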

Character encoding

If you are planning to localize into languages that the ISO 8859-1 code page does not cover, you would need to support Unicode - the best way is to set the page encoding to UTF-8. I have seen people doing it like this:

<%@ page contentType="text/html; charset=UTF-8" %>

I must warn you: this is not enough. You actually need this declaration:

<%@page pageEncoding="UTF-8" %>

Also, you would still need to declare encoding in the page header, just to be on the safe side:

<META http-equiv="Content-Type" content="text/html;charset=UTF-8"> 
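
If you want to see what goes wrong when the declared and the actual encoding disagree, this tiny sketch reproduces it outside of JSP: it encodes a Polish word as UTF-8 and then reads the bytes back as ISO 8859-1, the way a misconfigured page would. The class and method names are made up, and the non-ASCII letter is written as a Unicode escape so the example does not depend on the encoding of the source file.

```java
import java.nio.charset.StandardCharsets;

final class EncodingSketch {
    /** UTF-8 bytes wrongly decoded as ISO 8859-1: every non-ASCII letter turns into two characters. */
    static String misread(String text) {
        return new String(text.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);
    }

    /** The same bytes decoded with the encoding they were written in. */
    static String roundTrip(String text) {
        return new String(text.getBytes(StandardCharsets.UTF_8), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String word = "liter\u0119"; // the last word of the Polish example above
        System.out.println(word.length() + " characters originally");
        System.out.println(misread(word).length() + " characters after the wrong decoding");
        System.out.println(roundTrip(word).equals(word));
    }
}
```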

The list I gave you is not exhaustive, but it is a good starting point. Good luck :)
