How to convert UTF-8 to US-ASCII in Java


Question

We have a system where customers, mainly European, enter texts (in UTF-8) that have to be distributed to different systems. Most of those systems accept UTF-8, but we must now also distribute the texts to a US system that only accepts 7-bit US-ASCII.

So now we need to translate all European characters to the nearest US-ASCII equivalent. Are there any Java libraries that help with this task?

Right now we have just started adding entries to a translation table, where Å (Swedish AA) -> A and so on. Where we don't find a match for an entered character, we log it, replace it with a question mark, and try to fix the table for the next release. This seems very inefficient, and somebody else must have done something similar before.
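For reference, the table-based approach described above could look roughly like the following. This is only a minimal sketch: the class name `AsciiTable`, the `unmapped` list, and the four table entries are illustrative assumptions, not the actual code from our system, and a real version would write unmapped characters to a proper log instead of collecting them in a list.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper illustrating the translation-table approach.
final class AsciiTable {
    // Deliberately tiny example table keyed by Unicode code point.
    private static final Map<Integer, String> TABLE = new HashMap<>();
    static {
        TABLE.put((int) '\u00C5', "A"); // Å
        TABLE.put((int) '\u00E5', "a"); // å
        TABLE.put((int) '\u00C9', "E"); // É
        TABLE.put((int) '\u00E9', "e"); // é
    }

    // Code points that had no table entry; stands in for "log it and fix it next release".
    static final List<Integer> unmapped = new ArrayList<>();

    static String toAscii(String input) {
        StringBuilder out = new StringBuilder(input.length());
        input.codePoints().forEach(cp -> {
            if (cp < 128) {
                out.append((char) cp);       // already 7-bit ASCII
            } else if (TABLE.containsKey(cp)) {
                out.append(TABLE.get(cp));   // known approximation
            } else {
                unmapped.add(cp);            // remember the miss
                out.append('?');             // placeholder in the output
            }
        });
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(toAscii("\u00C5re caf\u00E9")); // prints "Are cafe"
        System.out.println(toAscii("\u00F1"));             // prints "?"
    }
}
```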

Recommended answer

The uni2ascii program is written in C, but you could probably convert it to Java with little effort. It contains a large table of approximations (implicitly, in its switch-case statements).

Be aware that there are no universally accepted approximations: Germans want you to replace Ä with AE, while Finns and Swedes prefer just A. Your example of Å isn't obvious either: Swedes would probably just drop the ring and use A, but Danes and Norwegians might prefer the historically more correct AA.
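One way to accommodate these differences is to keep a default table plus per-language overrides and pick the override from the customer's locale. The sketch below is an assumption about how that might be structured, not something taken from uni2ascii: the class name `LocaleAwareAscii`, the choice of language codes, and the handful of entries are all hypothetical and would need to be extended for real use.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper: default approximations with per-language overrides.
final class LocaleAwareAscii {
    // Used when the language has no override for a character.
    private static final Map<Character, String> DEFAULTS = new HashMap<>();
    // Overrides keyed by the language code returned by Locale.getLanguage().
    private static final Map<String, Map<Character, String>> OVERRIDES = new HashMap<>();

    static {
        DEFAULTS.put('\u00C4', "A"); // Ä: Finnish/Swedish preference
        DEFAULTS.put('\u00C5', "A"); // Å: Swedish preference

        Map<Character, String> german = new HashMap<>();
        german.put('\u00C4', "AE"); // Ä -> AE
        OVERRIDES.put("de", german);

        Map<Character, String> danishNorwegian = new HashMap<>();
        danishNorwegian.put('\u00C5', "AA"); // Å -> AA
        OVERRIDES.put("da", danishNorwegian);
        OVERRIDES.put("nb", danishNorwegian);
        OVERRIDES.put("no", danishNorwegian);
    }

    static String toAscii(String input, Locale locale) {
        Map<Character, String> overrides = OVERRIDES.getOrDefault(
                locale.getLanguage(), Collections.<Character, String>emptyMap());
        StringBuilder out = new StringBuilder(input.length());
        // Iterates over UTF-16 chars, so a character outside the BMP yields two '?'.
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c < 128) {
                out.append(c);
            } else if (overrides.containsKey(c)) {
                out.append(overrides.get(c));
            } else {
                out.append(DEFAULTS.getOrDefault(c, "?"));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(toAscii("\u00C4rger", Locale.GERMAN));             // prints "AErger"
        System.out.println(toAscii("\u00C5re", Locale.forLanguageTag("sv"))); // prints "Are"
        System.out.println(toAscii("\u00C5rhus", Locale.forLanguageTag("da"))); // prints "AArhus"
    }
}
```

Whether the locale should come from the customer, the text, or the receiving system is a design decision the table itself cannot answer.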
