Remove non-ASCII characters from String in Java

Problem description

I have a URI that contains non-ASCII characters, like:

http://www.abc.de/qq/qq.ww?MIval=typo3_bsl_int_Smtliste&p_smtbez=Schmalbl�ttrigeSomerzischeruchtanb

How can I remove "�" from this URI?

Recommended answer

I'm guessing that the source of the URL is more at fault. Perhaps you're fixing the wrong problem? Removing "strange" characters from a URI might give it an entirely different meaning.

With that said, you may be able to remove all of the non-ASCII characters (along with ASCII control characters) with a simple string replacement that keeps only the printable range U+0020 to U+007E:

String fixed = original.replaceAll("[^\\x20-\\x7e]", "");
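As a minimal sketch of the replacement above, the following self-contained class wraps the pattern in a helper and applies it to a shortened version of the query string. The class name `AsciiFilter`, the method name `stripNonPrintableAscii`, and the sample input are illustrative assumptions, not part of the original answer; the sample also assumes the "�" arrives as a U+FFFD character.

```java
public class AsciiFilter {

    // Hypothetical helper: keeps only printable ASCII (U+0020 to U+007E)
    // and drops everything else, including tabs, newlines and U+FFFD.
    public static String stripNonPrintableAscii(String original) {
        return original.replaceAll("[^\\x20-\\x7e]", "");
    }

    public static void main(String[] args) {
        // Assumed sample input: the replacement character is written as an escape.
        String original = "p_smtbez=Schmalbl\uFFFDttrige";
        System.out.println(stripNonPrintableAscii(original));
    }
}
```

Because the character is deleted rather than restored, the result is a different query value from the one the sender intended, which is the risk the answer warns about.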

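Or you can extend the kept range to all characters that are not four-byte UTF-8 characters (U+0000 to U+FFFF), so that only supplementary characters are removed. Note that if "�" is a literal U+FFFD replacement character, it lies inside this range, so this pattern will keep it: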

String fixed = original.replaceAll("[^\\u0000-\\uFFFF]", "");
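To make the effect of this wider range concrete, here is a small sketch that runs the pattern on a string containing both a supplementary character (an emoji, stored as a surrogate pair) and U+FFFD. The class name `BmpFilter`, the method name `stripSupplementary`, and the sample string are assumptions made for illustration only.

```java
public class BmpFilter {

    // Hypothetical helper: removes only code points above U+FFFF
    // (the characters that need four bytes in UTF-8).
    public static String stripSupplementary(String original) {
        return original.replaceAll("[^\\u0000-\\uFFFF]", "");
    }

    public static void main(String[] args) {
        // Assumed sample: 'a', an emoji (U+1F600), 'b', U+FFFD, 'c'.
        String original = "a\uD83D\uDE00b\uFFFDc";
        String fixed = stripSupplementary(original);

        // The emoji is gone, but the replacement character is still there.
        System.out.println(fixed.length());
        System.out.println(fixed.indexOf('\uFFFD') >= 0);
    }
}
```

If the goal is specifically to drop "�", the narrower printable-ASCII pattern (or a direct replacement of that one character) is the one that does so.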
