Detect (or best guess of) incoming string encoding in Java
Problem description
I was wondering if there are known methods to detect (or give a best guess of) the encoding of a particular string in Java.
I know that you always need some additional metadata to tell what the encoding is, and that there are best practices and so on, but in my situation I need to give the best approximation.
A solution, or a pointer, for programmatically distinguishing between UTF-8 and UTF-16 is also welcome.
UTF-8 encoding should be easy to verify:
"UTF-8 strings can be fairly reliably recognized as such by a simple heuristic algorithm." (from Wikipedia)
Take a look at this site to see the algorithm
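As a rough illustration of that heuristic, here is a minimal sketch that uses only the standard java.nio.charset API. It is not the algorithm from the linked site. It assumes the input is available as raw bytes (once the data is a Java String, the original encoding is already lost), and the class and method names (EncodingGuess, isValidUtf8, guess) are made up for this example. It checks for a byte order mark first, then tries a strict UTF-8 decode.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not a library API.
class EncodingGuess {

    // Returns true if the bytes decode as UTF-8 with no malformed sequences.
    static boolean isValidUtf8(byte[] bytes) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            decoder.decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    // Best guess only: looks for a BOM, then falls back to strict UTF-8 validation.
    // Assumption: UTF-32 and legacy single-byte encodings are not considered here.
    static String guess(byte[] b) {
        if (b.length >= 3 && (b[0] & 0xFF) == 0xEF
                && (b[1] & 0xFF) == 0xBB && (b[2] & 0xFF) == 0xBF) {
            return "UTF-8";      // UTF-8 BOM
        }
        if (b.length >= 2 && (b[0] & 0xFF) == 0xFE && (b[1] & 0xFF) == 0xFF) {
            return "UTF-16BE";   // big-endian BOM
        }
        if (b.length >= 2 && (b[0] & 0xFF) == 0xFF && (b[1] & 0xFF) == 0xFE) {
            return "UTF-16LE";   // little-endian BOM
        }
        return isValidUtf8(b) ? "UTF-8" : "unknown";
    }

    public static void main(String[] args) {
        byte[] utf8 = "h\u00e9llo".getBytes(StandardCharsets.UTF_8);
        byte[] latin1 = "h\u00e9llo".getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(guess(utf8));
        System.out.println(guess(latin1));
    }
}
```

Keep in mind the limits of this approach: pure ASCII input is valid UTF-8 and valid in many other encodings, and UTF-16 text without a BOM can also pass the UTF-8 check when it contains only ASCII-range characters, because the interleaved zero bytes are legal in UTF-8. Counting zero bytes at even versus odd offsets is one possible extra hint for BOM-less UTF-16, but it remains a guess.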