Detect (or best guess of) incoming string encoding in Java


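Question

I was wondering if there are known methods to detect (or give a best guess of) the encoding of a particular string in Java.

I know that you always need some additional metadata to tell what the encoding is, and that there are best practices etc., but in the situation I'm in, I need to give the best approximation.

A solution, or a pointer, to programmatically distinguishing between UTF-8 and UTF-16 is also welcome.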

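Solution

The UTF-8 encoding should be easy to verify:

"UTF-8 strings can be fairly reliably recognized as such by a simple heuristic algorithm." (from Wikipedia)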
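As a rough illustration of that heuristic, here is a minimal sketch that checks whether a byte array is well-formed UTF-8 by running it through a strict decoder. It works on the raw bytes, since a Java `String` has already been decoded by the time you hold one. The class and method names (`Utf8Check`, `looksLikeUtf8`) are made up for this example and are not from the original answer.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class Utf8Check {
    /** Returns true if the bytes decode as UTF-8 without any malformed sequence. */
    public static boolean looksLikeUtf8(byte[] bytes) {
        try {
            // REPORT makes the decoder throw instead of silently substituting U+FFFD.
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        byte[] utf8 = "h\u00e9llo".getBytes(StandardCharsets.UTF_8);
        byte[] latin1 = "h\u00e9llo".getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(looksLikeUtf8(utf8));   // true
        System.out.println(looksLikeUtf8(latin1)); // false
    }
}
```

A `true` result is still only a best guess: pure ASCII input is valid UTF-8 too, and short inputs in other encodings can happen to form valid UTF-8 sequences.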




Take a look at this site to see the algorithm.
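For telling UTF-8 and UTF-16 apart, one common first step (not part of the original answer) is to look for a byte order mark at the start of the data. The sketch below assumes you have the raw bytes; `BomSniffer` and `charsetFromBom` are hypothetical names chosen for illustration. Many files carry no BOM at all, in which case this returns an empty result and some other heuristic is needed.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Optional;

public class BomSniffer {
    /** Returns the charset implied by a leading byte order mark, if one is present. */
    public static Optional<Charset> charsetFromBom(byte[] b) {
        // UTF-8 BOM: EF BB BF
        if (b.length >= 3 && (b[0] & 0xFF) == 0xEF
                && (b[1] & 0xFF) == 0xBB && (b[2] & 0xFF) == 0xBF) {
            return Optional.of(StandardCharsets.UTF_8);
        }
        // UTF-16 big-endian BOM: FE FF
        if (b.length >= 2 && (b[0] & 0xFF) == 0xFE && (b[1] & 0xFF) == 0xFF) {
            return Optional.of(StandardCharsets.UTF_16BE);
        }
        // UTF-16 little-endian BOM: FF FE
        // (FF FE 00 00 is also the UTF-32LE BOM; this sketch does not distinguish it.)
        if (b.length >= 2 && (b[0] & 0xFF) == 0xFF && (b[1] & 0xFF) == 0xFE) {
            return Optional.of(StandardCharsets.UTF_16LE);
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Java's "UTF-16" charset writes a big-endian BOM when encoding.
        byte[] utf16 = "hi".getBytes(StandardCharsets.UTF_16);
        byte[] utf8NoBom = "hi".getBytes(StandardCharsets.UTF_8);
        System.out.println(charsetFromBom(utf16));
        System.out.println(charsetFromBom(utf8NoBom));
    }
}
```

When no BOM is found, a strict UTF-8 decode is a reasonable next check before falling back to guessing UTF-16.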


