How to auto detect a String encoding?

Problem description

I have a String that contains values encoded in some scheme such as Base64.

The problem is that I really don't know whether it's actually Base64 (it contains A-Z, a-z, 0-9, +, /), so it could be some other encoding that I'm not familiar with.

Is there a tool or an online site to which I can send an encoded input and which can tell me which encoding it uses?

NOTE: I'm not asking how to know whether my String is UTF-8 or ISO-8859-1 or something like that. What I need to know is which scheme my String is encoded with.

EDIT:

To be more clear,

I need something that takes an input like: 23Nzi4lUE4qlc+Pmc3blWMS1Irmgo3i8UTQHhoL7VyzqpEV/i9bDhoiteZ0a7/TqcVSkrXR89V2Yj7tEFDGJx4gvWEBs= (this is the encoded String that I have).

The output should be the type of the encoded String and its decoding, like:

Base64 -> "Big yellow fish is swimming in the tube."

Maybe there is some program that takes an input and tries to decode it with a list of encoding types (Base64 etc.). The output doesn't really matter because it's the user's decision whether it's good or not.

Solution

This site handles base64 de/encoding.

Since Base64 is just one instance of a class of encoding schemes (specifically, encoding a bit stream as a base_<n> number), you probably will never fare better than testing for just a couple of standard encoding schemes.

You either check the well-formedness of the input under each encoding scheme, or try to decode it and see whether an error is thrown, using a web service or your own code.
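The "try to decode and see whether an error is thrown" approach could look roughly like the sketch below. It is only an illustration under stated assumptions: the candidate list (Base64, Base32, hex) and the helper name `try_decodings` are made up for this example, and a real tool would probably test more schemes.

```python
import base64
from typing import Callable, Dict


def _decode_base64(text: str) -> bytes:
    # validate=True makes the decoder reject characters outside the
    # Base64 alphabet instead of silently discarding them.
    return base64.b64decode(text, validate=True)


def _decode_base32(text: str) -> bytes:
    # The standard Base32 alphabet is A-Z and 2-7 (upper case only here).
    return base64.b32decode(text)


def _decode_hex(text: str) -> bytes:
    return bytes.fromhex(text)


# Hypothetical candidate list; extend it with whatever schemes you expect.
CANDIDATES: Dict[str, Callable[[str], bytes]] = {
    "base64": _decode_base64,
    "base32": _decode_base32,
    "hex": _decode_hex,
}


def try_decodings(encoded: str) -> Dict[str, bytes]:
    """Return the decoded bytes for every candidate scheme that accepts the input."""
    results: Dict[str, bytes] = {}
    for name, decode in CANDIDATES.items():
        try:
            results[name] = decode(encoded)
        except ValueError:
            # binascii.Error is a subclass of ValueError, so this also
            # covers malformed Base64 and Base32 input.
            continue
    return results


if __name__ == "__main__":
    for sample in ("SGVsbG8=", "48656c6c6f", "!!!"):
        print(sample, "->", try_decodings(sample))
```

A successful decode only means the input is well-formed for that scheme; whether the decoded bytes are meaningful is still for the user to judge.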

In (possibly pathological) cases there will be more than one encoding scheme for which a given octet stream will successfully decode.
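As a small illustration of such an ambiguous case, the string `DEADBEEF` (chosen here purely as an example) is well-formed hex, Base32 and Base64 at the same time, and each scheme yields different bytes:

```python
import base64

ambiguous = "DEADBEEF"

# All three calls succeed because every character is in each alphabet
# and the length (8) is acceptable to each scheme.
decoded = {
    "hex": base64.b16decode(ambiguous),
    "base32": base64.b32decode(ambiguous),
    "base64": base64.b64decode(ambiguous, validate=True),
}

for scheme, raw in decoded.items():
    print(f"{scheme}: {len(raw)} bytes -> {raw.hex()}")
```

Nothing in the input itself says which of the three results is the intended one, which is why some knowledge about the data provider is needed to decide.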

Best practice would be to redirect the effort you would invest in setting up the verification toward committing the data provider to one (or 'a few') encoding(s) in the first place (this won't always be possible, of course).
