C#: Cycle through encodings


Question

I am reading files in various formats and languages, and I am currently using a small encoding library to attempt to detect the proper encoding (http://www.codeproject.com/KB/recipes/DetectEncoding.aspx).

It's pretty good, but it still misses occasionally (on multilingual files, for example).

Most of my potential users have very little understanding of encoding (the best I can hope for is "it has something to do with characters") and are very unlikely to be able to choose the right encoding in a list, so I would like to let them cycle through different encodings until the right one is found just by clicking on a button.

Display problems? Click here to try a different encoding! (Well that's the concept anyway)

What would be the best way to implement something like that?

Edit: Looks like I didn't express myself clearly enough. By "cycling through the encoding", I don't mean "how to loop through encodings?"

What I meant was "how to let the user try different encodings in sequence without reloading the file?"

The idea is more like this: let's say the file is loaded with the wrong encoding, so some strange characters are displayed. The user would click a "Next encoding" or "Previous encoding" button, and the string would be converted using a different encoding. The user just needs to keep clicking until the right encoding is found (whatever encoding looks good to the user will do fine). As long as the user can click "Next", they have a reasonable chance of solving the problem.

What I have found so far involves converting the string to bytes using the current encoding, then converting the bytes to the next encoding, converting those bytes into chars, then converting the char into a string... Doable, but I wonder if there isn't an easier way to do that.

For instance, if there was a method that would read a string and return it using a different encoding, something like "render(string, encoding)".
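The round trip described above could be sketched roughly as follows. `Render` is a hypothetical name borrowed from the wording of the question, not an existing .NET method, and the sample text is made up for illustration. The sketch assumes the wrongly chosen encoding maps every byte to a distinct character, as ISO-8859-1 does. With an encoding such as ASCII, undecodable bytes are replaced while decoding, so the original bytes can no longer be recovered from the string.

```csharp
using System;
using System.Text;

static class EncodingRoundTrip
{
    // Hypothetical helper: turn the displayed text back into bytes using the
    // encoding it was decoded with, then decode those bytes with another one.
    public static string Render(string text, Encoding current, Encoding next)
    {
        byte[] bytes = current.GetBytes(text);
        return next.GetString(bytes);
    }

    static void Main()
    {
        Encoding latin1 = Encoding.GetEncoding("iso-8859-1");

        // Simulate a UTF-8 file that was wrongly displayed as ISO-8859-1.
        byte[] fileBytes = Encoding.UTF8.GetBytes("caf\u00E9");
        string wrong = latin1.GetString(fileBytes);

        // Re-render the displayed string as UTF-8.
        string repaired = Render(wrong, latin1, Encoding.UTF8);
        Console.WriteLine(repaired == "caf\u00E9");
    }
}
```

Because this only works when the first decoding was lossless, keeping the original bytes around is the more robust option.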


Thank you very much for your answers!

Recommended answer

Read the file as bytes, then use the Encoding.GetString method.

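        // Requires: using System; using System.Text;
        // Read the raw bytes once; path is the file to display.
        byte[] data = System.IO.File.ReadAllBytes(path);

        // Decode the same bytes with each candidate encoding.
        Console.WriteLine(Encoding.UTF8.GetString(data));
        Console.WriteLine(Encoding.UTF7.GetString(data)); // UTF7 is marked obsolete in .NET 5 and later
        Console.WriteLine(Encoding.ASCII.GetString(data));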

So you only have to load the file once. Every encoding can be applied to the original bytes of the file. The user can pick the one that looks correct, and you can use the result of Encoding.GetEncoding(...).GetString(data) for further processing.
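One way to wire this up to "Next encoding" and "Previous encoding" buttons is sketched below. `EncodingCycler` is a hypothetical helper class, not part of .NET, and the candidate list is an arbitrary assumption. The sample bytes are built in memory so the example is self-contained; in the real application they would come from File.ReadAllBytes.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper: keeps the raw bytes and decodes them again on each click.
sealed class EncodingCycler
{
    private readonly byte[] _data;
    private readonly IReadOnlyList<Encoding> _candidates;
    private int _index;

    public EncodingCycler(byte[] data, IReadOnlyList<Encoding> candidates)
    {
        if (data == null) throw new ArgumentNullException(nameof(data));
        if (candidates == null || candidates.Count == 0)
            throw new ArgumentException("At least one encoding is required.", nameof(candidates));
        _data = data;
        _candidates = candidates;
    }

    // Encoding currently used for display.
    public Encoding Current => _candidates[_index];

    // The original bytes decoded with the current encoding.
    public string Text => Current.GetString(_data);

    // Handler for a "Next encoding" button; wraps around at the end of the list.
    public string Next()
    {
        _index = (_index + 1) % _candidates.Count;
        return Text;
    }

    // Handler for a "Previous encoding" button; wraps around at the start.
    public string Previous()
    {
        _index = (_index - 1 + _candidates.Count) % _candidates.Count;
        return Text;
    }
}

static class Program
{
    static void Main()
    {
        // Stand-in for File.ReadAllBytes(path): UTF-8 bytes of a short sample.
        byte[] data = Encoding.UTF8.GetBytes("caf\u00E9");

        var cycler = new EncodingCycler(data, new Encoding[]
        {
            Encoding.ASCII,
            Encoding.GetEncoding("iso-8859-1"),
            Encoding.UTF8,
        });

        // Show the text under each candidate, as repeated clicks on "Next" would.
        Console.WriteLine(cycler.Current.WebName + ": " + cycler.Text);
        for (int i = 0; i < 2; i++)
        {
            string text = cycler.Next();
            Console.WriteLine(cycler.Current.WebName + ": " + text);
        }
    }
}
```

Since the bytes are never modified, a wrong guess costs nothing: every click decodes from the original data, so no information is lost between attempts.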
