How do I convert a possibly Windows-1252 ('ANSI') encoded uploaded file to UTF-8 in .NET?

Views: 140 · Posted: 2016/6/7 21:22:35 · Tags: c# asp.net vb.net unicode

Question

I've got a FileUpload control in an ASP.NET web page that is used to upload a file. The file's contents (as a stream) are processed in the C# code-behind and later output on the page using HtmlEncode.

However, some of this output is being mangled. Specifically, the symbol '£' is output as the Unicode replacement character (U+FFFD). I've tracked this down to the input file, which is Windows-1252 ('ANSI') encoded.

Now the questions are:

How do I determine whether the file is encoded as Windows-1252 or UTF-8? It could be either.

How do I convert it to UTF-8 if it is Windows-1252, preserving symbols such as £?

I've looked online but cannot find a satisfactory answer.

Recommended answer

If you know that the file is encoded as Windows-1252, you can open it with a StreamReader and pass the proper encoding. That is:

StreamReader reader = new StreamReader("filename", Encoding.GetEncoding("Windows-1252"), true);

The "true" (the detectEncodingFromByteOrderMarks argument) tells the reader to set the encoding from the byte order mark at the front of the file, if there is one. Otherwise it reads the file as Windows-1252.
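As a rough illustration of that fallback behavior, the sketch below reads a stream with Windows-1252 as the default and reports which code page the reader ended up using. The class and method names (BomAwareReader, ReadWithFallback) are made up for this example, and the RegisterProvider line assumes .NET Core 3.0 or later; on .NET Framework, code page 1252 is built in and that line can be dropped.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper, not part of the original answer.
public static class BomAwareReader
{
    // Reads the whole stream, honoring a BOM if present and
    // falling back to Windows-1252 otherwise. 'codePage' reports
    // the code page the reader actually used.
    public static string ReadWithFallback(Stream input, out int codePage)
    {
        // Assumption: running on .NET Core 3.0+ / .NET 5+, where code page 1252
        // must be registered first. Not needed on .NET Framework.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        using (var reader = new StreamReader(input, Encoding.GetEncoding(1252), true))
        {
            string text = reader.ReadToEnd();
            // CurrentEncoding is only reliable after the first read.
            codePage = reader.CurrentEncoding.CodePage;
            return text;
        }
    }
}
```

Note that CurrentEncoding should be inspected only after something has been read, since BOM detection happens on the first read.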

You can then read the file and, if you want to convert it to UTF-8, write the text to a file that you've opened with that encoding.
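A minimal sketch of that read-then-write conversion, working on streams so it could be fed an upload control's content stream (for example FileUpload.FileContent) instead of a file name. Windows1252Converter and ConvertToUtf8 are illustrative names, the choice to write UTF-8 without a BOM is an assumption, and the RegisterProvider call assumes .NET Core 3.0 or later (remove it on .NET Framework).

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper, not part of the original answer.
public static class Windows1252Converter
{
    // Reads 'input' as Windows-1252 (unless a BOM says otherwise)
    // and writes the same text to 'output' as UTF-8.
    public static void ConvertToUtf8(Stream input, Stream output)
    {
        // Assumption: .NET Core 3.0+ / .NET 5+. Not needed on .NET Framework.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        Encoding windows1252 = Encoding.GetEncoding(1252);

        // UTF8Encoding(false) writes no BOM; pass true to emit one.
        using (var reader = new StreamReader(input, windows1252, true, 4096, leaveOpen: true))
        using (var writer = new StreamWriter(output, new UTF8Encoding(false), 4096, leaveOpen: true))
        {
            char[] buffer = new char[4096];
            int read;
            // Copy decoded characters in chunks; the writer re-encodes them as UTF-8.
            while ((read = reader.Read(buffer, 0, buffer.Length)) > 0)
            {
                writer.Write(buffer, 0, read);
            }
        }
    }
}
```

With this, the single Windows-1252 byte 0xA3 ('£') comes out as the two UTF-8 bytes 0xC2 0xA3.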

The short answer to your first question is that there isn't a 100% satisfactory way to determine the encoding of a file. If there is a byte order mark (BOM), you can determine which flavor of Unicode it is, but without a BOM you're stuck with using heuristics to determine the encoding.

I don't have a good reference for the heuristics. You might search for "how does Notepad determine the character set". I recall seeing something about that some time ago.
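One commonly used heuristic, which is not from the answer above and is offered only as a sketch, is to try decoding the bytes as strict UTF-8 and fall back to Windows-1252 when that fails. It relies on the fact that most non-ASCII Windows-1252 text (a lone 0xA3 for '£', for instance) is not valid UTF-8. The names EncodingGuesser and DecodeUtf8OrWindows1252 are hypothetical, and the RegisterProvider call again assumes .NET Core 3.0 or later.

```csharp
using System;
using System.Text;

// Hypothetical helper illustrating a "strict UTF-8, else 1252" heuristic.
public static class EncodingGuesser
{
    public static string DecodeUtf8OrWindows1252(byte[] bytes)
    {
        // throwOnInvalidBytes: true makes the decoder throw instead of
        // silently substituting U+FFFD for malformed sequences.
        var strictUtf8 = new UTF8Encoding(encoderShouldEmitUTF8Identifier: false,
                                          throwOnInvalidBytes: true);
        try
        {
            string text = strictUtf8.GetString(bytes);
            // GetString keeps a leading BOM as U+FEFF; drop it if present.
            return text.Length > 0 && text[0] == '\uFEFF' ? text.Substring(1) : text;
        }
        catch (DecoderFallbackException)
        {
            // Not valid UTF-8: assume Windows-1252.
            // Assumption: .NET Core 3.0+ / .NET 5+. Not needed on .NET Framework.
            Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
            return Encoding.GetEncoding(1252).GetString(bytes);
        }
    }
}
```

This is still a guess: it does not recognize UTF-16, and a Windows-1252 file whose bytes happen to form valid UTF-8 would be read as UTF-8, although that is uncommon for real text.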

In practice, I've found the following to work for most of what I do:

StreamReader reader = new StreamReader("filename", Encoding.Default, true);

Most of the files I read are ones that I create with .NET's StreamWriter, and they're in UTF-8 with a BOM. Other files that I get are typically written by some tool that doesn't understand Unicode or code pages, and I just treat them as a stream of bytes, which Encoding.Default handles well. (On .NET Framework, Encoding.Default is the system's ANSI code page, which is Windows-1252 on many Western-locale machines; on .NET Core and later it is UTF-8.)
