Convert ANSI (Windows 1252) to UTF8 in C#


Question

I've asked this before, in a roundabout manner, here on Stack Overflow, and want to get it right this time. How do I convert ANSI (codepage 1252) to UTF-8 while preserving the special characters? (I am aware that UTF-8 supports a larger character set than ANSI, but it is okay if I can preserve all the characters that ANSI supports and substitute the rest with a ? or something.)

Why I want to convert ANSI → UTF-8

I am basically writing a program that splits vCard files (VCF) into individual files, each containing a single contact. I've noticed that Nokia and Sony Ericsson phones save the backup VCF file in UTF-8 (without BOM), but Android saves it in ANSI (1252). And God knows what formats the other phones save them in!

So, my questions are:

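  1. Is there an industry standard for the character encoding of vCard files?
  2. Which would be easier for solving my problem: converting ANSI to UTF8 (and/or the other way round), or trying to detect which encoding the input file has and notifying the user about it?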

tl;dr Need to know how to convert the character encoding from (ANSI / UTF8) to (UTF8 / ANSI) while preserving all special characters.

Recommended answer

VCF is encoded in UTF-8, as demanded by the spec in chapter 3.4. You need to take this seriously; the format would be utterly useless if that wasn't cast in stone. If you are seeing some Android app mangling accented characters, then work from the assumption that this is a bug in that app. Or, more likely, that it got bad info from somewhere else. Your attempt to correct the encoding would then cause more problems, because your version of the card will never match the original.
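If you only want to find out whether a given file actually follows that UTF-8 requirement (the "detect the encoding" option from the question), one possible approach is a strict UTF-8 decode that throws on invalid byte sequences. This is a minimal sketch, not something the answer itself recommends; the class and method names (Utf8Check, IsValidUtf8) are made up for illustration. Passing the check only means the bytes are well-formed UTF-8, and plain ASCII always passes.

```csharp
using System;
using System.Text;

public static class Utf8Check
{
    // Returns true if the bytes form well-formed UTF-8.
    // Hypothetical helper name; not part of the original answer.
    public static bool IsValidUtf8(byte[] bytes)
    {
        // Second argument (throwOnInvalidBytes: true) makes decoding strict:
        // invalid sequences raise an exception instead of becoming U+FFFD.
        var strictUtf8 = new UTF8Encoding(false, true);
        try
        {
            strictUtf8.GetString(bytes);
            return true;
        }
        catch (DecoderFallbackException)
        {
            return false;
        }
    }
}
```

A file that fails this check is not valid UTF-8, so it may be in a legacy code page such as 1252, though the check alone cannot tell you which one.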

You convert from 1252 to UTF-8 by decoding with Encoding.GetEncoding(1252).GetString(), passing in a byte[], and then encoding the resulting string as UTF-8. Do not ever try to write code that reads a string and whacks it into a byte[] so you can use the conversion method; that just makes the encoding problems a lot worse. In other words, you'd need to read the file with FileStream, not StreamReader. But again, avoid fixing other people's problems.
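The conversion described above can be sketched roughly as follows. The class and method names (Cp1252ToUtf8, Convert1252ToUtf8, ConvertFile) are illustrative and not from the answer. The sketch uses File.ReadAllBytes to get the raw bytes, which serves the same purpose as reading through a FileStream. The RegisterProvider call is an assumption about the runtime: on .NET Core and .NET 5+ code page 1252 is not available until the code pages provider is registered, while on the classic .NET Framework that line can usually be dropped.

```csharp
using System;
using System.IO;
using System.Text;

public static class Cp1252ToUtf8
{
    // Decodes Windows-1252 bytes into a string, then re-encodes it as UTF-8.
    public static byte[] Convert1252ToUtf8(byte[] cp1252Bytes)
    {
        // Assumption: running on .NET Core / .NET 5+, where legacy code pages
        // must be registered before Encoding.GetEncoding(1252) will work.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        Encoding cp1252 = Encoding.GetEncoding(1252);
        string text = cp1252.GetString(cp1252Bytes);

        // UTF8Encoding(false) writes no byte order mark.
        return new UTF8Encoding(false).GetBytes(text);
    }

    // Reads the raw bytes of a file (no StreamReader involved) and writes
    // the UTF-8 version to a second file.
    public static void ConvertFile(string inputPath, string outputPath)
    {
        byte[] raw = File.ReadAllBytes(inputPath);
        File.WriteAllBytes(outputPath, Convert1252ToUtf8(raw));
    }
}
```

Since every character in code page 1252 has a Unicode equivalent, going in this direction should not need any ? substitution; replacement only becomes an issue when converting UTF-8 text back to 1252.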
