Microsoft Excel mangles diacritics in .csv files?

Question

I am programmatically exporting data (using PHP 5.2) into a .csv test file.
Example data: Numéro 1 (note the accented e). The data is UTF-8 (no prepended BOM).

When I open this file in MS Excel, it is displayed as NumÃ©ro 1.

I am able to open this in a text editor (UltraEdit), which displays it correctly. UE reports that the character is decimal 233.

How can I export text data in a .csv file so that MS Excel will render it correctly, preferably without forcing the use of the import wizard or non-default wizard settings?

Recommended answer

A correctly formatted UTF8 file can have a Byte Order Mark as its first three octets. These are the hex values 0xEF, 0xBB, 0xBF. These octets serve to mark the file as UTF8 (since they are not relevant as "byte order" information). If this BOM does not exist, the consumer/reader is left to infer the encoding type of the text. Readers that are not UTF8 capable will read the bytes as some other encoding such as Windows-1252 and display the characters ï»¿ at the start of the file.
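To make the byte-level picture concrete, here is a minimal Python sketch (Python is used only for illustration; the question itself is about PHP). It prepends the three BOM octets to the sample string and decodes the result twice: once as a UTF8-aware reader would, and once as a reader that assumes Windows-1252. The variable names are made up for this example.

```python
import codecs

# The UTF-8 BOM is the three octets EF BB BF.
bom = codecs.BOM_UTF8
print(bom.hex())

sample = "Numéro 1"
raw = bom + sample.encode("utf-8")

# A UTF-8-aware reader (Python's "utf-8-sig" codec) strips the BOM.
as_utf8 = raw.decode("utf-8-sig")

# A reader that assumes Windows-1252 turns every byte into its own
# character, so the BOM and the two bytes of "é" (C3 A9) become visible.
as_cp1252 = raw.decode("cp1252")

print(as_utf8)
print(as_cp1252)
```

The second decode is the kind of misreading described in the question: one accented character stored as two bytes shows up as two unrelated characters.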

There is a known bug where Excel, upon opening UTF8 CSV files via file association, assumes that they are in a single-byte encoding, disregarding the presence of the UTF8 BOM. This cannot be fixed by any system default codepage or language setting. The BOM will not clue in Excel - it just won't work. (A minority report claims that the BOM sometimes triggers the "Import Text" wizard.) This bug appears to exist in Excel 2003 and earlier. Most reports (amidst the answers here) say that this is fixed in Excel 2007 and newer.

Note that you can always* correctly open UTF8 CSV files in Excel using the "Import Text" wizard, which allows you to specify the encoding of the file you're opening. Of course, this is much less convenient.

Readers of this answer are most likely in a situation where they don't particularly need to support Excel < 2007, but are sending raw UTF8 text to Excel, which is misinterpreting it and sprinkling the text with Ã and other similar Windows-1252 characters. Adding the UTF8 BOM is probably your best and quickest fix.
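A minimal sketch of the BOM fix, again in Python for illustration: the `utf-8-sig` codec writes the three BOM octets before the first character. The helper name `write_csv_with_bom` and the file name `export.csv` are hypothetical. In PHP the same idea would be to write the bytes `"\xEF\xBB\xBF"` to the file before the first row, which is not tested here.

```python
import csv
import os
import tempfile


def write_csv_with_bom(path, rows):
    """Write rows as a UTF-8 CSV file that starts with a BOM (hypothetical helper)."""
    # "utf-8-sig" emits EF BB BF before the first character.
    # newline="" lets the csv module control line endings itself.
    with open(path, "w", encoding="utf-8-sig", newline="") as handle:
        csv.writer(handle).writerows(rows)


path = os.path.join(tempfile.mkdtemp(), "export.csv")
write_csv_with_bom(path, [["id", "label"], [1, "Numéro 1"]])

# Read the file back as raw bytes to confirm the BOM is in place.
with open(path, "rb") as handle:
    data = handle.read()
print(data[:3].hex())
```

Whether Excel then opens the file correctly on a double-click depends on the Excel version, as described above.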

If you are stuck with users on older versions of Excel, and Excel is the only consumer of your CSVs, you can work around this by exporting UTF16 instead of UTF8. Excel 2000 and 2003 will open these correctly on a double-click. (Some other text editors can have issues with UTF16, so you may have to weigh your options carefully.)
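The UTF16 workaround could look roughly like the Python sketch below. It builds the CSV text in memory and encodes it with the `utf-16` codec, which prepends a BOM in the machine's native byte order. The helper name `csv_as_utf16` is hypothetical, and the answer does not say which delimiter or byte order older Excel versions expect, so treat those choices as assumptions.

```python
import csv
import io


def csv_as_utf16(rows):
    """Return rows as UTF-16 encoded CSV bytes (hypothetical helper)."""
    buffer = io.StringIO(newline="")
    csv.writer(buffer).writerows(rows)
    # The "utf-16" codec prepends a BOM (FF FE or FE FF, depending on
    # the platform's byte order) so readers can detect the encoding.
    return buffer.getvalue().encode("utf-16")


payload = csv_as_utf16([["id", "label"], [1, "Numéro 1"]])
print(payload[:2].hex())
```

The resulting bytes can be written to a file opened in binary mode or sent as a download body.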

* Except when you can't: (at least) the import wizard in Excel 2011 for Mac does not actually always work with all encodings, regardless of what you tell it. </anecdotal-evidence> :)
