CSV: how to include double-byte characters


Question

I have to generate a CSV file that includes double-byte characters (Chinese, Japanese). The CSV file opens and the text reads correctly when I use a text editor,

but the generated CSV file shows garbage text when opened in Excel. What did I miss?

Recommended answer

Unfortunately, you did not miss anything. It is Microsoft that cannot handle CSV files containing Unicode properly when you simply open them with Excel.

When Excel saves a CSV file, it does not use a Unicode encoding; by default it uses a legacy, non-Unicode encoding that depends on the Office language version. Not only is Unicode not the default, even though it is the state of the art in the 21st century, it is not even possible to choose Unicode when saving CSV from Excel. The only file format that can save Unicode is Unicode Text (*.txt), but that is a tab-delimited text format rather than CSV.

Likewise, when Excel opens a CSV file, it does not assume the file is Unicode. Instead it assumes the same default encoding it would use when saving CSV. That is why garbage characters appear if the CSV contains Unicode.
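To make the effect concrete, here is a minimal sketch of what happens when UTF-8 bytes are decoded with a legacy code page. The function name `as_misread` is hypothetical, and `cp1252` is only an assumed stand-in for whatever default encoding a given Office installation uses.

```python
def as_misread(text, assumed_encoding="cp1252"):
    """Return text as it would look if its UTF-8 bytes were decoded
    with a different (wrong) encoding.

    cp1252 is an assumption for illustration; the encoding Excel
    actually picks depends on the Office language version.
    """
    raw = text.encode("utf-8")
    # errors="replace" keeps the sketch from raising on bytes that
    # have no mapping in the assumed code page.
    return raw.decode(assumed_encoding, errors="replace")
```

Each Japanese or Chinese character takes three bytes in UTF-8, so a three-character string such as "日本語" comes back as nine unrelated characters, which is the garbage seen in Excel. Plain ASCII text survives unchanged, which is why the problem often goes unnoticed until non-Latin data appears.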

There is one exception. If the CSV is UTF-8 encoded, there is a UTF-8 BOM at the beginning of the file, and the delimiter is the default delimiter, then Excel can open the CSV properly.
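As a rough sketch of how such a file could be produced, the following uses Python's standard `csv` module with the `utf-8-sig` codec, which prepends the UTF-8 BOM. The function name `write_csv_for_excel` and its `delimiter` parameter are hypothetical names chosen for this example, not part of the original answer.

```python
import csv


def write_csv_for_excel(path, rows, delimiter=","):
    """Write rows to a UTF-8 CSV file that starts with a BOM.

    The comma default is an assumption; the delimiter Excel treats
    as "default" can differ by regional settings.
    """
    # "utf-8-sig" writes the bytes EF BB BF before the first character.
    # newline="" lets the csv module control line endings itself.
    with open(path, "w", encoding="utf-8-sig", newline="") as f:
        writer = csv.writer(f, delimiter=delimiter)
        writer.writerows(rows)
```

For example, `write_csv_for_excel("names.csv", [["name", "city"], ["山田太郎", "東京"], ["王小明", "北京"]])` writes a file whose first three bytes are the BOM. If Excel on a particular machine still puts everything into one column, the delimiter is the likely cause rather than the encoding.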

But there is also the Text Import Wizard. If you use it, you can set the encoding in step 1 under File origin; selecting 65001 : Unicode (UTF-8) gives UTF-8. This wizard should be able to import all CSV files properly.
