Saving CSV with UTF-16BE encoding in PHP
Question
I am trying to write a CSV file encoded as UTF-16BE from data in a MySQL database that is encoded in UTF-8.
My code is:
$f = fopen('file.csv', 'w');
$firstLineKeys = false;
// UTF-16BE BOM
fwrite($f, chr(254) . chr(255));
foreach ($lines as $line)
{
    $lineEncoded = [];
    foreach ($line as $key => $value)
    {
        $key = mb_convert_encoding($key, 'UTF-16BE', "auto");
        $value = mb_convert_encoding($value, 'UTF-16BE', "auto");
        $lineEncoded[$key] = $value;
    }
    if (empty($firstLineKeys))
    {
        $firstLineKeys = array_keys($lineEncoded);
        fputcsv($f, $firstLineKeys);
        $firstLineKeys = array_flip($firstLineKeys);
    }
    fputcsv($f, array_merge($firstLineKeys, $lineEncoded));
}
fclose($f);
When I open the file in OpenOffice, it tries to import it with the Unicode character set, but the fields are a mess. When I switch the import character set to UTF-8, it looks correct.
Any help is appreciated.
Recommended answer
$key = mb_convert_encoding($key, 'UTF-16BE', "auto");
(Are you sure you want BE? It is a rarely used encoding. What Windows calls "Unicode" is UTF-16LE.)
I would avoid using "auto" as the from_encoding. It is an unreliable bodge that will often produce the wrong result, especially on short strings. Since the input is apparently UTF-8, you should state that explicitly instead.
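As a minimal sketch of the explicit form, assuming the mbstring extension is loaded (the sample string and variable names are made up for illustration):

```php
<?php
// Hypothetical sample value; "\u{00EB}" is the UTF-8 encoding of "ë" (PHP 7+).
$value = "Zo\u{00EB}";

// Name the source encoding explicitly instead of relying on "auto".
$utf16 = mb_convert_encoding($value, 'UTF-16BE', 'UTF-8');

// Each of these characters becomes two bytes, high byte first.
echo bin2hex($utf16), "\n"; // 005a006f00eb
```

With the source encoding stated, the result no longer depends on what the detection heuristic guesses for a given string.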
fputcsv($f, array_merge($firstLineKeys, $lineEncoded));
Unfortunately, fputcsv can't write to a UTF-16-encoded file. It emits single-byte ASCII commas, quotes and newlines, so it only works for encodings that are ASCII supersets; in UTF-16 each of those characters needs two bytes. So if you want to use it, you have to write the whole file as UTF-8 and then transcode the whole file to UTF-16.
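One way to do the write-then-transcode approach is sketched below. It assumes the rows are associative arrays of UTF-8 strings that all share the same keys, and that the data is small enough to hold in memory; csvToUtf16be() and $rows are hypothetical names, not part of the original code.

```php
<?php
// Hypothetical helper: build the CSV as UTF-8 with fputcsv, then
// transcode the finished text to UTF-16BE in a single pass.
function csvToUtf16be(array $rows): string
{
    // php://temp is an in-memory stream (spills to disk when large).
    $tmp = fopen('php://temp', 'r+');

    if ($rows !== []) {
        // Header line taken from the keys of the first row.
        fputcsv($tmp, array_keys($rows[0]), ',', '"', '\\');
    }
    foreach ($rows as $row) {
        fputcsv($tmp, array_values($row), ',', '"', '\\');
    }

    rewind($tmp);
    $utf8 = stream_get_contents($tmp);
    fclose($tmp);

    // Prepend the UTF-16BE byte order mark (FE FF) to the transcoded text.
    return "\xFE\xFF" . mb_convert_encoding($utf8, 'UTF-16BE', 'UTF-8');
}

// Made-up sample data standing in for the database rows.
$rows = [
    ['name' => "Zo\u{00EB}", 'city' => "K\u{00F6}ln"],
    ['name' => 'Ana', 'city' => 'Lyon'],
];
$bytes = csvToUtf16be($rows);
```

The result can then be saved with file_put_contents('file.csv', $bytes). Because the delimiters are transcoded together with the data, every comma and newline ends up as a proper two-byte UTF-16 code unit.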
You might want to consider a different (or your own) CSV writer instead. As well as being annoying to use with non-ASCII encodings, fputcsv doesn't comply with the RFC standard for CSV files, so you can easily generate files that most CSV-consuming software can't load properly.
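A hand-rolled writer can be quite small. The sketch below follows the quoting rules of RFC 4180 (quote a field only when it contains a comma, a double quote, CR or LF; double any embedded quotes; end records with CRLF) and encodes each record to UTF-16BE as it is written. Both function names are hypothetical, input is assumed to be UTF-8, and error handling is omitted.

```php
<?php
// Hypothetical helper: format one record following RFC 4180 quoting rules.
function csvLine(array $fields): string
{
    $out = [];
    foreach ($fields as $field) {
        $field = (string) $field;
        // Quote only when needed; embedded quotes are doubled, not backslash-escaped.
        if (preg_match('/[",\r\n]/', $field)) {
            $field = '"' . str_replace('"', '""', $field) . '"';
        }
        $out[] = $field;
    }
    return implode(',', $out) . "\r\n";
}

// Hypothetical helper: write UTF-8 records to $path as UTF-16BE with a BOM.
function writeUtf16beCsv(string $path, array $records): void
{
    $f = fopen($path, 'wb');
    fwrite($f, "\xFE\xFF"); // UTF-16BE byte order mark
    foreach ($records as $record) {
        // Transcode the complete line so delimiters become two-byte units too.
        fwrite($f, mb_convert_encoding(csvLine($record), 'UTF-16BE', 'UTF-8'));
    }
    fclose($f);
}
```

For example, writeUtf16beCsv('file.csv', [['name', 'city'], ['Ana', 'Lyon']]) writes a header and one record. Unlike fputcsv, this writer has no backslash escape character, which is one of the usual sources of files that other CSV readers misparse.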
PHP's built-in CSV functions are essentially a complete waste of everyone's time.