Data gets garbled when writing to csv with fputcsv() / fgetcsv()
Problem description
There seems to be an encoding issue or bug in PHP with fputcsv() and fgetcsv().
The following PHP code:
$row_before = ['A', json_encode(['a', '\\', 'b']), 'B'];
print "\nBEFORE:\n";
var_export($row_before);
print "\n";
$fh = fopen($file = 'php://temp', 'rb+');
fputcsv($fh, $row_before);
rewind($fh);
$row_after = fgetcsv($fh);
print "\nAFTER:\n";
var_export($row_after);
print "\n\n";
fclose($fh);
Gives me this output:
BEFORE:
array (
  0 => 'A',
  1 => '["a","\\\\","b"]',
  2 => 'B',
)

AFTER:
array (
  0 => 'A',
  1 => '["a","\\\\',
  2 => 'b""]"',
  3 => 'B',
)
Clearly, the data is damaged along the way: originally the row had 3 cells, afterwards it has 4. The middle cell is split because the backslash in the data is also treated as the escape character.
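To see where the split comes from, it can help to look at the raw line that fputcsv() actually writes before fgetcsv() parses it again. The following is only a minimal sketch: the helper name raw_csv_line() is made up for illustration, and the backslash is passed explicitly as the escape character on the assumption that this matches the default behaviour described above.

```php
<?php
// Hypothetical helper (not part of PHP): returns the raw text that
// fputcsv() writes for one row, using a backslash as the escape character.
function raw_csv_line(array $row): string
{
    $fh = fopen('php://temp', 'rb+');
    // Delimiter, enclosure and escape are passed explicitly for clarity.
    fputcsv($fh, $row, ',', '"', '\\');
    rewind($fh);
    $line = stream_get_contents($fh);
    fclose($fh);
    return $line;
}

$row = ['A', json_encode(['a', '\\', 'b']), 'B'];

// Print the raw CSV line so the quoting around the backslashes is visible.
print raw_csv_line($row);
```

Inspecting the quote characters that directly follow the backslashes in the printed line should show whether they were doubled like the other quotes in that cell.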
See also https://3v4l.org/nc1oE, or, with explicit values for delimiter, enclosure and escape_char, https://3v4l.org/Svt7m.
Is there any way I can sanitize / escape my data before writing to CSV, to guarantee that the data read from the file will be exactly the same?
Is CSV a fully reversible format?
EDIT: The goal would be a mechanism to properly write and read ANY data as csv, so that after one round trip the data is still the same.
EDIT: I realize that I do not really understand the $escape_char parameter. See also the question "fgetcsv/fputcsv $escape parameter fundamentally broken". Maybe an answer to that would also bring us closer to a solution.
Answer
The culprit is that fputcsv() uses an escape character, which is a non-standard extension to CSV. (Well, as far as RFC 4180, updated by RFC 7111, can be regarded as a standard.) Basically, this escape character would have to be disabled, but in PHP versions before 7.4, passing an empty string as $escape to fputcsv() doesn't work; since PHP 7.4 an empty string is accepted and disables the escape mechanism. Usually, passing a NUL character should give the desired result; however, see https://3v4l.org/MlluN.
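As a rough sketch of what disabling the escape character could look like, the snippet below repeats the round trip from the question with an empty string as $escape for both fputcsv() and fgetcsv(). It assumes PHP 7.4 or later, where an empty string is accepted for that parameter; the helper name csv_round_trip() is made up for illustration and is not part of PHP.

```php
<?php
// Hypothetical helper (not part of PHP): writes one row to a temporary
// stream and reads it back, with the escape mechanism disabled.
// Assumption: PHP 7.4+, where '' is accepted as the $escape argument.
function csv_round_trip(array $row): array
{
    $fh = fopen('php://temp', 'rb+');
    // Same delimiter, enclosure and (empty) escape on both sides.
    fputcsv($fh, $row, ',', '"', '');
    rewind($fh);
    $result = fgetcsv($fh, 0, ',', '"', '');
    fclose($fh);
    return $result;
}

$row_before = ['A', json_encode(['a', '\\', 'b']), 'B'];
$row_after  = csv_round_trip($row_before);

// Compare the row read back with the row that was written.
var_dump($row_after === $row_before);
```

Note that fgetcsv() returns every cell as a string, so a strict comparison like this can only hold for rows that consist of strings to begin with; integers, floats or null would need their own encoding on top of this.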