Problem Replacing Literal String \r\n With Line Break in PHP
Question
I have a text file that contains the literal string \r\n. I want to replace it with an actual line break (\n).
I know that the regex /\\r\\n/ should match it (I have tested it in Reggy), but I cannot get it to work in PHP.
I have tried the following variations:
preg_replace("/\\\\r\\\\n/", "\n", $line);
preg_replace("/\\\\[r]\\\\[n]/", "\n", $line);
preg_replace("/[\\\\][r][\\\\][n]/", "\n", $line);
preg_replace("/[\\\\]r[\\\\]n/", "\n", $line);
If I try to replace only the backslash, it works properly. As soon as I add an r, it finds no matches.
The file I am reading is encoded as UTF-16.
I have also already tried using str_replace().
I now believe that the problem here is the character encoding of the file. I tried the following, and it did work:
$testString = "\\r\\n";
echo preg_replace("/\\\\r\\\\n/", "\n", $testString);
However, it does not work on the lines I am reading in from my file.
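The difference between the test string and the file lines can be reproduced in isolation. The following is a minimal sketch, assuming the file is UTF-16LE and that the mbstring extension is available; the sample text and variable names are made up for illustration. It shows that every ASCII character gains a zero byte in UTF-16, so the backslash and the r are no longer adjacent bytes, which is consistent with a lone backslash matching while \r does not.

```php
<?php
// Assumption: the file is UTF-16LE. Build a sample line the way it would sit on disk.
$utf8Line  = 'first\r\nsecond'; // single quotes: literal backslash, r, backslash, n
$utf16Line = mb_convert_encoding($utf8Line, 'UTF-16LE', 'UTF-8');

// Each ASCII character is now followed by a zero byte.
echo bin2hex($utf16Line), PHP_EOL;

// preg_replace() works on bytes here, so the pattern only matches the UTF-8 string.
var_dump(preg_match('/\\\\r\\\\n/', $utf8Line));  // int(1)
var_dump(preg_match('/\\\\r\\\\n/', $utf16Line)); // int(0)
```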
Recommended Answer
UTF-16 is the problem. If you're just working with the raw bytes, then you can use the full byte sequences for replacing:
$out = str_replace("\x00\x5c\x00\x72\x00\x5c\x00\x6e", "\x00\x0a", $in);
This assumes big-endian UTF-16. For little-endian, swap the zero bytes so that they come after the non-zero bytes:
$out = str_replace("\x5c\x00\x72\x00\x5c\x00\x6e\x00", "\x0a\x00", $in);
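If the byte order is not known in advance, one option is to look at the byte order mark (BOM) at the start of the data. The sketch below is only an illustration: `replaceLiteralCrLf` is a hypothetical helper name, it expects the whole file contents (the BOM appears only once, at the very start, so it will not be present on lines read individually), and it falls back to big-endian when no BOM is found, which is an assumption rather than a safe default for every file.

```php
<?php
// Hypothetical helper: choose the search/replace byte sequences from the BOM.
function replaceLiteralCrLf(string $in): string
{
    if (substr($in, 0, 2) === "\xFF\xFE") {
        // Little-endian: the non-zero byte comes first.
        return str_replace("\x5c\x00\x72\x00\x5c\x00\x6e\x00", "\x0a\x00", $in);
    }
    // Big-endian BOM (FE FF) or no BOM at all: assume big-endian.
    return str_replace("\x00\x5c\x00\x72\x00\x5c\x00\x6e", "\x00\x0a", $in);
}

// Usage with a small little-endian sample (requires mbstring to build it).
$in  = "\xFF\xFE" . mb_convert_encoding('a\r\nb', 'UTF-16LE', 'UTF-8');
$out = replaceLiteralCrLf($in);
echo bin2hex($out), PHP_EOL; // fffe61000a006200
```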
If that doesn't work, please post a byte dump of your input file so we can see what it actually contains.
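A byte dump of this kind can be produced in PHP itself. This is a rough sketch: `hexDump` is a hypothetical helper and `input.txt` is a placeholder path, not the asker's actual file. It prints the first 64 bytes as space-separated hex pairs, which is enough to see a BOM and whether the zero bytes come before or after the ASCII values.

```php
<?php
// Hypothetical helper: render raw bytes as space-separated hex pairs.
function hexDump(string $bytes): string
{
    return implode(' ', str_split(bin2hex($bytes), 2));
}

// Placeholder path; point this at the real input file.
$path = 'input.txt';
if (is_readable($path)) {
    // Read only the first 64 bytes of the file.
    echo hexDump(file_get_contents($path, false, null, 0, 64)), PHP_EOL;
}
```

In such a dump, a leading `ff fe` suggests little-endian and `fe ff` suggests big-endian; a literal backslash followed by r would appear as `5c 00 72 00` or `00 5c 00 72` respectively.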