Working with files and utf8 in PHP
Problem description
Let's say I have a file called foo.txt encoded in UTF-8:
    aoeu
    qjkx
    ñpyf
I want an array containing every line of that file (one line per index) that is made up only of the letters aoeuñpyf.
I wrote the following code (also encoded as utf8):
    $allowed_letters = array("a", "o", "e", "u", "ñ", "p", "y", "f");
    $lines = array();
    $f = fopen("foo.txt", "r");
    while (!feof($f)) {
        $line = fgets($f);
        foreach (preg_split("//", $line, -1, PREG_SPLIT_NO_EMPTY) as $letter) {
            if (!in_array($letter, $allowed_letters)) {
                $line = "";
            }
        }
        if ($line != "") {
            $lines[] = $line;
        }
    }
    fclose($f);
However, after that, the $lines array has only the aoeu line in it.
This seems to be because somehow the "ñ" in $allowed_letters is not the same as the "ñ" in foo.txt.
Also, if I print a "ñ" read from the file, a question mark appears, but if I print it like this: print "ñ"; it works.
How can I make it work?
Solution
If you are running Windows, the OS does not save files in UTF-8 by default, but in cp1251 (or something similar). You need to save the file as UTF-8 explicitly, or run each line through utf8_encode() (which assumes ISO-8859-1 input) before performing your check, i.e.:
    $line = utf8_encode(fgets($f));
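As a hedged sketch of that conversion step: utf8_encode() is deprecated as of PHP 8.2, so a variant built on mb_convert_encoding() may be preferable. The helper name to_utf8() and the Windows-1252 source encoding are assumptions for illustration (the answer only says "cp1251 (or something...)"), and the mbstring extension must be enabled.

```php
<?php
// Hypothetical helper: returns $line as UTF-8, converting only when needed.
// $from is an assumed source encoding; adjust it to what the file really uses.
function to_utf8(string $line, string $from = 'Windows-1252'): string
{
    // Valid UTF-8 is returned untouched to avoid double encoding.
    if (mb_check_encoding($line, 'UTF-8')) {
        return $line;
    }
    return mb_convert_encoding($line, 'UTF-8', $from);
}

$single_byte = "\xF1pyf";            // "ñpyf" as saved in a single-byte encoding
$utf8 = to_utf8($single_byte);       // "ñpyf" as UTF-8, i.e. "\xC3\xB1pyf"
```

Inside the loop this would be used as $line = to_utf8(fgets($f)); once fgets() is known to have returned a string.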
If you are sure that the file is UTF-8 encoded, is your PHP file also UTF-8 encoded?
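One way to check, sketched here under the assumption that the script below is itself saved as UTF-8, is to look at the bytes behind the literal. In UTF-8 the letter ñ occupies two bytes (C3 B1), so a byte-wise split never produces a single "ñ" element to compare against.

```php
<?php
$literal = "ñ";

// "c3b1" means this file is UTF-8; "f1" would indicate a single-byte encoding.
echo bin2hex($literal), "\n";

// A pattern without the u modifier is matched byte by byte,
// so a UTF-8 "ñ" comes back as two one-byte strings.
$pieces = preg_split("//", $literal, -1, PREG_SPLIT_NO_EMPTY);
echo count($pieces), "\n";

// Neither piece equals the two-byte "ñ", so in_array() cannot find it.
$found = in_array("ñ", $pieces, true);
var_dump($found);
```

Running the same bin2hex() call on a line read from foo.txt shows whether the file and the script agree on the encoding.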
If everything is UTF-8, then this is what you need:
    foreach (preg_split("//u", $line, -1, PREG_SPLIT_NO_EMPTY) as $letter) {
        // ...
    }
(append the u modifier to the pattern so the string is split into Unicode characters rather than bytes)
However, let me suggest a faster way to perform your check:
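    $allowed_letters = array("a", "o", "e", "u", "ñ", "p", "y", "f");
    $lines = array();
    $f = fopen("foo.txt", "r");
    // fgets() returns false at end of file, which ends the loop cleanly.
    while (($line = fgets($f)) !== false) {
        $line = rtrim($line);
        // Split into UTF-8 characters; str_split() would cut "ñ" into two bytes.
        $letters = preg_split("//u", $line, -1, PREG_SPLIT_NO_EMPTY);
        // Keep the line only if it is non-empty and every letter is allowed.
        if ($letters && count(array_intersect($letters, $allowed_letters)) == count($letters)) {
            $lines[] = $line;
        }
    }
    fclose($f);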
(To allow space characters as well, add " " to $allowed_letters and replace rtrim($line) with rtrim($line, "\r\n") so that only the line ending is stripped.)
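As an alternative sketch that is not part of the original answer, the whole check can be written as one anchored pattern with the u modifier, which avoids splitting the line at all. The helper name filter_lines() is hypothetical, and both the script and the input are assumed to be UTF-8.

```php
<?php
// Hypothetical helper: keeps the lines made up solely of the letters in $allowed.
function filter_lines(array $raw_lines, string $allowed = "aoeuñpyf"): array
{
    // preg_quote() escapes any regex metacharacters in the allowed set.
    $pattern = '/^[' . preg_quote($allowed, '/') . ']+$/u';
    $kept = array();
    foreach ($raw_lines as $line) {
        $line = rtrim($line, "\r\n");   // drop only the line ending
        if (preg_match($pattern, $line) === 1) {
            $kept[] = $line;
        }
    }
    return $kept;
}

// Sample input mirroring foo.txt; file("foo.txt") would supply the real lines.
$lines = filter_lines(array("aoeu\n", "qjkx\n", "ñpyf\n"));
// $lines is now array("aoeu", "ñpyf")
```

Passing "aoeuñpyf " (with a trailing space) as the second argument would accept spaces inside a line as well.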