Working with files and utf8 in PHP


Question

Let's say I have a file called foo.txt encoded in UTF-8:

aoeu  
qjkx
ñpyf

I want to get an array (one line per index) containing every line of that file that is made up only of the letters aoeuñpyf.

I wrote the following code (also encoded as utf8):

$allowed_letters=array("a","o","e","u","ñ","p","y","f");

$lines=array();
$f=fopen("foo.txt","r");
while(!feof($f)){
    $line=fgets($f);
    foreach(preg_split("//",$line,-1,PREG_SPLIT_NO_EMPTY) as $letter){
        if(!in_array($letter,$allowed_letters)){
            $line="";
        }
    }
    if($line!=""){
        $lines[]=$line;
    }
}
fclose($f);

However, after that, the $lines array just has the aoeu line in it.
This seems to be because somehow, the "ñ" in $allowed_letters is not the same as the "ñ" in foo.txt.
Also, if I print a "ñ" read from the file, a question mark appears, but if I print it directly with print "ñ";, it works.
How can I make it work?

Answer

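If you are running Windows, many editors do not save files in UTF-8 by default but in a legacy code page (Windows-1252, Windows-1251 or another, depending on the locale). You need to either save the file as UTF-8 explicitly or convert each line to UTF-8 before performing your check. For a file in ISO-8859-1 (Latin-1), utf8_encode() does the conversion; note that it is deprecated as of PHP 8.2, where mb_convert_encoding($line, "UTF-8", "ISO-8859-1") is the replacement. I.e.: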

$line=utf8_encode(fgets($f));
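
Converting blindly will corrupt a line that is already UTF-8, so it can help to check first. The following is a minimal sketch, not part of the original answer. It assumes the mbstring extension is loaded and that any non-UTF-8 input is Windows-1252; the helper name to_utf8 is hypothetical.

```php
<?php
// Hypothetical helper: returns $line as UTF-8.
// Assumption: anything that is not valid UTF-8 is in $fallback (Windows-1252 here).
function to_utf8(string $line, string $fallback = 'Windows-1252'): string
{
    if (mb_check_encoding($line, 'UTF-8')) {
        return $line; // already valid UTF-8, leave it untouched
    }
    return mb_convert_encoding($line, 'UTF-8', $fallback);
}

// "\xF1" is "ñ" in Windows-1252; "\xC3\xB1" is "ñ" in UTF-8.
$legacy = "\xF1pyf";
$utf8   = to_utf8($legacy);
var_dump($utf8 === "\xC3\xB1pyf");
```

In the loop from the question this would be used as $line = to_utf8(fgets($f));.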

If you are sure that the file is UTF-8 encoded, is your PHP file also UTF-8 encoded?

If everything is UTF-8, then this is what you need:

foreach(preg_split("//u",$line,-1,PREG_SPLIT_NO_EMPTY) as $letter){
   // ...
}

(The u modifier makes the pattern treat the string as UTF-8, so it is split into characters rather than bytes.)
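
To see why the modifier matters, here is a small sketch that splits the same string both ways. The "ñ" is written as its two UTF-8 bytes (\xC3\xB1) so that the result does not depend on how this snippet itself is saved.

```php
<?php
$line = "\xC3\xB1pyf"; // "ñpyf" in UTF-8: 5 bytes, 4 characters

// Without u, the pattern works on bytes, so "ñ" is cut into two pieces.
$bytes = preg_split("//", $line, -1, PREG_SPLIT_NO_EMPTY);

// With u, the pattern works on UTF-8 characters.
$chars = preg_split("//u", $line, -1, PREG_SPLIT_NO_EMPTY);

echo count($bytes), "\n"; // 5
echo count($chars), "\n"; // 4
```

Neither half of the byte-split "ñ" equals the two-byte string "ñ" in $allowed_letters, which is why the in_array() check in the question rejects the line.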

However, let me suggest a faster way to perform your check:

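$allowed_letters=array("a","o","e","u","ñ","p","y","f");

$lines=array();
$f=fopen("foo.txt","r");
// fgets() returns false at end of file, so test its result instead of feof()
while(($line=fgets($f))!==false){
    $line = rtrim($line);
    // Split on UTF-8 characters; str_split() works on bytes and would break "ñ"
    $letters = preg_split("//u", $line, -1, PREG_SPLIT_NO_EMPTY);
    // Keep the line only if it is non-empty and every letter is allowed
    if ($letters && count(array_intersect($letters, $allowed_letters)) == count($letters)) {
            $lines[] = $line;
    }
}
fclose($f);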

(To allow spaces as well, add " " to $allowed_letters and replace rtrim($line) with rtrim($line, "\r\n") so that only the line ending is stripped.)
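
As an alternative to splitting each line into letters, the whole check can be written as one anchored regular expression with a character class. This is a different technique from the one in the answer, offered only as a sketch. The function name filter_lines is hypothetical, and it takes an array of lines rather than a file handle so that the example is self-contained.

```php
<?php
// Hypothetical helper: keep only the lines made up entirely of allowed letters.
// "\xC3\xB1" is "ñ" in UTF-8, written as bytes so the file's own encoding does not matter.
function filter_lines(array $raw_lines, string $allowed = "aoeu\xC3\xB1pyf"): array
{
    // preg_quote() escapes characters that are special inside a pattern.
    // The u modifier makes the character class match UTF-8 characters, not bytes.
    $pattern = '/^[' . preg_quote($allowed, '/') . ']+$/u';
    $kept = array();
    foreach ($raw_lines as $line) {
        $line = rtrim($line); // drop the line ending and trailing spaces
        if (preg_match($pattern, $line) === 1) {
            $kept[] = $line;
        }
    }
    return $kept;
}

$result = filter_lines(array("aoeu  \n", "qjkx\n", "\xC3\xB1pyf\n"));
print_r($result);
```

For the file in the question, the call would be filter_lines(file("foo.txt")), assuming foo.txt is saved as UTF-8.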
