Problem writing UTF-8 encoded file in PHP

Question

I have a large file that contains world countries/regions that I'm separating into smaller files based on individual countries/regions. The original file contains entries like:

  EE.04 Järvamaa
  EE.05 Jõgevamaa
  EE.07 Läänemaa

However, when I extract that and write it to a new file, the text becomes:

  EE.04 JÃ¤rvamaa
  EE.05 JÃµgevamaa
  EE.07 LÃ¤Ã¤nemaa

To save my files I'm using the following code:

  mb_detect_encoding($text, "UTF-8") == "UTF-8" ? : $text = utf8_encode($text);
  $fp = fopen(MY_LOCATION, 'wb');
  fwrite($fp, $text);
  fclose($fp);

I tried saving the files with and without utf8_encode() and neither seems to work. How would I go about saving the original encoding (which is UTF8)?

Thank you!

Solution

First off, don't depend on mb_detect_encoding. It's not great at figuring out what the encoding is unless there's a bunch of encoding-specific byte sequences (meaning sequences that are invalid in other encodings).

Try just getting rid of the mb_detect_encoding line altogether.
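
One way to drop the detection step is to validate instead of detect, sketched here on the assumption that the mbstring extension is loaded. mb_check_encoding only answers whether the bytes are well-formed UTF-8. The helper name writeUtf8 and the choice to throw on invalid input are illustrative assumptions, not part of the original code.

```php
<?php
// Hypothetical helper: write $text unchanged if it is already well-formed UTF-8.
function writeUtf8(string $path, string $text): void
{
    // mb_check_encoding validates the bytes; it does not guess an encoding.
    if (!mb_check_encoding($text, 'UTF-8')) {
        throw new InvalidArgumentException(
            'Input is not valid UTF-8; convert it from its known source encoding first.'
        );
    }
    // file_put_contents writes the bytes as-is; no re-encoding happens here.
    if (file_put_contents($path, $text) === false) {
        throw new RuntimeException("Could not write $path");
    }
}
```

If the source file really is UTF-8, as the question states, this writes the extracted text byte for byte and the accented characters should survive.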

Oh, and utf8_encode turns a Latin-1 string into a UTF-8 string (not from an arbitrary charset to UTF-8, which is what you really want)... You want iconv, but you need to know the source encoding (and since you can't really trust mb_detect_encoding, you'll need to figure it out some other way).
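
To illustrate the point about utf8_encode, the sketch below uses mb_convert_encoding with an explicit ISO-8859-1 source, which performs the same Latin-1 to UTF-8 conversion (utf8_encode itself is deprecated as of PHP 8.2). It assumes the mbstring and iconv extensions are available, and the sample strings are made up for illustration. Applying the conversion to text that is already UTF-8 re-encodes each byte separately, which is consistent with the garbled output shown in the question.

```php
<?php
// Sample data (assumed for illustration): the same name in two encodings.
$utf8   = "J\u{F5}gevamaa"; // õ stored as the two UTF-8 bytes C3 B5
$latin1 = "J\xF5gevamaa";   // õ stored as the single Latin-1 byte F5

// Treating UTF-8 bytes as Latin-1 converts C3 and B5 one by one ("double encoding").
$doubleEncoded = mb_convert_encoding($utf8, 'UTF-8', 'ISO-8859-1');

// With a known source encoding, iconv converts the Latin-1 bytes correctly.
$converted = iconv('ISO-8859-1', 'UTF-8', $latin1);

// Hex dumps make the byte-level difference visible regardless of terminal encoding.
echo bin2hex($utf8), PHP_EOL;
echo bin2hex($doubleEncoded), PHP_EOL;
echo bin2hex($converted), PHP_EOL;
```

Comparing the hex dumps shows that the double-encoded string is longer than the original, while the iconv result matches the UTF-8 bytes exactly.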

Or you can try using iconv with an empty input encoding: $str = iconv('', 'UTF-8', $str); (which may or may not work)...
