utf8_encode function purpose


Question

Suppose that I'm saving my files encoded as UTF-8.

Within a PHP script, a string will be compared:

$string = "ぁ";
$string = utf8_encode($string); // Do I need this step?
if (preg_match('/ぁ/u', $string)) {
    // Do something if it matches...
}

Is that string really UTF-8 without the utf8_encode() function? If you save your files as UTF-8, is this function unnecessary?

Recommended answer

As the manual entry for utf8_encode states, it converts an ISO-8859-1 encoded string to UTF-8. The function name is a horrible misnomer, as it suggests some sort of automagic encoding that is necessary. That is not the case. If your source code is saved as UTF-8 and you assign "あ" to $string, then $string holds the character "あ" encoded in UTF-8. No further action is necessary. In fact, running utf8_encode on a string that is already UTF-8 treats its bytes (incorrectly) as ISO-8859-1 characters and converts each of them to UTF-8 again, which garbles it.
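The garbling can be made visible with a small sketch. The helper `latin1_to_utf8` below is hypothetical and only mirrors the documented ISO-8859-1 to UTF-8 conversion; it stands in for utf8_encode, which has been deprecated since PHP 8.2. The string is written with byte escapes so the result does not depend on how the example file itself is saved.

```php
<?php
// Hypothetical helper: treats every input byte as one ISO-8859-1 character
// and re-encodes it as UTF-8, which is what utf8_encode is documented to do.
function latin1_to_utf8(string $bytes): string
{
    $out = '';
    foreach (str_split($bytes) as $byte) {
        $code = ord($byte);
        $out .= $code < 0x80
            ? $byte                                                   // ASCII stays one byte
            : chr(0xC0 | ($code >> 6)) . chr(0x80 | ($code & 0x3F));  // 0x80-0xFF becomes two bytes
    }
    return $out;
}

$string  = "\xE3\x81\x82";           // "あ" already encoded as UTF-8
$garbled = latin1_to_utf8($string);  // wrongly treated as ISO-8859-1

echo bin2hex($string), "\n";   // e38182
echo bin2hex($garbled), "\n";  // c3a3c281c282

// \x{3042} is the code point of "あ"
var_dump(preg_match('/\x{3042}/u', $string));   // int(1)
var_dump(preg_match('/\x{3042}/u', $garbled));  // int(0)
```

Each of the three original bytes is expanded into two, so the converted string no longer contains "あ" and the match from the question fails.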

To elaborate a little more: your source code is read as a byte sequence. PHP interprets the parts that matter to it (keywords, operators and so on) as ASCII. UTF-8 is backwards compatible with ASCII, which means all the "normal" ASCII characters are represented by the same byte in both ASCII and UTF-8. So a " is interpreted as a " by PHP regardless of whether the file is supposed to be saved as ASCII or UTF-8. Anything between quotes PHP simply takes as the literal byte sequence. So PHP sees your "あ" as the bytes 11100011 10000001 10000010. It doesn't care what exactly is between the quotes; it just uses it as-is.
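A minimal sketch of how to inspect those bytes, assuming the file containing it is itself saved as UTF-8 (as in the question); with a different file encoding the output would differ:

```php
<?php
// Assumption: this source file is saved as UTF-8.
$string = "あ";

echo strlen($string), "\n";   // 3, because strlen counts bytes, not characters
echo bin2hex($string), "\n";  // e38182

// Print each byte of the literal in binary.
foreach (str_split($string) as $byte) {
    printf("%08b ", ord($byte));  // 11100011 10000001 10000010
}
echo "\n";
```

PHP never decoded or re-encoded anything here; the three bytes between the quotes in the file are exactly the three bytes stored in $string.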
