Comparing two Unicode strings in PHP
Question
I am stuck comparing two Unicode strings in PHP, both of which contain the special character 'ö'. One string comes from $_GET, the other is a filesystem folder name (from scandir()). Both strings look equal to me, and running
var_dump($filter);
var_dump($tail . '/' . $k);
on them also shows them as identical, but with different string lengths (?!):
string '/blöb' (length=7)
string '/blöb' (length=6)
My comparison snippet looks like this:
if($filter == ($tail . '/' . $k)) {
/* ... */
}
What is going on here?
$tail is an empty string:
string '' (length=0)
Recommended answer
Read about Unicode equivalence (http://en.wikipedia.org/wiki/Unicode_equivalence) and use PHP's Normalizer class (http://www.php.net/manual/en/class.normalizer.php).
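You probably have a decomposed character in the longer string: a plain 'o' followed by a combining umlaut (diaeresis) character that overlays the previous character. In UTF-8 the precomposed 'ö' (U+00F6) takes 2 bytes, while 'o' plus the combining diaeresis (U+0308) takes 1 + 2 = 3 bytes, which accounts for the reported lengths of 6 and 7.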
Normalizing both strings to the same form before comparing them (for example with Normalizer::normalize()) fixes cases like this.
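A minimal sketch of that fix, assuming PHP 7 or later with the intl extension loaded (it provides the Normalizer class). The helper name unicode_equals() and the sample strings are illustrative and not taken from the question; the sketch does not assume which of the two original strings was the decomposed one.

```php
<?php
// Assumes the intl extension is loaded; it provides the Normalizer class.

// Precomposed form: 'ö' is the single code point U+00F6 (2 bytes in UTF-8).
$composed = "/bl\u{00F6}b";
// Decomposed form: 'o' followed by U+0308 COMBINING DIAERESIS (1 + 2 bytes).
$decomposed = "/blo\u{0308}b";

var_dump(strlen($composed));         // int(6)
var_dump(strlen($decomposed));       // int(7)
var_dump($composed == $decomposed);  // bool(false): the bytes differ

// Hypothetical helper: compare two strings after normalizing both to NFC.
function unicode_equals(string $a, string $b): bool
{
    $na = Normalizer::normalize($a, Normalizer::FORM_C);
    $nb = Normalizer::normalize($b, Normalizer::FORM_C);
    if ($na === false || $nb === false) {
        // Normalization fails on invalid UTF-8; fall back to a byte comparison.
        return $a === $b;
    }
    return $na === $nb;
}

var_dump(unicode_equals($composed, $decomposed)); // bool(true)
```

In the question's snippet, the condition would then read unicode_equals($filter, $tail . '/' . $k) instead of using ==.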
As a side note, you should always normalize your input if you compare it for equivalence (for example a username: you want to make sure two people cannot choose the same username, even if the binary representations of the strings happen to differ).
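The username case could be sketched as follows, again assuming the intl extension. The function names and the in-memory $taken array (a stand-in for a real user table) are hypothetical and only illustrate the idea of storing and looking up the normalized form.

```php
<?php
// Assumes the intl extension (Normalizer). All names here are hypothetical.

// Return the NFC form of a username, rejecting input that is not valid UTF-8.
function canonical_username(string $input): string
{
    $normalized = Normalizer::normalize($input, Normalizer::FORM_C);
    if ($normalized === false) {
        throw new InvalidArgumentException('Username is not valid UTF-8.');
    }
    return $normalized;
}

// In-memory stand-in for a user table, keyed by the canonical username.
$taken = [];

// Register a name unless a canonically equivalent one already exists.
function register_username(array &$taken, string $input): bool
{
    $key = canonical_username($input);
    if (isset($taken[$key])) {
        return false; // An equivalent name is already registered.
    }
    $taken[$key] = true;
    return true;
}

$first = register_username($taken, "J\u{00F6}rg");    // precomposed 'ö'
$second = register_username($taken, "Jo\u{0308}rg");  // 'o' + combining diaeresis

var_dump($first);   // bool(true)
var_dump($second);  // bool(false)
```

Normalizing once at the point of input, and storing only the normalized form, keeps every later comparison a plain byte comparison.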