PHP - detecting the user supplied character's char set

Question

Is it possible to detect the character set of a user-supplied string?

If not, how about the next question:

Are there reliable built-in PHP functions that can accurately tell whether a user-supplied string (be it supplied through GET, POST, a cookie, etc.) is in UTF-8 or not? In other words, can I do something like

is_utf8($_GET['first_name'])

Is there any way this function could return TRUE when in reality first_name was not in UTF-8?

Recommended answer

Regarding 1:

You can give mb_detect_encoding a try, but it's pretty much a shot in the dark. An "encoded" string is just a sequence of bytes, and the same byte sequence is often equally valid in any number of different encodings. It is therefore by definition not possible to detect an unknown encoding reliably; you can only guess. This is why meta information such as HTTP headers exists, which should communicate the encoding of the transferred content. Check those if available.
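To make the ambiguity concrete, here is a minimal sketch assuming the mbstring extension is loaded. The two-byte sample and the candidate list (UTF-8 and ISO-8859-1) are illustrative choices, not something from the question. The same bytes pass validation in both encodings yet stand for different text, so whatever mb_detect_encoding returns is only one plausible reading.

```php
<?php
// The bytes 0xC3 0xA9 are "é" in UTF-8, but "Ã©" in ISO-8859-1.
$bytes = "\xC3\xA9";

// Both checks succeed: the byte sequence is well-formed in either encoding.
$validUtf8   = mb_check_encoding($bytes, 'UTF-8');
$validLatin1 = mb_check_encoding($bytes, 'ISO-8859-1');

// Reading the bytes as ISO-8859-1 and converting to UTF-8 yields different
// text ("Ã©"), which shows the two interpretations really are distinct.
$asLatin1 = mb_convert_encoding($bytes, 'UTF-8', 'ISO-8859-1');

// mb_detect_encoding picks one candidate from the list; treat it as a guess.
$guess = mb_detect_encoding($bytes, ['UTF-8', 'ISO-8859-1'], true);

var_dump($validUtf8, $validLatin1, $asLatin1 === $bytes, $guess);
```

Nothing in the bytes themselves says which reading the sender intended, which is why declared metadata beats detection whenever it is available.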

Regarding 2:

mb_check_encoding($var, 'UTF-8') will tell you whether the string is a valid UTF-8 string. As far as I've seen, in recent versions of PHP it does what it says on the tin. That still doesn't mean the string was necessarily meant as UTF-8; it only means the byte sequence is one that is valid in UTF-8. So yes, the check can return TRUE for a string that was actually sent in some other encoding.
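As a rough sketch of the is_utf8() call from the question, assuming the mbstring extension is loaded: the helper below is hypothetical (PHP has no built-in is_utf8) and simply wraps mb_check_encoding. The sample strings are made up to show one rejection and one false positive.

```php
<?php
// Hypothetical helper named after the is_utf8() call in the question.
// It only checks that the bytes form well-formed UTF-8.
function is_utf8(string $value): bool
{
    return mb_check_encoding($value, 'UTF-8');
}

$latin1Cafe = "caf\xE9";     // "café" in ISO-8859-1; a lone 0xE9 is invalid UTF-8
$utf8Cafe   = "caf\xC3\xA9"; // "café" in UTF-8
$latin1Odd  = "\xC3\xA9";    // "Ã©" if the sender meant ISO-8859-1, yet valid UTF-8

var_dump(is_utf8($latin1Cafe)); // bool(false)
var_dump(is_utf8($utf8Cafe));   // bool(true)
var_dump(is_utf8($latin1Odd));  // bool(true): a false positive
```

In practice a FALSE result is trustworthy (the input is definitely not UTF-8), while a TRUE result only means the input could be UTF-8.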
