Determine if UTF-8 text is all ASCII?
Question
What's the fastest way, in PHP, to determine whether some given UTF-8 text is purely ASCII?
Recommended answer
A possibly faster function would use a negated character class: the regex can stop as soon as it hits the first non-ASCII byte, and there's no need to capture anything internally:
function isAscii($str) {
    return 0 == preg_match('/[^\x00-\x7F]/', $str);
}
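To make the behavior concrete, here is a minimal usage sketch of the same check. The name isAsciiRegex and the sample strings are assumptions for illustration, not part of the original answer. Without the /u modifier the pattern is matched byte by byte, so any byte of a multibyte UTF-8 sequence (all of which are above 0x7F) triggers a match.

```php
<?php
// Hypothetical name, chosen to avoid clashing with isAscii() above.
function isAsciiRegex(string $str): bool {
    // preg_match returns 1 on a match, 0 on no match, false on error.
    // A strict comparison treats an error as "not ASCII".
    return preg_match('/[^\x00-\x7F]/', $str) === 0;
}

var_dump(isAsciiRegex('hello'));          // ASCII only
var_dump(isAsciiRegex("h\xC3\xA9llo"));   // "héllo" encoded as UTF-8
var_dump(isAsciiRegex(''));               // empty string has no non-ASCII bytes
```

Note that the strict comparison differs slightly from the loose `0 ==` used above: if preg_match fails and returns false, this sketch reports false rather than true.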
Without regex (based on my comment):
function isAscii($str) {
    $len = strlen($str);
    for ($i = 0; $i < $len; $i++) {
        // Any byte above 127 is outside the ASCII range.
        if (ord($str[$i]) > 127) return false;
    }
    return true;
}
But I have to ask: why are you so concerned about speed? Use the more readable, easier-to-understand version, and only worry about optimizing it once you know it's a problem...
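One way to find out whether it is a problem is to measure. The following is a rough timing sketch, not a rigorous benchmark: the helper timeIt, the closure names, the sample input, and the iteration count are all assumptions for illustration, and results will vary with PHP version and input.

```php
<?php
// Hypothetical helper: calls $fn on $input $iterations times
// and returns the elapsed wall-clock time in seconds.
function timeIt(callable $fn, string $input, int $iterations = 1000): float {
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        $fn($input);
    }
    return microtime(true) - $start;
}

// The regex and loop variants, wrapped as closures so they can be compared.
$candidates = [
    'regex' => function (string $s): bool {
        return preg_match('/[^\x00-\x7F]/', $s) === 0;
    },
    'loop' => function (string $s): bool {
        $len = strlen($s);
        for ($i = 0; $i < $len; $i++) {
            if (ord($s[$i]) > 127) return false;
        }
        return true;
    },
];

// Assumed workload: a few kilobytes of plain ASCII (the worst case,
// since every byte must be inspected before returning true).
$sample = str_repeat('plain ascii text ', 100);

foreach ($candidates as $name => $fn) {
    printf("%-6s %.4f s\n", $name, timeIt($fn, $sample));
}
```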
Edit:
Then the fastest will likely be mb_check_encoding:
function isAscii($str) {
    return mb_check_encoding($str, 'ASCII');
}
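mb_check_encoding is provided by the mbstring extension, which is not enabled in every PHP build. A minimal sketch of a guarded version follows; the name isAsciiMb and the fallback to the regex check are assumptions for illustration, not part of the original answer.

```php
<?php
// Hypothetical name; prefers mbstring and falls back to the regex check
// when the extension is not loaded.
function isAsciiMb(string $str): bool {
    if (function_exists('mb_check_encoding')) {
        return mb_check_encoding($str, 'ASCII');
    }
    // Fallback: look for any byte outside 0x00-0x7F.
    return preg_match('/[^\x00-\x7F]/', $str) === 0;
}

var_dump(isAsciiMb('abc'));            // ASCII only
var_dump(isAsciiMb("\xE4\xB8\xAD"));   // "中" encoded as UTF-8
```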