Determine if UTF-8 text is all ASCII?


Question

What's the fastest way, in PHP, to determine if some given UTF-8 text is purely ASCII or not?

Solution

A possibly faster function would be to use a negative character class (since the regex can just stop when it hits the first character, and there's no need to internally capture anything):

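function isAscii($str) {
    // preg_match() returns 1 at the first byte outside 0x00-0x7F, 0 if there is none,
    // and false on error; the strict comparison keeps an error from counting as ASCII.
    return preg_match('/[^\x00-\x7F]/', $str) === 0;
}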
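As a quick illustration of how the negated character class behaves, here is a minimal sketch. The name isAsciiRegex is hypothetical (chosen only so it does not clash with isAscii above), and the "\u{...}" escapes assume PHP 7 or later. Without the /u modifier the pattern is matched byte by byte, which is what we want here.

```php
<?php
// Hypothetical name; same idea as the regex version of isAscii() above.
function isAsciiRegex(string $str): bool {
    // Look for any byte outside the 7-bit ASCII range.
    return preg_match('/[^\x00-\x7F]/', $str) === 0;
}

var_dump(isAsciiRegex('plain text'));   // bool(true)
var_dump(isAsciiRegex("caf\u{00E9}"));  // bool(false): U+00E9 is encoded as two bytes above 0x7F
var_dump(isAsciiRegex(''));             // bool(true): an empty string has no non-ASCII bytes
```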

Without regex (based on my comment):

function isAscii($str) {
    $len = strlen($str); // strlen() counts bytes, not characters
    for ($i = 0; $i < $len; $i++) {
        // Any byte above 127 (0x7F) is outside the ASCII range.
        if (ord($str[$i]) > 127) return false;
    }
    return true;
}
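The byte-wise loop is safe on UTF-8 input because UTF-8 encodes every non-ASCII code point using only bytes in the range 0x80-0xFF, so no byte of a multi-byte sequence can be mistaken for ASCII. A small sketch to make that visible; byteValues is a hypothetical helper, not part of the answer, and the "\u{...}" escapes assume PHP 7 or later.

```php
<?php
// Hypothetical helper: return the bytes of a string as two-digit hex values.
function byteValues(string $str): array {
    $out = [];
    for ($i = 0, $len = strlen($str); $i < $len; $i++) {
        $out[] = sprintf('%02X', ord($str[$i]));
    }
    return $out;
}

print_r(byteValues('A'));         // 41
print_r(byteValues("\u{00E9}"));  // C3, A9      (two bytes, both above 0x7F)
print_r(byteValues("\u{20AC}"));  // E2, 82, AC  (three bytes, all above 0x7F)
```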

But I'd have to ask, why are you so concerned about speed? Use the more readable and easier to understand version, and only worry about optimizing it when you know it's a problem...

Edit:

Then the fastest will likely be mb_check_encoding:

function isAscii($str) {
    return mb_check_encoding($str, 'ASCII');
}
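If speed really does matter, it is worth measuring on your own data rather than guessing. Below is a rough timing sketch, not a rigorous benchmark. The function names, the 10,000-byte all-ASCII sample (the worst case, since no variant can exit early), and the run count are all assumptions; mb_check_encoding requires the mbstring extension.

```php
<?php
// Hypothetical names for the three variants discussed in this answer.
function asciiByRegex(string $s): bool {
    return preg_match('/[^\x00-\x7F]/', $s) === 0;
}

function asciiByLoop(string $s): bool {
    for ($i = 0, $len = strlen($s); $i < $len; $i++) {
        if (ord($s[$i]) > 127) return false;
    }
    return true;
}

function asciiByMb(string $s): bool {
    return mb_check_encoding($s, 'ASCII'); // needs ext-mbstring
}

// Assumed workload: every byte must be inspected before returning true.
$sample = str_repeat('a', 10000);
$runs   = 200;

foreach (['asciiByRegex', 'asciiByLoop', 'asciiByMb'] as $fn) {
    $start = microtime(true);
    for ($i = 0; $i < $runs; $i++) {
        $fn($sample);
    }
    // Print the elapsed wall-clock time for this variant.
    printf("%-14s %.4f s\n", $fn, microtime(true) - $start);
}
```

Timings will vary with PHP version, string length, and where the first non-ASCII byte appears, so treat the numbers as a hint for your own workload only.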
