Check whether a string is multibyte in PHP


Question

I want to check whether a string is multibyte in PHP. Any ideas on how to accomplish this?

Example:

<?php
$string = "I dont have idea that is what i am...";
if( is_multibyte( $string ) )
{
    echo 'yes!!';
}else{
    echo 'ups!';
}
?>

Maybe (rule 8 bytes):

<?php
if( strlen( $string ) > mb_strlen( $string, 'UTF-8' ) )
{
    return true;
}
else
{
    return false;
}
?>

I have read: Variable-width encoding (Wikipedia) and UTF-8 (Wikipedia).

Recommended answer

There are two interpretations. The first is that every character is multibyte. The second is that the string contains at least one multibyte character. If you are interested in handling invalid byte sequences, see https://stackoverflow.com/a/13695364/531320 for details.

function is_all_multibyte($string)
{
    // reject strings that contain an invalid UTF-8 byte sequence
    if (mb_check_encoding($string, 'UTF-8') === false) return false;

    $length = mb_strlen($string, 'UTF-8');

    for ($i = 0; $i < $length; $i += 1) {

        $char = mb_substr($string, $i, 1, 'UTF-8');

        // reject the string as soon as a single-byte (ASCII) character is found
        if (mb_check_encoding($char, 'ASCII')) {

            return false;

        }

    }

    return true;

}

function contains_any_multibyte($string)
{
    return !mb_check_encoding($string, 'ASCII') && mb_check_encoding($string, 'UTF-8');
}

$data = ['東京', 'Tokyo', '東京(Tokyo)'];

var_dump(
    [true, false, false] ===
    array_map(function($v) {
        return is_all_multibyte($v);
    },
    $data),
    [true, false, true] ===
    array_map(function($v) {
        return contains_any_multibyte($v);
    },
    $data)
);
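
The "at least one multibyte character" interpretation can also be sketched by comparing byte length with character length, which is the idea the question was reaching for, or by looking for any byte outside the 7-bit ASCII range. The helper names `has_multibyte_by_length` and `has_non_ascii_byte` are hypothetical and not part of the original answer. Both assume the input is meant to be UTF-8 and that the mbstring extension is loaded.

```php
<?php
// Hypothetical helper (an assumption, not from the original answer):
// true when the string is valid UTF-8 and occupies more bytes than characters.
function has_multibyte_by_length(string $string): bool
{
    // An invalid byte sequence would make the character count unreliable.
    if (!mb_check_encoding($string, 'UTF-8')) {
        return false;
    }

    // strlen() counts bytes, mb_strlen() counts characters.
    return strlen($string) > mb_strlen($string, 'UTF-8');
}

// Hypothetical helper: true when the string is valid UTF-8 and
// contains at least one byte outside the 7-bit ASCII range.
function has_non_ascii_byte(string $string): bool
{
    // Without the /u modifier, PCRE matches byte by byte.
    return mb_check_encoding($string, 'UTF-8')
        && preg_match('/[^\x00-\x7F]/', $string) === 1;
}

foreach (['東京', 'Tokyo', '東京(Tokyo)'] as $value) {
    var_dump(has_multibyte_by_length($value), has_non_ascii_byte($value));
}
```

For valid UTF-8 input, both helpers should agree with contains_any_multibyte() above, since every non-ASCII character in UTF-8 takes at least two bytes.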
