Check to see if a string is encoded as UTF-8

Views: 185 Published: 2017/8/17 0:44:45 Tags: php string encoding utf-8


Question

function seems_utf8($str) {
 $length = strlen($str);
 for ($i=0; $i < $length; $i++) {
  $c = ord($str[$i]);
  if ($c < 0x80) $n = 0; # 0bbbbbbb
  elseif (($c & 0xE0) == 0xC0) $n=1; # 110bbbbb
  elseif (($c & 0xF0) == 0xE0) $n=2; # 1110bbbb
  elseif (($c & 0xF8) == 0xF0) $n=3; # 11110bbb
  elseif (($c & 0xFC) == 0xF8) $n=4; # 111110bb
  elseif (($c & 0xFE) == 0xFC) $n=5; # 1111110b
  else return false; # Does not match any model
  for ($j=0; $j<$n; $j++) { # n bytes matching 10bbbbbb follow ?
   if ((++$i == $length) || ((ord($str[$i]) & 0xC0) != 0x80))
    return false;
  }
 }
 return true;
}
I got this code from WordPress. I don't know much about it, but I would like to know what exactly is happening in that function.

If anyone knows, please help me out.

I need a clear idea of what the above code does. A line-by-line explanation would be most helpful.

Solution

I use two ways to check whether a string is UTF-8 (depending on the case):
mb_internal_encoding('UTF-8'); // always needed before mb_ functions, check note below
if (mb_strlen($string) != strlen($string)) {
 // not single-byte: the string contains multibyte characters
}
-- OR -- 
if (preg_match('!\S!u', $string)) {
 // utf8
}
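
To make clearer what these two checks report, here is a minimal sketch that runs them on a few sample strings. The helper names `has_multibyte` and `matches_as_utf8` are made up for this illustration and are not part of PHP. The first check only tells whether the string contains multibyte characters. The second relies on `preg_match` returning false when the subject is not valid UTF-8 under the `u` modifier.

```php
<?php
mb_internal_encoding('UTF-8');

// Hypothetical helper wrapping the first check: true when the character
// count differs from the byte count.
function has_multibyte($string) {
    return mb_strlen($string) != strlen($string);
}

// Hypothetical helper wrapping the second check: preg_match returns 1 on a
// match, 0 on no match, and false when the subject is invalid UTF-8.
function matches_as_utf8($string) {
    return (bool) preg_match('!\S!u', $string);
}

$ascii  = 'hello';
$utf8   = "h\xC3\xA9llo"; // "héllo": é encoded as two bytes in UTF-8
$broken = "h\xE9llo";     // é as a single Latin-1 byte, not valid UTF-8

var_dump(has_multibyte($ascii));    // plain ASCII: byte and character counts agree
var_dump(has_multibyte($utf8));     // one character takes two bytes
var_dump(matches_as_utf8($utf8));   // valid UTF-8 with a non-whitespace character
var_dump(matches_as_utf8($broken)); // invalid UTF-8: preg_match fails
```

One caveat of the second check: because the pattern is `\S`, an empty or whitespace-only string does not match either, even though it is valid UTF-8.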
A note on mb_internal_encoding: due to a bug I could not identify in PHP versions below 5.3 (I haven't tested it on 5.3), passing the encoding as a parameter to the mb_ functions doesn't work, so the internal encoding needs to be set before any use of mb_ functions.
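
As for what `seems_utf8` itself does, here is a hedged sketch of its core step in isolation. The function reads one lead byte, uses bit masks to decide how many continuation bytes (of the form 10bbbbbb) must follow, and then checks that exactly that many follow before moving on to the next lead byte. The helper `continuation_count` is a hypothetical name that reproduces only the mask tests from the quoted code; it is an illustration, not part of WordPress.

```php
<?php
// Hypothetical helper: how many continuation bytes the lead byte $c
// announces, or -1 if $c cannot start a sequence.
function continuation_count($c) {
    if ($c < 0x80) return 0;            // 0bbbbbbb: plain ASCII, nothing follows
    if (($c & 0xE0) == 0xC0) return 1;  // 110bbbbb
    if (($c & 0xF0) == 0xE0) return 2;  // 1110bbbb
    if (($c & 0xF8) == 0xF0) return 3;  // 11110bbb
    if (($c & 0xFC) == 0xF8) return 4;  // 111110bb
    if (($c & 0xFE) == 0xFC) return 5;  // 1111110b
    return -1;                          // e.g. a stray 10bbbbbb byte, 0xFE, 0xFF
}

$euro = "\xE2\x82\xAC";          // "€" encoded as three bytes
$lead = ord($euro[0]);           // 0xE2 = 11100010
$n = continuation_count($lead);  // the lead byte announces 2 more bytes

// Each announced byte must look like 10bbbbbb, i.e. (byte & 0xC0) == 0x80.
$ok = true;
for ($j = 1; $j <= $n; $j++) {
    if ((ord($euro[$j]) & 0xC0) != 0x80) {
        $ok = false;
    }
}
```

The check is purely structural, which is likely why the function is named "seems": it accepts the 5- and 6-byte forms that RFC 3629 no longer allows, and it does not reject overlong encodings such as "\xC0\x80".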
