Safe to use strpos with UTF-8 strings?

Views: 174 · Published: 2020/7/13 3:41:09 · Tags: php string utf-8

Question

I have a bunch of strings with different charsets. The $charset variable contains the charset of the current string.

$content = iconv($charset, 'UTF-8', $content);

With this done, is it safe to use strpos, strlen, substr, etc. instead of their multibyte equivalents? I'm asking because I also use preg_match a lot. If I use PREG_OFFSET_CAPTURE to get the position of a word in the string, I can't use that value with mb_substr to remove everything before the word.

Recommended answer

That entirely depends on what you want to do. The core strlen and similar functions work on bytes: every number they accept or return is a byte count or byte offset. The mb_* functions are encoding-aware and work on characters: every number they accept or return is a character count or character offset.
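To make the byte/character distinction concrete, here is a minimal sketch. It assumes the mbstring extension is loaded and that the source file is saved as UTF-8; the variable names are only illustrative.

```php
<?php
// Assumption: mbstring is available and this file is UTF-8 encoded.
$str = '漢字'; // two characters, three bytes each in UTF-8

$byteLength = strlen($str);                       // counts bytes
$charLength = mb_strlen($str, 'UTF-8');           // counts characters

$byteOffset = strpos($str, '字');                 // byte offset of the match
$charOffset = mb_strpos($str, '字', 0, 'UTF-8');  // character offset of the same match

var_dump($byteLength, $charLength, $byteOffset, $charOffset);
// int(6) int(2) int(3) int(1)
```

The two families return different numbers for the same string, so an offset from one family should not be passed to a function from the other.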

If you have a safe way of getting a byte offset in a string ("safe" meaning the offset is not in the middle of a multi-byte character) and then, for example, crop everything before that offset using substr, that'll work just fine. For instance:

$str     = '漢字';
$offset  = strpos($str, '字');
$cropped = substr($str, $offset);

works fine.

However, this won't work:

$cropped = substr($str, $offset, 1);

You can't safely cut out a single byte without running the risk of cutting into a multi-byte character.
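Applied to the preg_match case from the question, the same reasoning might look like the sketch below. It assumes mbstring is loaded and the file is saved as UTF-8; the sample string and variable names are made up for illustration. PREG_OFFSET_CAPTURE reports byte offsets, so the value pairs with substr rather than mb_substr.

```php
<?php
// Assumption: mbstring is available and this file is UTF-8 encoded.
$content  = '漢字 word 漢字'; // hypothetical sample input
$fromWord = null;

// With PREG_OFFSET_CAPTURE each match is [matched text, byte offset].
if (preg_match('/word/u', $content, $m, PREG_OFFSET_CAPTURE)) {
    $byteOffset = $m[0][1];
    // The offset sits on a character boundary, so the byte-based substr() is safe here.
    $fromWord = substr($content, $byteOffset);
}

// Cutting one byte out of a three-byte character leaves invalid UTF-8.
$str            = '漢字';
$oneByte        = substr($str, strpos($str, '字'), 1);
$oneByteIsValid = mb_check_encoding($oneByte, 'UTF-8');

// mb_substr() counts in characters, so a length of 1 yields one whole character.
$oneChar = mb_substr($str, mb_strpos($str, '字', 0, 'UTF-8'), 1, 'UTF-8');

var_dump($fromWord, $oneByteIsValid, $oneChar);
// string(13) "word 漢字", bool(false), string(3) "字"
```

Byte offsets are fine for cropping from a known match position; any length measured in characters is better left to the mb_* functions.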
