PHP UTF-8 questions - If I create a string in PHP... is it in UTF-8?


Question

In PHP, if I create a string like this:

$str = "bla bla here is my string";

Will I then be able to use the mbstring functions to operate on that string as UTF-8?

// Will this work?
$length = mb_strlen($str);

Further, if I then have another string that I know is UTF-8 (say it was a POSTed form value, or a UTF-8 string from a database), can I concatenate the two without any problems?

// What about this, will this work? 
$str = $str . $utf8_string_from_database;

Answer

First question: it depends on what exactly goes in the string.

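In PHP (up to PHP5, anyway), strings are just sequences of bytes. There is no implied or explicit character set associated with them; that's something the programmer must keep track of. So, if you only put valid UTF-8 bytes between the quotes (fairly easy if the file itself is encoded as UTF-8), then the string will be UTF-8, and you can safely use mb_strlen() on it.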
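
To make the "sequence of bytes" point concrete, here is a minimal sketch comparing strlen() with mb_strlen(). The sample word is my own choice, and its accented letter is written as explicit UTF-8 byte escapes so the result does not depend on how the source file is saved.

```php
<?php
// "café": the "é" is spelled out as its two UTF-8 bytes (0xC3 0xA9),
// so this works regardless of the encoding of the source file.
$str = "caf\xC3\xA9";

$bytes = strlen($str);              // strlen() counts bytes
$chars = mb_strlen($str, 'UTF-8');  // mb_strlen() counts characters in the given encoding

echo $bytes, "\n"; // 5
echo $chars, "\n"; // 4
```

For a pure-ASCII literal like the one in the question, both functions return the same number, because every ASCII character is a single byte in UTF-8.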

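Also, if you're using mbstring functions, you need to tell them explicitly what character set your string is in, either globally with the mbstring.internal_encoding setting (or mb_internal_encoding() at runtime), or per call through the optional encoding argument that most mbstring functions accept.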
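
As a rough illustration of the two ways to declare the encoding, the sketch below sets a default with mb_internal_encoding() and also passes the encoding per call. The sample string and variable names are mine, not from the question; the last call shows what happens when the declared encoding does not match the bytes.

```php
<?php
// "naïve" with the "ï" written as its UTF-8 bytes: 6 bytes, 5 characters.
$str = "na\xC3\xAFve";

// Option 1: set the default once; mbstring calls without an
// encoding argument then use it.
mb_internal_encoding('UTF-8');
$viaDefault = mb_strlen($str);

// Option 2: name the encoding on each call.
$viaArgument = mb_strlen($str, 'UTF-8');

// Declaring the wrong encoding: ISO-8859-1 is a single-byte encoding,
// so every byte is counted as one character.
$wrongEncoding = mb_strlen($str, 'ISO-8859-1');

var_dump($viaDefault, $viaArgument, $wrongEncoding); // int(5) int(5) int(6)
```

Passing the encoding on each call is more verbose, but it keeps the code independent of whatever default the server happens to be configured with.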

Second question: yes, with caveats.

Two strings that are both independently valid UTF-8 can be safely byte-wise concatenated (like with PHP's . operator) and still be valid UTF-8. However, you can never be sure, without doing some work yourself, that a POSTed string is valid UTF-8. Database strings are a little easier, if you carefully set the connection character set, because most DBMSs will do any conversion for you.
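
The "some work yourself" for POSTed data could look something like this sketch, which checks the input with mb_check_encoding() before concatenating. The helper name utf8_or_null() is hypothetical, the $posted variable stands in for a value such as $_POST['city'], and the type hints assume PHP 7.1 or later.

```php
<?php
// Hypothetical helper: returns the input unchanged if it is valid UTF-8,
// otherwise null so the caller can reject or re-encode it.
function utf8_or_null(string $input): ?string
{
    return mb_check_encoding($input, 'UTF-8') ? $input : null;
}

$literal = "Hello, ";        // plain ASCII, which is also valid UTF-8
$posted  = "Z\xC3\xBCrich";  // stand-in for a form value: "Zürich" as UTF-8 bytes
$broken  = "Z\xFCrich";      // the same word in ISO-8859-1: not valid UTF-8

$greeting = null;
$checked = utf8_or_null($posted);
if ($checked !== null) {
    // Byte-wise concatenation of two valid UTF-8 strings stays valid UTF-8.
    $greeting = $literal . $checked;
}

var_dump(mb_check_encoding($greeting, 'UTF-8')); // bool(true)
var_dump(utf8_or_null($broken));                 // NULL
```

What to do with a rejected value (refuse the request, or convert it from a known legacy encoding) depends on the application; this sketch only shows the check itself.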
