What is the character set if default_charset is empty?
Question
From PHP 5.6 onwards, the default_charset setting defaults to "UTF-8", as explained e.g. in the php.ini documentation. It says that the string is empty for earlier versions.
As I am creating a Java library to communicate with PHP, I need to know which byte values to expect when a string is handled as bytes internally. What happens if the default_charset string is empty and a (literal) string contains characters outside the ASCII range? Should I expect the default character encoding of the platform, or the character encoding used for the source file?
Recommended answer
Short answer
For literal strings: always the source file encoding. The default_charset value does nothing here.
PHP strings are "binary safe", meaning they do not have any internal string encoding. Basically, strings in PHP are just buffers of bytes.
For literal strings, e.g. $s = "Ä", this means that the string will contain whatever bytes were saved in the file between the quotes. If the file was saved in UTF-8, this is equivalent to $s = "\xc3\x84"; if the file was saved in ISO-8859-1 (latin1), this is equivalent to $s = "\xc4".
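Since the question comes from the Java side, the two byte sequences above can be reproduced there. The following is a minimal sketch, not part of the original answer; the class name LiteralBytes and the helper hex are made up for illustration. The character is written as the escape \u00C4 so that the encoding of the Java source file itself does not influence the result.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
public class LiteralBytes {
    // Encode the text under the given charset and render the bytes as lowercase hex.
    public static String hex(String text, Charset charset) {
        StringBuilder sb = new StringBuilder();
        for (byte b : text.getBytes(charset)) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String aUmlaut = "\u00C4"; // the character Ä
        // Bytes a PHP literal would hold if its source file was saved as UTF-8
        System.out.println(hex(aUmlaut, StandardCharsets.UTF_8));      // c384
        // Bytes a PHP literal would hold if its source file was saved as ISO-8859-1
        System.out.println(hex(aUmlaut, StandardCharsets.ISO_8859_1)); // c4
    }
}
```

The same character yields different bytes depending only on the encoding chosen when the text is turned into bytes, which is what happens when the PHP source file is saved.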
Setting the default_charset value does not affect the bytes stored in strings in any way.
Some functions that have to treat strings as text, and are therefore encoding-aware, accept an $encoding argument (usually optional). This tells the function which encoding the text in the string uses.
Before PHP 5.6, the default values of these optional $encoding arguments were either in the function definition (e.g. htmlspecialchars()) or configurable in various php.ini settings, separately for each extension (e.g. mbstring.internal_encoding, iconv.input_encoding).
PHP 5.6 introduced the new php.ini setting default_charset. The old settings were deprecated, and all functions that accept an optional $encoding argument should now default to the default_charset value when the encoding is not specified explicitly.
However, the developer remains responsible for making sure that the text in a string is actually encoded in the encoding that was specified.
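On the receiving Java side, that responsibility translates into decoding PHP's bytes with an explicitly agreed charset rather than a platform default. The sketch below is an illustration under that assumption; the class PhpBytesDecoder and its methods are hypothetical names, not part of any PHP or Java API. It also shows a strict check that rejects bytes which are not valid in the assumed encoding.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
public class PhpBytesDecoder {
    // Decode raw bytes using the charset both sides agreed on.
    public static String decode(byte[] raw, Charset agreed) {
        return new String(raw, agreed);
    }

    // Return true only if the bytes are well-formed in the given charset.
    public static boolean isValid(byte[] raw, Charset charset) {
        try {
            charset.newDecoder()
                   .onMalformedInput(CodingErrorAction.REPORT)
                   .onUnmappableCharacter(CodingErrorAction.REPORT)
                   .decode(ByteBuffer.wrap(raw));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        byte[] utf8Bytes = {(byte) 0xC3, (byte) 0x84}; // Ä from a UTF-8 PHP source file
        byte[] latin1Bytes = {(byte) 0xC4};            // Ä from an ISO-8859-1 PHP source file

        System.out.println(decode(utf8Bytes, StandardCharsets.UTF_8).equals("\u00C4")); // true
        // Wrong assumption: each byte is decoded as its own Latin-1 character
        System.out.println(decode(utf8Bytes, StandardCharsets.ISO_8859_1).length());    // 2
        // A lone 0xC4 byte is not well-formed UTF-8
        System.out.println(isValid(latin1Bytes, StandardCharsets.UTF_8));               // false
    }
}
```

Note that a validity check can only catch bytes that are impossible in the assumed encoding. Every byte sequence is valid ISO-8859-1, so a wrong Latin-1 assumption goes undetected, which is why the encoding should be fixed by convention between the PHP and Java sides.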
Links:
- Details of the String Type: more details on the nature of PHP strings (does not mention default_charset at the time of writing).
- New features in PHP 5.6: Default character encoding: a short introduction to the new default_charset option in the PHP 5.6 release notes.
- Deprecated features in PHP 5.6: iconv and mbstring encoding settings: a list of the php.ini options deprecated in favour of the default_charset option.