json_encode() non utf-8 strings?
Question
I have an array of strings that were pulled from a SQL database, and all of them use the system default ANSI encoding. It is a single-byte encoding, so there are 256 possible byte values per character.
Is there a way I can get json_encode() to work and display these characters, instead of having to use utf8_encode() on all of my strings and ending up with stuff like \u0082?
Or is that the standard for JSON?
Recommended answer
Is there a way I can get json_encode() to work and display these characters instead of having to use utf8_encode() on all of my strings and ending up with stuff like "\u0082"?
If you have an ANSI encoded string, utf8_encode() is the wrong function to deal with it. You need to properly convert the string from ANSI to UTF-8 first. That will certainly reduce the number of Unicode escape sequences like \u0082 in the JSON output, but technically these sequences are valid JSON, so you need not fear them.
json_encode works with UTF-8 encoded strings only. If you need to create valid JSON from an ANSI encoded string, you need to re-encode/convert it to UTF-8 first. Then json_encode will just work as documented.
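To make the UTF-8 requirement concrete, here is a minimal sketch, assuming a current PHP version with the mbstring extension. The sample string and variable names are illustrative only and not part of the original answer. It shows json_encode() rejecting a Windows-1252 byte sequence and accepting the same text once it has been converted.

```php
<?php
// "caf\xE9" is "cafe" with an accented e as the single byte 0xE9 (Windows-1252).
// A lone 0xE9 byte is not valid UTF-8.
$cp1252 = "caf\xE9";

// On invalid UTF-8 input, json_encode() fails and returns false.
$failed = json_encode($cp1252);
$error  = json_last_error(); // read the error before the next json_* call resets it

// After converting to UTF-8, the same text encodes without problems.
$utf8 = mb_convert_encoding($cp1252, 'UTF-8', 'Windows-1252');
$ok   = json_encode($utf8);

var_dump($failed);                    // bool(false)
var_dump($error === JSON_ERROR_UTF8); // bool(true)
echo $ok, "\n";                       // "caf\u00e9"
```

Checking json_last_error() (or json_last_error_msg()) after a failed call is a quick way to confirm that the encoding, and not the data structure, is the problem.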
To convert from ANSI (more correctly, I assume you have a Windows-1252 encoded string, which is commonly but wrongly referred to as ANSI) to UTF-8, you can use the mb_convert_encoding() function:
$str = mb_convert_encoding($str, "UTF-8", "Windows-1252");
Another PHP function that can convert the encoding / charset of a string is iconv, which is based on libiconv. You can use it as well:
$str = iconv("CP1252", "UTF-8", $str);
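Note on utf8_encode()
utf8_encode() only works for Latin-1 (ISO-8859-1), not for ANSI. Windows-1252 differs from Latin-1 in the byte range 0x80 to 0x9F, where it holds printable characters such as the euro sign and curly quotes, while Latin-1 holds control characters there. So when you run a Windows-1252 string through that function, those characters are destroyed: they end up as control characters, which is where output like \u0082 comes from.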
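The difference can be seen on a single byte. The following is a small sketch, assuming the mbstring extension is available; it uses mb_convert_encoding() with ISO-8859-1 as the source encoding as a stand-in for utf8_encode(), since both interpret the input as Latin-1. The variable names are illustrative.

```php
<?php
// Byte 0x82 is the "single low-9 quotation mark" (U+201A) in Windows-1252,
// but the control character U+0082 in ISO-8859-1 (Latin-1).
$raw = "\x82";

// Interpreting the byte as Latin-1, which is what utf8_encode() does.
$asLatin1 = mb_convert_encoding($raw, 'UTF-8', 'ISO-8859-1');

// Interpreting the byte as Windows-1252, which is what the data really is.
$asCp1252 = mb_convert_encoding($raw, 'UTF-8', 'Windows-1252');

echo json_encode($asLatin1), "\n"; // "\u0082"  (a control character, the wrong result)
echo json_encode($asCp1252), "\n"; // "\u201a"  (the intended quotation mark)
```

Both results are valid JSON, but only the second one decodes back to the character that was stored in the database.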
Related: What is ANSI format?
For more fine-grained control over what json_encode() returns, see the list of predefined constants (PHP version dependent; as of PHP 5.4, some constants are still undocumented and available only in the source code).
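One of those constants addresses the "display these characters" part of the question directly. As a hedged sketch, assuming PHP 5.4 or later (where JSON_UNESCAPED_UNICODE was introduced) and the mbstring extension, the flag makes json_encode() emit the UTF-8 characters themselves instead of \uXXXX escapes. The sample string is illustrative.

```php
<?php
// Convert the Windows-1252 input to UTF-8 first; the flag does not replace this step.
$utf8 = mb_convert_encoding("caf\xE9", 'UTF-8', 'Windows-1252');

// Default behaviour: non-ASCII characters are written as \uXXXX escapes.
$escaped = json_encode($utf8);

// With JSON_UNESCAPED_UNICODE the character is written as raw UTF-8 bytes.
$unescaped = json_encode($utf8, JSON_UNESCAPED_UNICODE);

echo $escaped, "\n";   // "caf\u00e9"
echo $unescaped, "\n"; // the same string with the accented letter written literally

// Both forms decode to the same value.
var_dump(json_decode($escaped) === json_decode($unescaped)); // bool(true)
```

Either form is valid JSON. The unescaped form is shorter and easier to read, provided the response is actually served as UTF-8.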
As you wrote in a comment that you have problems applying the function to an array, here is a code example. The encoding always has to be changed before calling json_encode. That's just a standard array operation; for the simpler case of PDO::fetch(), it is a foreach iteration:
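while ($row = $q->fetch(PDO::FETCH_ASSOC))
{
    foreach ($row as &$value)
    {
        # convert each string column from Windows-1252 to UTF-8 (NULLs are left alone)
        if (is_string($value))
        {
            $value = mb_convert_encoding($value, "UTF-8", "Windows-1252");
        }
    }
    unset($value); # safety: remove reference
    $items[] = $row; # already UTF-8; applying utf8_encode() here would encode it twice
}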
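When the data is nested deeper than a flat row, the same idea can be applied recursively. This is only a sketch under stated assumptions: the helper name convert_strings_to_utf8() is hypothetical and not part of the original answer, the source encoding is assumed to be Windows-1252, and the sample rows stand in for a real result set.

```php
<?php
// Hypothetical helper: returns a copy of $data in which every string value,
// at any nesting depth, has been converted from $from to UTF-8.
// Array keys are left untouched.
function convert_strings_to_utf8(array $data, $from = 'Windows-1252')
{
    array_walk_recursive($data, function (&$value) use ($from) {
        if (is_string($value)) {
            $value = mb_convert_encoding($value, 'UTF-8', $from);
        }
    });
    return $data;
}

// Sample rows standing in for a database result set.
$rows = array(
    array('id' => 1, 'name' => "Caf\xE9"),
    array('id' => 2, 'name' => "\x80 5"), // 0x80 is the euro sign in Windows-1252
);

$json = json_encode(convert_strings_to_utf8($rows));
echo $json, "\n";
// [{"id":1,"name":"Caf\u00e9"},{"id":2,"name":"\u20ac 5"}]
```

Integers and NULL values pass through unchanged because only strings are converted.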