UTF-8 到 Unicode 代码点 [英] UTF-8 to Unicode Code Points

查看:27
本文介绍了UTF-8 到 Unicode 代码点的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

是否有将 UTF-8 转换为 Unicode 而将非特殊字符保留为普通字母和数字的函数?

Is there a function that will change UTF-8 to Unicode leaving non special characters as normal letters and numbers?

即德语单词tchüß"将被呈现为类似于tch20AC21AC"(请注意我正在制作 Unicode 代码).

ie the German word "tchüß" would be rendered as something like "tch20AC21AC" (please note that I am making the Unicode codes up).

我正在试验以下函数,但尽管这个函数适用于 ASCII 32-127,但对于双字节字符似乎失败了:

I am experimenting with the following function, but although this one works well with ASCII 32-127, it seems to fail for double byte chars:

function strToHex ($string)
{
    $hex = '';
    for ($i = 0; $i < mb_strlen ($string, "utf-8"); $i++)
    {
        $id = ord (mb_substr ($string, $i, 1, "utf-8"));
        $hex .= ($id <= 128) ? mb_substr ($string, $i, 1, "utf-8") : "&#" . $id . ";";
}

    return ($hex);
}

有什么想法吗?

编辑 2:找到解决方案:PHP ord() 函数不适用于双字节字符.改用:http://nl.php.net/manual/en/function.ord.php#78032

EDIT 2: Found solution: The PHP ord() function does not work for double byte chars. Use instead: http://nl.php.net/manual/en/function.ord.php#78032

推荐答案

可以使用 iconv 将一种字符集转换为另一种字符集:

Converting one character set to another can be done with iconv:

http://php.net/manual/en/function.iconv.php

请注意,UTF 已经是 Unicode 编码了.

Note that UTF is already an Unicode encoding.

另一种方法是简单地使用具有正确字符集的 htmlentities:

Another way is simply using htmlentities with the right character set:

http://php.net/manual/en/function.htmlentities.php

这篇关于UTF-8 到 Unicode 代码点的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆