How to convert text to Unicode code points like \u0054\u0068\u0069\u0073 using PHP?

Question

EDIT 2: I'd like to convert English words to Unicode numbers using PHP 5 and output each character as \u**** where **** is the four-digit hexadecimal Unicode number.

In my original question, I had mistakenly thought that \u was a standard for encoding Unicode, when in fact it is just JavaScript's escape syntax (thank you, Jukka K. Korpela, for pointing this out). Even though I wanted to do the conversion in PHP, the converted Unicode was to be used in JavaScript.

I tried the below options, but had no luck. deceze's answer did the trick though, thank you very much!

What I have tried

I've read that I can use iconv to do this, but I've had no luck and can't find any examples on how.

I've also tried Scott Reynen's code from "How to get code point number for a given character in a utf-8 string?", but I can't seem to get it to work. When I tried it, I included the script in a file along with:

$str='test';
echo utf8_to_unicode($str);

It just echoes test.

I've also read that I can use

echo json_encode("test");

but again I only get test printed to the screen.

Any help would be greatly appreciated.

Actually I think they are called code units not code points.

Recommended answer

json_encode pretty much does it for you, but only for non-ASCII characters. So all you need to do is to convert ASCII characters by hand. Here's a function that does that on a character-by-character basis:

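function utf8ToUnicodeCodePoints($str) {
    // Refuse input that is not valid UTF-8; the /u regex below depends on it.
    if (!mb_check_encoding($str, 'UTF-8')) {
        trigger_error('$str is not encoded in UTF-8, I cannot work like this');
        return false;
    }
    // /u matches whole UTF-8 characters; /s lets "." match newlines as well.
    return preg_replace_callback('/./us', function ($m) {
        // ord() reads only the first byte, which is <= 127 exactly for ASCII.
        $ord = ord($m[0]);
        if ($ord <= 127) {
            // json_encode leaves ASCII unescaped, so format it by hand.
            return sprintf('\u%04x', $ord);
        } else {
            // json_encode emits \uXXXX for non-ASCII; strip the surrounding quotes.
            return trim(json_encode($m[0]), '"');
        }
    }, $str);
}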
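
For comparison, here is a minimal alternative sketch that is not part of the original answer. It assumes the mbstring extension is available and that the goal is one \uXXXX escape per UTF-16 code unit, which is what JavaScript string literals expect. The function name utf8ToJsEscapes is hypothetical, chosen only for this illustration.

```php
<?php
// Hypothetical helper (not from the original answer): converts a UTF-8
// string into JavaScript-style \uXXXX escapes, one per UTF-16 code unit.
function utf8ToJsEscapes($str) {
    if (!mb_check_encoding($str, 'UTF-8')) {
        return false; // same validity requirement as the function above
    }
    if ($str === '') {
        return ''; // avoid str_split() returning [""] on older PHP versions
    }
    // UTF-16BE gives exactly two bytes per code unit; characters outside
    // the Basic Multilingual Plane come out as a surrogate pair.
    $utf16 = mb_convert_encoding($str, 'UTF-16BE', 'UTF-8');
    $out = '';
    // bin2hex() doubles the length, so every 4 hex digits is one code unit.
    foreach (str_split(bin2hex($utf16), 4) as $unit) {
        $out .= '\u' . $unit;
    }
    return $out;
}

echo utf8ToJsEscapes('This'), "\n"; // prints \u0054\u0068\u0069\u0073
```

This version never calls json_encode, so ASCII and non-ASCII characters go through the same path. Whether that is preferable to the regex-based function depends on taste; both rely on mbstring for the UTF-8 check.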
