How to convert UTF-16 surrogate pairs to the equivalent hex code point in PHP?

Question

I am building an application in which chat messages are sent from an iOS app, and the admin can view the chats from an admin panel built in PHP.

From the database, I get chat messages like this:

Hi, Jax\ud83d\ude1b\ud83d\ude44! can we go for a coffee?

I am using the twemoji library, which can convert hex code points into images.

In more detail, the PHP part has the following code:

$text = "This is fun \u1f602! \u1f1e8 ";
$html = preg_replace("/\\\\u([0-9A-F]{2,5})/i", "&#x$1;", $text);
echo $html;

twemoji then parses the whole body of the HTML document and replaces the hex code points with images.

window.onload = function() {

  // Set the size of the rendered Emojis
  // This can be set to 16x16, 36x36, or 72x72
  twemoji.size = '16x16';

  // Parse the document body and
  // insert <img> tags in place of Unicode Emojis
  twemoji.parse(document.body);
}

So I need to replace every UTF-16 surrogate-pair escape in the text with the corresponding hex code point (for emojis). How can I do this?

Recommended answer

You have two problems here:

  • Detecting that a surrogate pair has been encoded
  • Actually converting that surrogate pair into an HTML entity
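
To make the two steps concrete, here is a minimal sketch of the underlying UTF-16 arithmetic. The function names `is_surrogate_pair` and `combine_surrogates` are hypothetical helpers introduced only for illustration; they are not part of PHP or twemoji. The sketch assumes PHP 7 or later for the type declarations.

```php
<?php
// Hypothetical helper: true when $high/$low form a valid UTF-16 surrogate pair.
// High surrogates are 0xD800-0xDBFF, low surrogates are 0xDC00-0xDFFF.
function is_surrogate_pair(int $high, int $low): bool
{
    return $high >= 0xD800 && $high <= 0xDBFF
        && $low >= 0xDC00 && $low <= 0xDFFF;
}

// Hypothetical helper: combine a valid pair into a single Unicode code point.
function combine_surrogates(int $high, int $low): int
{
    // Each surrogate carries 10 payload bits; the result is offset by 0x10000.
    return 0x10000 + (($high & 0x3FF) << 10) + ($low & 0x3FF);
}

// The first emoji from the sample message: \ud83d\ude1b
if (is_surrogate_pair(0xD83D, 0xDE1B)) {
    printf("U+%X\n", combine_surrogates(0xD83D, 0xDE1B)); // prints "U+1F61B"
}
```

Worked by hand for `\ud83d\ude1b`: `0xD83D & 0x3FF` is `0x03D`, shifted left by 10 bits that gives `0xF400`; `0xDE1B & 0x3FF` is `0x21B`; together they make `0xF61B`, and adding `0x10000` yields `0x1F61B`.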

Explaining the full complexity of the issue goes well beyond the scope of a single answer (you would have to read up on UTF-16 for that), but this code fragment seems to solve your problem:

$text = "Hi, Jax\\ud83d\\ude1b\\ud83d\\ude44! can we go for a coffee?";

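$result = preg_replace_callback('/\\\\u(d[89ab][0-9a-f]{2})\\\\u(d[c-f][0-9a-f]{2})/i', function ($matches) {
    $first = $matches[1];   // high surrogate (D800-DBFF) as hex digits
    $second = $matches[2];  // low surrogate (DC00-DFFF) as hex digits
    // Keep the low 10 bits of each surrogate and join them into a 20-bit value.
    $value = ((eval("return 0x$first;") & 0x3ff) << 10) | (eval("return 0x$second;") & 0x3ff);
    // Code points encoded as surrogate pairs start at 0x10000.
    $value += 0x10000;
    return "&#$value;";
}, $text);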

echo $result;

I know that using eval is almost always discouraged, but it is safe in this example because the regular expression guarantees that the matches contain only hexadecimal digits.
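
If you would rather avoid `eval` altogether, PHP's built-in `hexdec()` performs the same conversion from a hex string to an integer. The following is a sketch of that variation, not the original answer's code: the function name `surrogate_escapes_to_entities` is made up for this example, and it emits hexadecimal entities (`&#x...;`) instead of decimal ones, on the assumption that hex output is preferred since the question asks for hex code points.

```php
<?php
// Hypothetical helper (not from the original answer): replaces literal
// \uXXXX\uXXXX surrogate-pair escapes with hexadecimal HTML entities.
function surrogate_escapes_to_entities(string $text): string
{
    // High surrogate D800-DBFF followed by low surrogate DC00-DFFF.
    $pattern = '/\\\\u(d[89ab][0-9a-f]{2})\\\\u(d[c-f][0-9a-f]{2})/i';

    return preg_replace_callback($pattern, function (array $m): string {
        $high = hexdec($m[1]); // the regex guarantees hex digits only
        $low  = hexdec($m[2]);
        $codePoint = 0x10000 + (($high & 0x3FF) << 10) + ($low & 0x3FF);
        return sprintf('&#x%X;', $codePoint);
    }, $text);
}

// Single quotes keep the backslashes literal, as they would arrive from the DB.
$text = 'Hi, Jax\ud83d\ude1b\ud83d\ude44! can we go for a coffee?';
echo surrogate_escapes_to_entities($text), "\n";
// prints: Hi, Jax&#x1F61B;&#x1F644;! can we go for a coffee?
```

Text that contains no surrogate-pair escapes passes through unchanged, so the function can be applied to every message before it is written into the page.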
