How do I convert Unicode codepoints to hexadecimal HTML entities?

Views: 135 Posted: 2018/6/15 13:13:54 Tags: php html unicode


Question

I have a data file (an Apple plist, to be exact) that contains Unicode codepoints like \U00e8 and \U2019. I need to turn these into valid hexadecimal HTML entities using PHP.

What I'm doing right now is a long string of:

$fileContents = str_replace("\U00e8", "&#xe8;", $fileContents);
$fileContents = str_replace("\U2019", "&#x2019;", $fileContents);

Which is clearly dreadful. I could use a regular expression to convert the \U and any zeros that follow it to &#x, then stick on the trailing ;, but that also seems heavy-handed.

Is there a clean, simple way to take a string and replace all the Unicode codepoints with HTML entities?

Recommended answer

You can use preg_replace:

$fileContents = preg_replace('/\\\\U0*([0-9a-fA-F]{1,5})/', '&#x\1;', $fileContents);

Testing the RE (in PowerShell):

PS> 'some \U00e8 string with \U2019 embedded Unicode' -replace '\\U0*([0-9a-f]{1,5})','&#x$1;'
some &#xe8; string with &#x2019; embedded Unicode
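
One edge case worth considering: because the pattern accepts one to five hex digits, an escape immediately followed by a letter in the range a-f (for example \U00e9 followed by "d") could have that letter swallowed into the entity. The sketch below is a variation on the answer, not the answer's own code. It assumes every escape in the file is exactly four hex digits (an assumption about the plist data, not something stated in the question), and the helper name unicodeEscapesToHtmlEntities is made up for illustration.

```php
<?php
/**
 * Replace \Uxxxx escapes with hexadecimal HTML entities.
 * Assumption: each escape consists of exactly four hex digits.
 * The function name is hypothetical, chosen for this example.
 */
function unicodeEscapesToHtmlEntities(string $text): string
{
    return preg_replace_callback(
        // In a single-quoted PHP string, '\\\\U' becomes the regex \\U,
        // which matches a literal backslash followed by U.
        '/\\\\U([0-9a-fA-F]{4})/',
        static function (array $m): string {
            // The hexdec/dechex round trip drops leading zeros
            // and lowercases the digits.
            return '&#x' . dechex(hexdec($m[1])) . ';';
        },
        $text
    );
}

$input = 'some \U00e8 string with \U2019 embedded Unicode';
echo unicodeEscapesToHtmlEntities($input), PHP_EOL;
// some &#xe8; string with &#x2019; embedded Unicode
```

If the data can contain escapes with more or fewer than four digits, the fixed {4} quantifier would need to change, and the simpler preg_replace call may be the better fit.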
