Get hexcode of HTML entities
I have a string such as "&euro;".
I want to convert it to its hexadecimal escape, "\u20AC", so that I can send it to Flash.
The same goes for every currency symbol:
&pound; -> \u00A3
&dollar; -> \u0024
etc.
First, note that &dollar; is not a known entity in HTML 4.01. It is, however, a known entity in HTML 5, and in PHP 5.4 or later you can call html_entity_decode with ENT_QUOTES | ENT_HTML5 to decode it.
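To illustrate the difference between the two entity tables, here is a minimal sketch. The sample string is hypothetical, and it assumes PHP 5.4 or later so that the ENT_HTML401 and ENT_HTML5 flags are available:

```php
<?php
// Hypothetical sample input: &dollar; exists only in the HTML 5 entity table,
// while &pound; exists in both HTML 4.01 and HTML 5.
$input = "&dollar; &pound;";

// HTML 4.01 table: &dollar; is unknown and is left untouched.
$html4 = html_entity_decode($input, ENT_QUOTES | ENT_HTML401, "UTF-8");

// HTML 5 table: both entities are decoded.
$html5 = html_entity_decode($input, ENT_QUOTES | ENT_HTML5, "UTF-8");

echo $html4, "\n"; // &dollar; £
echo $html5, "\n"; // $ £
```

If the input may contain HTML 5-only entities, passing ENT_HTML5 is probably the safer choice.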
You have to decode the entity and only then convert it:
//assumes $str is in UTF-8 (or ASCII)
function foo($str) {
    //add | ENT_HTML5 (PHP 5.4+) to the flags to also decode HTML 5 entities such as &dollar;
    $dec = html_entity_decode($str, ENT_QUOTES, "UTF-8");
    //convert to UTF-16BE
    $enc = mb_convert_encoding($dec, "UTF-16BE", "UTF-8");
    $out = "";
    //each 2-byte UTF-16 code unit becomes one \uXXXX escape
    foreach (str_split($enc, 2) as $f) {
        $out .= "\\u" . sprintf("%04X", ord($f[0]) << 8 | ord($f[1]));
    }
    return $out;
}
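As a worked example of the conversion step on its own, the sketch below isolates the UTF-16BE part in a hypothetical helper named to_u_escapes (not part of the answer's code) and feeds it raw UTF-8 bytes. It assumes the mbstring extension is loaded. A character outside the Basic Multilingual Plane comes out as two escapes, because UTF-16 encodes it as a surrogate pair:

```php
<?php
// Hypothetical helper: same loop as foo(), but without the entity decoding.
function to_u_escapes($utf8) {
    // Re-encode so that every code unit is exactly 2 bytes, big-endian.
    $enc = mb_convert_encoding($utf8, "UTF-16BE", "UTF-8");
    $out = "";
    foreach (str_split($enc, 2) as $unit) {
        // Combine high and low byte into one 16-bit value, print as 4 hex digits.
        $out .= sprintf("\\u%04X", ord($unit[0]) << 8 | ord($unit[1]));
    }
    return $out;
}

$euro = to_u_escapes("\xE2\x82\xAC");      // U+20AC EURO SIGN
$clef = to_u_escapes("\xF0\x9D\x84\x9E");  // U+1D11E MUSICAL SYMBOL G CLEF

echo $euro, "\n"; // \u20AC
echo $clef, "\n"; // \uD834\uDD1E
```

Whether the receiving side accepts surrogate pairs written this way depends on the consumer, so that part is worth checking for your target.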
If you want to replace only the entities, you can use preg_replace_callback to match the entities and then use foo as a callback.
function repl_only_ent($str) {
    return preg_replace_callback('/&[^;]+;/',
        function($m) { return foo($m[0]); },
        $str);
}
echo repl_only_ent("&euro;foobar &acute;");
gives:
\u20ACfoobar \u00B4