How can I prevent HTML entities with PHP's DOMDocument::saveHTML()?
Problem description
Due to custom storage needs (the "why" is not important here, thanks!), I have to save HTML <a> links in a specific format such as this:
$myDOMNode->setAttribute("href", "{{{123456}}}");
Everything works fine until I call saveHTML() on the containing DOMDocument. That breaks it, because saveHTML() encodes { as %7B.
This is a legacy application where href="{{{123456}}}" works as a placeholder. The command-line parser looks for this exact (unencoded) pattern and cannot be changed.
I've no choice but to do it this way.
I cannot htmldecode() the result.
This HTML will never be displayed in this form; it is only stored.
Thanks for your help!
Note: I've looked around for 2 hours, but none of the proposed solutions worked for me. For those who will blindly mark the question as a duplicate: please comment and let me know.
Since the legacy code uses {{{...}}} as a placeholder, it may be safe to use a somewhat hackish approach with preg_replace_callback. The following restores the URL-encoded placeholders once the HTML is generated:
$src = <<<EOS
<html>
    <body>
        <a href="foo">Bar</a>
    </body>
</html>
EOS;

// Create DOM document
$dom = new DOMDocument();
$dom->loadHTML($src);

// Alter `href` attribute of the first anchor
// (setAttribute() is called for its side effect, so its result is not stored)
$dom->getElementsByTagName('a')
    ->item(0)
    ->setAttribute('href', '{{{123456}}}');

// Callback function to URL decode match
$urldecode = function ($matches) {
    return urldecode($matches[0]);
};

// Turn DOMDocument into HTML string, then restore/urldecode placeholders
$html = preg_replace_callback(
    '/' . urlencode('{{{') . '\d+' . urlencode('}}}') . '/',
    $urldecode,
    $dom->saveHTML()
);

echo $html, PHP_EOL;
Output (indented for clarity):
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
<html>
    <body>
        <a href="{{{123456}}}">Bar</a>
    </body>
</html>
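If the same restore step is needed in more than one place, it could be wrapped in a small helper. The following is only a sketch under a few assumptions: restorePlaceholders() is a hypothetical name that does not appear in the original answer, placeholders are assumed to contain digits only (as in {{{123456}}}), and PHP 7.4 or later is assumed for the arrow function. It works on the string returned by saveHTML(), so it does not touch the DOM itself.

```php
<?php
declare(strict_types=1);

/**
 * Restore URL-encoded {{{digits}}} placeholders in serialized HTML.
 * Hypothetical helper for illustration; not part of the original answer.
 */
function restorePlaceholders(string $html): string
{
    // Build the pattern from the encoded delimiters. preg_quote() guards
    // against regex metacharacters, and the "i" flag also accepts
    // lowercase escapes such as %7b.
    $pattern = '/' . preg_quote(rawurlencode('{{{'), '/')
        . '\d+'
        . preg_quote(rawurlencode('}}}'), '/') . '/i';

    // Decode only the matched placeholder; other escapes stay untouched.
    // Fall back to the input if the regex engine reports an error (null).
    return preg_replace_callback(
        $pattern,
        static fn (array $m): string => rawurldecode($m[0]),
        $html
    ) ?? $html;
}

$encoded = '<a href="%7B%7B%7B123456%7D%7D%7D">Bar</a> <a href="a%20b">Other</a>';
echo restorePlaceholders($encoded), PHP_EOL;
// prints: <a href="{{{123456}}}">Bar</a> <a href="a%20b">Other</a>
```

With a DOMDocument in hand, the call would be restorePlaceholders($dom->saveHTML()). Because only the exact placeholder pattern is decoded, other percent-encoded characters in URLs (such as %20 above) are left as they are.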