html_entity_decode Terminate?
Problem description
I'm using html_entity_decode($row['Content']) to display some JSON data that contains HTML in a PHP document. The problem is that some of the returned data contains unclosed HTML tags, such as <strong>, which then carry over into the content displayed after it.

Is there some way to terminate the HTML?
Solution
If you ever accept raw HTML from an outside source to embed into your site, you should always, always, reformat and whitelist it. You have no idea what that third-party HTML may contain, and you have no guarantee that it's valid; yet on your site you presumably want guaranteed valid HTML with certain limits on its content (or do you really want to enable the embedding of arbitrary <script> tags...?!).
That means you want to:
- parse the HTML and extract whatever structural information is in it
- filter that structure to allow only approved elements and then
- produce your own HTML from that which you can guarantee is syntactically valid.
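To see why parsing and regenerating the markup deals with the unclosed <strong> from the question, here is a minimal sketch of the first and last steps only, with filtering deliberately left out. It assumes PHP's bundled DOMDocument extension; the sample input and the wrapping <div> are made up for illustration.

```php
<?php
// Hypothetical input with an unclosed <strong>, wrapped in a <div>
// so there is a single known element to serialise afterwards.
$dirty = '<div>Some <strong>bold text</div>';

// Collect parser warnings about the broken markup instead of printing them.
$previous = libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($dirty);   // lenient parse: builds a complete tree
libxml_clear_errors();
libxml_use_internal_errors($previous);

// Serialising the tree regenerates the markup, so every element
// that was opened is also closed, including <strong>.
$div = $doc->getElementsByTagName('div')->item(0);
$fixed = $doc->saveHTML($div);

echo $fixed, "\n";
```

This alone only repairs the structure. It does nothing about unwanted elements or attributes, which is what the filtering step is for.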
Supposedly the best PHP library for this is HTML Purifier. Without a library, you would use a lenient HTML parser such as DOMDocument to inspect and filter the content, and then the built-in DOMDocument::saveXML to produce the new sanitised HTML.
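A rough sketch of the library-free route might look like the following. The function names (sanitize_html, render_node) and the tag lists are hypothetical choices for illustration, not part of any library. One deliberate difference from the description above: instead of calling DOMDocument::saveXML, this sketch writes the output string by hand from the parsed tree, so that only whitelisted tags, without any attributes, can ever appear in the result. Treat it as a starting point under those assumptions, not a replacement for a maintained sanitiser such as HTML Purifier.

```php
<?php
// Illustrative whitelist (an assumption, adjust to your needs).
const ALLOWED_TAGS = ['p', 'strong', 'em', 'ul', 'ol', 'li', 'br'];
// Elements that are removed together with their content.
const DROPPED_TAGS = ['script', 'style', 'iframe', 'object'];
// Whitelisted elements that never have a closing tag.
const VOID_TAGS = ['br'];

// Step 2 and 3: walk the parsed tree and emit our own markup.
function render_node(DOMNode $node): string
{
    if ($node instanceof DOMText) {
        // Text is re-escaped so it can never introduce new tags.
        return htmlspecialchars($node->nodeValue, ENT_QUOTES, 'UTF-8');
    }
    if (!($node instanceof DOMElement)) {
        return ''; // comments, processing instructions, etc.
    }

    $tag = strtolower($node->nodeName);
    if (in_array($tag, DROPPED_TAGS, true)) {
        return '';
    }

    $inner = '';
    foreach ($node->childNodes as $child) {
        $inner .= render_node($child);
    }

    if (!in_array($tag, ALLOWED_TAGS, true)) {
        return $inner; // unknown element: keep its content, drop the tag
    }
    if (in_array($tag, VOID_TAGS, true)) {
        return "<$tag>";
    }
    return "<$tag>$inner</$tag>"; // attributes are intentionally not copied
}

// Step 1: parse leniently, then hand the tree to render_node().
function sanitize_html(string $dirty): string
{
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    // The XML declaration is a common hint to make libxml read the input
    // as UTF-8; the <div> wrapper stops bare text being wrapped in <p>.
    $doc->loadHTML('<?xml encoding="UTF-8"><div>' . $dirty . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $body = $doc->getElementsByTagName('body')->item(0);
    return $body === null ? '' : render_node($body);
}
```

In the setting of the question, the call would presumably be something like echo sanitize_html(html_entity_decode($row['Content']));, so that the decoded third-party markup is never echoed directly.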