DOM parser that allows HTML5-style </ in <script> tag

Views: 19. Published: 2021/12/18 13:46:13. Tags: php, dom, html


Problem description

Update: html5lib (at the bottom of this question) seems to get close; I just need to improve my understanding of how it is used.

I am trying to find an HTML5-compatible DOM parser for PHP 5.3. In particular, I need to access the following HTML-like CDATA within a script tag:

<script type="text/x-jquery-tmpl" id="foo">
    <table><tr><td>${name}</td></tr></table>
</script>

Most parsers end the script content prematurely, because HTML 4.01 ends script parsing as soon as it finds ETAGO (</) inside a <script> tag. HTML5, however, allows </ to appear before </script>. All of the parsers I have tried so far have either failed or are so poorly documented that I cannot tell whether they work.
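To make the difference between the two rules concrete, here is a minimal sketch of where each one considers the script content to end. It is an illustration only, not a parser: `scriptEndHtml4` and `scriptEndHtml5` are hypothetical helper names, the HTML5 rule is simplified to "stop at `</script`", and comments, escapes, and other edge cases are ignored.

```php
<?php
// Illustration only: both helpers are hypothetical and deliberately simplified.

// HTML 4.01 rule: the content ends at the first "</" followed by a letter (ETAGO).
function scriptEndHtml4($content)
{
    $offset = 0;
    while (($pos = strpos($content, '</', $offset)) !== false) {
        if (isset($content[$pos + 2]) && ctype_alpha($content[$pos + 2])) {
            return $pos;
        }
        $offset = $pos + 2;
    }
    return strlen($content);
}

// HTML5 rule (simplified): the content ends only at "</script", case-insensitively.
function scriptEndHtml5($content)
{
    $pos = stripos($content, '</script');
    return $pos === false ? strlen($content) : $pos;
}

// Everything that follows the opening <script id="foo"> tag in the test input.
$afterOpenTag = '<td>bar</td></script>';

echo substr($afterOpenTag, 0, scriptEndHtml4($afterOpenTag)), "\n"; // <td>bar
echo substr($afterOpenTag, 0, scriptEndHtml5($afterOpenTag)), "\n"; // <td>bar</td>
```

The first line of output matches the truncated content seen in the failing parsers; the second is what an HTML5-compliant parser should preserve.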

My requirements:

Real parser, not regex hacks.
Ability to load full pages or HTML fragments.
Ability to pull script contents back out, selecting by the tag's id attribute.

Input:

<script id="foo"><td>bar</td></script>

Example of failing output (the closing </td> is missing):

<script id="foo"><td>bar</script>

Some parsers and their results:

Source (DOMDocument):

<?php

header('Content-type: text/plain');
$d = new DOMDocument;
$d->loadHTML('<script id="foo"><td>bar</td></script>');
echo $d->saveHTML();

Output:

Warning: DOMDocument::loadHTML(): Unexpected end tag : td in Entity, line: 1 in /home/adam/public_html/2010/10/26/dom.php on line 5
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
<html><head><script id="foo"><td>bar</script></head></html>

Source (FluentDOM):

<?php

header('Content-type: text/plain');
require_once 'FluentDOM/src/FluentDOM.php';
$html = "<html><head></head><body><script id='foo'><td></td></script></body></html>";
echo FluentDOM($html, 'text/html');

Output:

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
<html><head></head><body><script id="foo"><td></script></body></html>

Source (phpQuery):

<?php

header('Content-type: text/plain');

require_once 'phpQuery.php';

phpQuery::newDocumentHTML(<<<EOF
<script type="text/x-jquery-tmpl" id="foo">
<td>test</td>
</script>
EOF
);

echo (string)pq('#foo');

Output:

<script type="text/x-jquery-tmpl" id="foo">
<td>test
</script>

The next parser, html5lib, is possibly promising. Can I get at the contents of the script#foo tag?

Source (html5lib):

<?php

header('Content-type: text/plain');

include 'HTML5/Parser.php';

$html = "<!DOCTYPE html><html><head></head><body><script id='foo'><td></td></script></body></html>";
$d = HTML5_Parser::parse($html);

echo $d->saveHTML();

Output:

<html><head></head><body><script id="foo"><td></td></script></body></html>
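Since the value returned by the parser above responds to saveHTML(), it is presumably a DOMDocument, in which case the script contents could be pulled out with the standard DOM API. The following is a rough sketch under that assumption: `scriptContentsById` is a hypothetical helper, not part of html5lib, and the document here is built by hand as a stand-in for parser output so that the example runs without any library.

```php
<?php
// Hypothetical helper: return the raw text inside <script id="..."> or null if absent.
// It scans script elements instead of using getElementById(), which depends on
// the id attribute having been registered as an ID by the parser.
function scriptContentsById(DOMDocument $doc, $id)
{
    foreach ($doc->getElementsByTagName('script') as $script) {
        if ($script->getAttribute('id') === $id) {
            return $script->textContent;
        }
    }
    return null;
}

// Stand-in for a correctly parsed document (assumption: a real HTML5 parser
// would produce an equivalent tree, with the markup kept as plain text).
$doc = new DOMDocument();
$script = $doc->createElement('script');
$script->setAttribute('id', 'foo');
$script->appendChild($doc->createTextNode('<td>bar</td>'));
$doc->appendChild($script);

echo scriptContentsById($doc, 'foo'), "\n"; // <td>bar</td>
```

This only helps when the parser has preserved the script text in the first place; with the failing parsers above, the closing </td> is already lost by the time the tree exists.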
