How to combine PHP's DOMDocument with a JavaScript template

Question

I've got a bit of a strange question here, but it's stumped me completely. As much as anything, this is because I can't think of the correct terms to search for, so this question may well be answered on StackOverflow somewhere but I can't find it.

We have a proofing system that allows us to take a page and annotate it. We can send the page to our clients and they can make notes on it before sending it back. For the most part, this works fine. The problem comes when we try to use a JavaScript template system, similar to Handlebars. We tend to have script templates on our page that look something like this:

<script type="client/template" id="foo-div">
<div>#foo#</div>
</script>

We can use that in our scripts to generate the markup within the template, replacing #foo# with the correct data.

The problem comes when we try to put that into our proofing system. Because we need to scrape the page so we can render it on our domain, we use PHP's DOMDocument to parse the HTML so we can modify it easily (adding things like target="_blank" to external links, etc.). When we run our templates through DOMDocument, it parses them strangely (probably seeing them as invalid XML), and that causes issues on the page. To better illustrate that, here's an example in PHP:

<?php

error_reporting(E_ALL);
ini_set('display_errors', 1);

$html = '<!DOCTYPE html>'.
    '<html>'.
    '<head></head>'.
    '<body>'.
    '<script type="client/template" id="foo-div"><div>#foo#</div></script>'.
    '</body>'.
    '</html>';

$dom = new DOMDocument();

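// Collect libxml parse errors internally instead of emitting PHP warnings.
libxml_use_internal_errors(true);

// preserveWhiteSpace needs to be set before loading to have any effect.
$dom->preserveWhiteSpace = false;

// loadHTML() returns false on failure; keep the result separate from the
// markup instead of overwriting $html.
$loaded = $dom->loadHTML($html);

if ($loaded === false) {
    throw new Exception('Unable to properly parse page');
}

$dom->formatOutput = false;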

echo $dom->saveHTML();

This script produces code similar to the HTML below and doesn't seem to throw any exceptions.

<!DOCTYPE html>
<html>
<head></head>
<body><script type="client/template" id="foo-div"><div>#foo#</script></body>
</html>
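
The absence of exceptions is expected: as far as the PHP documentation describes it, loadHTML() reports malformed markup through warnings rather than exceptions, and with libxml_use_internal_errors(true) those are collected silently. A minimal sketch of how the collected errors could be inspected is below; the variable names are illustrative, and the exact messages (or whether any are reported at all) depend on the libxml2 version in use.

```php
<?php
// Same kind of template markup as in the question.
$html = '<!DOCTYPE html><html><head></head><body>'
    . '<script type="client/template" id="foo-div"><div>#foo#</div></script>'
    . '</body></html>';

// Collect parser problems instead of emitting PHP warnings.
libxml_use_internal_errors(true);

$dom = new DOMDocument();
$loaded = $dom->loadHTML($html);

// Each entry is a LibXMLError object with level, code, line, column and message.
$errors = libxml_get_errors();
libxml_clear_errors();

foreach ($errors as $error) {
    printf("line %d: %s\n", $error->line, trim($error->message));
}
```

Logging these entries makes it easier to spot which part of a scraped page the HTML parser objected to, even when the final output looks plausible.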

My question is: does anybody know of a way to get PHP's DOMDocument to leave the templating script tag alone? Is there a setting or plugin that makes DOMDocument treat the contents of a script tag with a type attribute as plain text, much like browsers do?

Edit

I ended up going with Alf Eaton's solution of parsing the string as XML. However, not all of the HTML tags were self-closed, and that caused issues. I'm posting the complete solution here in case anyone comes across the same issue:

/**
 * Inserts a new string into an old string at the specified position.
 * 
 * @param string $old_string Old string to modify.
 * @param string $new_string New string to insert.
 * @param int $position Position at which the new string should be inserted.
 * @return string Old string with new string inserted.
 * @see http://stackoverflow.com/questions/8251426/insert-string-at-specified-position
 */
function str_insert($old_string, $new_string, $position) {

    return substr($old_string, 0, $position) . $new_string .
        substr($old_string, $position);

}

/**
 * Inspects a string of HTML and closes any tags that need self-closing in order
 * to make the HTML valid XML.
 * 
 * @param string $html Raw HTML (potentially invalid XML)
 * @return string Original HTML with self-closing slashes added.
 */
function self_close($html) {

    $fixed = $html;
    $tags  = array('area', 'base', 'basefont', 'br', 'col', 'frame',
        'hr', 'img', 'input', 'link', 'meta', 'param');

    foreach ($tags as $tag) {

        $offset = 0;

        while (($offset = strpos($fixed, '<' . $tag, $offset)) !== false) {

            // Character right after the tag name: skip longer names that
            // merely start with $tag (e.g. "<colgroup" when looking for "<col").
            $next = substr($fixed, $offset + strlen($tag) + 1, 1);

            if (!ctype_alnum($next) &&
                    ($close = strpos($fixed, '>', $offset)) !== false &&
                    $fixed[$close - 1] !== '/') {
                $fixed = str_insert($fixed, '/', $close);
            }

            $offset += 1; // Prevent infinite loops

        }

    }

    return $fixed;

}

// When parsing the original string:
$html = $dom->loadXML(self_close($html));


Recommended answer

If the input document is valid XML, parsing it as XML rather than HTML will preserve the contents of the <script> tags:

<?php

$html = <<<END
<!DOCTYPE html>
<html><body>
<script type="client/template" id="foo-div"><div>#foo#</div></script>
</body></html>
END;

$doc = new DOMDocument();
$doc->preserveWhiteSpace = true; // needs to be before loading, to have any effect
$doc->loadXML($html);
$doc->formatOutput = false;
print $doc->saveHTML();

// <!DOCTYPE html>
// <html><body>
// <script type="client/template" id="foo-div"><div>#foo#</div></script>
// </body></html>
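
To tie this back to the proofing use case, here is a rough sketch of modifying a document loaded this way, adding target="_blank" to external links while the template is left intact. The host name in $ownHost and the rule for what counts as "external" are assumptions made for illustration, not part of the original answer.

```php
<?php
// Hypothetical host the proofing system runs on; links to any other host
// are treated as external.
$ownHost = 'proofs.example.test';

$xhtml = <<<END
<html><body>
<a href="http://example.com/">External</a>
<a href="/about">Internal</a>
<script type="client/template" id="foo-div"><div>#foo#</div></script>
</body></html>
END;

$doc = new DOMDocument();
$doc->preserveWhiteSpace = true; // set before loading
if ($doc->loadXML($xhtml) === false) {
    throw new Exception('Input is not well-formed XML');
}

foreach ($doc->getElementsByTagName('a') as $link) {
    $host = parse_url($link->getAttribute('href'), PHP_URL_HOST);
    // Relative URLs have no host component and are treated as internal.
    if (is_string($host) && strcasecmp($host, $ownHost) !== 0) {
        $link->setAttribute('target', '_blank');
    }
}

$out = $doc->saveHTML();
print $out;
```

Because the document was parsed as XML, the div inside the script element is an ordinary child node here, so any code that walks the whole tree should take care not to modify template contents by accident.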
