Close open HTML tags in a string

Question

I have a string that looks something like this:

<p>This is some text and here is a <strong>bold text then the post stop here....</p>

Because the function returns a teaser (summary) of the text, it cuts the text off after a certain number of words. In this case the strong tag is left unclosed, even though the whole string is wrapped in a paragraph.

Is it possible to convert the above output into the following?

<p>This is some text and here is a <strong>bold text then the post stop here....</strong></p>

I don't know where to begin. I found a function on the web that does this with a regex, but it puts the closing tag at the end of the string, so the result won't validate: I want all opening and closing tags to sit inside the paragraph tags. The function I found produces this, which is also wrong:

<p>This is some text and here is a <strong>bold text then the post stop here....</p></strong>

Note that the unclosed tag could be strong, italic, anything. That's why I can't simply append a fixed closing tag by hand inside the function. Is there a pattern that can do this for me?

Recommended answer

Here is a function I've used before, which works pretty well:

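function closetags($html) {
    // Collect the names of opening tags, skipping void elements and self-closing tags.
    preg_match_all('#<(?!(?:meta|img|br|hr|input)\b)\b([a-z][a-z0-9]*)(?: .*)?(?<![/ ])>#iU', $html, $result);
    $openedtags = $result[1];
    // Collect the names of closing tags.
    preg_match_all('#</([a-z][a-z0-9]*)>#iU', $html, $result);
    $closedtags = $result[1];
    $len_opened = count($openedtags);
    // Same number of opening and closing tags: assume nothing needs fixing.
    if (count($closedtags) == $len_opened) {
        return $html;
    }
    // Walk the opening tags from last to first so inner tags are closed before outer ones.
    $openedtags = array_reverse($openedtags);
    for ($i = 0; $i < $len_opened; $i++) {
        if (!in_array($openedtags[$i], $closedtags)) {
            // No closing tag left for this element: append one to the end of the string.
            $html .= '</'.$openedtags[$i].'>';
        } else {
            // Consume one matching closing tag so it is not counted twice.
            unset($closedtags[array_search($openedtags[$i], $closedtags)]);
        }
    }
    return $html;
}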
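
One caveat: closetags() always appends the missing closing tags to the very end of the string, so for the example in the question it still produces <p>...</p></strong>. If the closing tag has to land inside the paragraph, a small tag stack can help. The following is only a rough sketch, not a real HTML parser: the function name close_tags_nested and the list of void elements are my own assumptions, and it will trip over comments, scripts, or attribute values that contain ">".

```php
<?php
// Hypothetical helper (not from the original answer): closes unclosed tags
// at the point where the nesting breaks, instead of at the end of the string.
function close_tags_nested(string $html): string
{
    // Elements that never take a closing tag (assumed list, extend as needed).
    $void = ['area', 'base', 'br', 'col', 'embed', 'hr', 'img',
             'input', 'link', 'meta', 'source', 'track', 'wbr'];
    $stack = [];   // names of currently open elements, innermost last
    $out = '';

    // Split into tags and text chunks, keeping the tags themselves.
    $parts = preg_split('#(<[^>]*>)#', $html, -1,
        PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);

    foreach ($parts as $part) {
        if (preg_match('#^</([a-z][a-z0-9]*)\s*>$#i', $part, $m)) {
            $name = strtolower($m[1]);
            if (!in_array($name, $stack, true)) {
                continue; // stray closing tag with no matching opener: drop it
            }
            // Close everything that was opened after the matching element.
            while (($top = array_pop($stack)) !== $name) {
                $out .= "</$top>";
            }
            $out .= "</$name>";
        } elseif (preg_match('#^<([a-z][a-z0-9]*)\b[^>]*>$#i', $part, $m)) {
            $name = strtolower($m[1]);
            $selfClosing = substr($part, -2) === '/>';
            if (!$selfClosing && !in_array($name, $void, true)) {
                $stack[] = $name;
            }
            $out .= $part;
        } else {
            $out .= $part; // plain text (or markup this sketch does not understand)
        }
    }

    // Close whatever is still open when the string ends.
    while ($stack) {
        $out .= '</' . array_pop($stack) . '>';
    }
    return $out;
}

echo close_tags_nested('<p>This is some text and here is a <strong>bold text then the post stop here....</p>');
// prints: <p>This is some text and here is a <strong>bold text then the post stop here....</strong></p>
```

When a closing tag arrives, everything opened after its matching element gets closed first, which is what puts </strong> in front of </p>.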

Personally, though, I would not do this with a regexp but with a library such as Tidy. It would look something like the following:

$str = '<p>This is some text and here is a <strong>bold text then the post stop here....</p>';
$tidy = new Tidy();
$clean = $tidy->repairString($str, array(
    'output-xml' => true,
    'input-xml' => true
));
echo $clean;
