Regex to match all HTML tags except <p> and </p>

Views: 47 Published: 2021/12/10 18:11:42 Tags: html regex perl

Problem description

I need to match and remove all tags except <p> and </p> using a regular expression in Perl. I have the following:

<\??(?!p).+?>

But this still matches the closing </p> tag. Any hints on how to exclude the closing tag as well?

Note: this is being performed on XHTML.
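
A quick way to see the problem is to run the attempted pattern against a few individual tags. This is only a sketch: it assumes the pattern is exactly as printed in the question, and the sample tags are made up for illustration. The lookahead `(?!p)` is only checked directly after `<`, so `</p>` slips through, while `<pre>` is skipped even though it should be removed.

```perl
use strict;
use warnings;

# The pattern from the question, taken as printed there.
my $attempt = qr{<\??(?!p).+?>};

# Hypothetical sample tags, not from the original post.
my %result;
for my $tag ('<p>', '</p>', '<div>', '<pre>') {
    $result{$tag} = $tag =~ $attempt ? 'matched' : 'not matched';
    printf "%-6s %s\n", $tag, $result{$tag};
}
# <p>    not matched
# </p>   matched      (the closing tag is not excluded)
# <div>  matched
# <pre>  not matched  (any tag starting with "p" is skipped)
```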

Recommended answer

I came up with this:

<(?!\/?p(?=>|\s.*>))\/?.*?>

x/
<           # Match open angle bracket
(?!         # Negative lookahead (not matching and not consuming)
    \/?     # 0 or 1 /
    p       # p
    (?=     # Positive lookahead (matching and not consuming)
        >   # > - no attributes
        |   # or
        \s  # whitespace
        .*  # anything up to
        >   # close angle bracket - with attributes
    )       # close positive lookahead
)           # close negative lookahead
            # if we have got this far then this is not
            # a p tag or closing p tag,
            # with or without attributes
\/?         # optional close tag symbol (/)
.*?         # and anything up to
>           # the first closing angle bracket
/

This handles <p> tags, with or without attributes, and closing </p> tags (both are left unmatched), while still matching <pre> and similar tags, with or without attributes.
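
As a rough illustration, the pattern can be used in a substitution to strip every tag except <p> and </p>. The sketch below assumes `\s` for whitespace inside the lookahead and uses `qr{}` delimiters so the slashes need no escaping. The input string is a made-up example. Since `.` does not match a newline by default, a tag that spans several lines would not be handled by this version.

```perl
use strict;
use warnings;

# Hypothetical sample input, not from the original post.
my $html = '<div><p class="x">Hi <b>there</b></p><pre>code</pre></div>';

# Any tag that is not <p ...> or </p>.
my $not_p_tag = qr{<(?!/?p(?=>|\s.*>))/?.*?>};

# Copy the input, then delete every matching tag from the copy.
(my $stripped = $html) =~ s/$not_p_tag//g;

print "$stripped\n";   # <p class="x">Hi there</p>code
```

The text inside the removed tags is kept. Only the tags themselves are deleted.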

It doesn't strip attributes from the <p> tags that remain, but my source data doesn't contain any. I may change this later, but it will suffice for now.
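
If attributes did need to go, one possible follow-up step (not part of the original answer) is a second substitution that rewrites each opening <p ...> tag as a bare <p>. This is a minimal sketch on a made-up input. It assumes no attribute value contains a literal `>` character.

```perl
use strict;
use warnings;

# Hypothetical input; the attribute names are made up for illustration.
my $html = '<p class="intro" id="a">Hello</p><pre>x</pre>';

# Replace "<p", whitespace, and everything up to the next ">" with a bare <p>.
# Requiring \s after the "p" keeps <pre> and similar tags from being touched.
(my $clean = $html) =~ s{<p\s[^>]*>}{<p>}g;

print "$clean\n";   # <p>Hello</p><pre>x</pre>
```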
