Regex - Matching exactly one single tag

Views: 49 Published: 2021/5/14 19:43:35 Tags: html regex


Question

I have a regex to extract the text from an HTML font tag:

<FONT FACE=\"Excelsior LT Std Bold\"(.*)>(.*)</FONT>

That works fine until I have some nested font tags. Instead of matching

<FONT FACE="Excelsior LT Std Bold">Fett</FONT>

the result for the string

<FONT FACE="Excelsior LT Std Bold">Fett</FONT> + <U>Unterstrichen</U> + <FONT FACE="Excelsior LT Std Italic">Kursiv</FONT> und Normal

is

<FONT FACE="Excelsior LT Std Bold">Fett</FONT> + <U>Unterstrichen</U> + <FONT FACE="Excelsior LT Std Italic"

How do I get only the first tag?

Recommended answer

You need to disable greedy matching by using .*? instead of .*.

<FONT FACE=\"Excelsior LT Std Bold\"([^>]*)>(.*?)</FONT>
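To see the difference between the two patterns, here is a minimal sketch in Python. The original post does not say which regex engine is in use, so Python's re module is an assumption, and the variable names are illustrative only.

```python
import re

# The sample string from the question.
text = (
    '<FONT FACE="Excelsior LT Std Bold">Fett</FONT> + <U>Unterstrichen</U> + '
    '<FONT FACE="Excelsior LT Std Italic">Kursiv</FONT> und Normal'
)

# Original pattern: both groups are greedy and run as far as they can.
greedy = re.compile(r'<FONT FACE="Excelsior LT Std Bold"(.*)>(.*)</FONT>')

# Suggested pattern: [^>]* cannot leave the opening tag,
# and .*? stops at the first closing </FONT>.
lazy = re.compile(r'<FONT FACE="Excelsior LT Std Bold"([^>]*)>(.*?)</FONT>')

greedy_match = greedy.search(text)
lazy_match = lazy.search(text)

print(greedy_match.group(1))  # swallows everything up to the second opening tag
print(greedy_match.group(2))  # Kursiv
print(lazy_match.group(0))    # <FONT FACE="Excelsior LT Std Bold">Fett</FONT>
print(lazy_match.group(2))    # Fett
```

With the greedy pattern the first group runs past the first closing tag, which is why the text of the second font tag ends up in the second group.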

Note that this will fail if the <FONT> tag has an attribute like BadAttribute="<FooBar>" somewhere after the FACE attribute: the > inside the attribute value ends the first group early, so the two capture groups get mixed up. It gets completely messed up if an attribute contains </FONT>. There is no way around this, because regular expressions cannot count matching tags or quotes. So I absolutely agree with Tomalak: try to avoid using regular expressions for processing XML, HTML, and other markup languages like these.
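As one possible alternative to a regex, the text can be pulled out with a real HTML parser. The sketch below assumes Python and its standard html.parser module, neither of which is named in the original answer. The FontTextExtractor class and the extract_font_text helper are hypothetical names made up for this illustration.

```python
from html.parser import HTMLParser


class FontTextExtractor(HTMLParser):
    """Collects the text inside <font> tags whose face attribute matches."""

    def __init__(self, face):
        super().__init__()
        self.face = face
        self.depth = 0      # nesting depth inside a matching <font> tag
        self.current = []   # text pieces of the tag being read
        self.results = []   # one entry per matching tag

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "font":
            return
        if self.depth > 0:
            self.depth += 1  # a font tag nested inside a matching one
        elif dict(attrs).get("face") == self.face:
            self.depth = 1
            self.current = []

    def handle_endtag(self, tag):
        if tag == "font" and self.depth > 0:
            self.depth -= 1
            if self.depth == 0:
                self.results.append("".join(self.current))

    def handle_data(self, data):
        if self.depth > 0:
            self.current.append(data)


def extract_font_text(html, face):
    parser = FontTextExtractor(face)
    parser.feed(html)
    parser.close()
    return parser.results


text = (
    '<FONT FACE="Excelsior LT Std Bold">Fett</FONT> + <U>Unterstrichen</U> + '
    '<FONT FACE="Excelsior LT Std Italic">Kursiv</FONT> und Normal'
)
tricky = '<FONT FACE="Excelsior LT Std Bold" BadAttribute="<FooBar>">Fett</FONT>'

print(extract_font_text(text, "Excelsior LT Std Bold"))
print(extract_font_text(tricky, "Excelsior LT Std Bold"))
```

Because the parser tracks quotes and tag nesting itself, the BadAttribute case that breaks the regex is read as an ordinary attribute value here.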
