Print start of html tags

Views: 67 Published: 2021/7/6 20:58:04 regex


Problem description

I want to print the opening HTML tags that have attributes.

    <h1>test</h1>
    <h2>test2</h2>
    <div id="content"></div>
    <p>test3</p>
    <div class="test"></div>
    <div id="nav"></div>
    <p>test3</p>

For instance, given the above HTML, I want to print:

    <div id="content">
    <div id="nav">

I tried this, but I get the result below instead:

    ="content">
    ="nav">

    import re

    file = open('test.html')
    test = file.read()
    lines = test.splitlines()
    b = re.findall(r'<?=.*?>', test)
    for a in b:
        print(a)

How do I adjust my code to get the right output?

Solution

You should use a non-greedy match for any number of characters to the left of the =, so:

    r'<.*?=.*?>'

That will match a <, followed by the minimum number of characters, followed by an =, followed by the minimum number of characters up to the >.
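
As a rough check of the pattern above, here is a minimal sketch that runs it on the sample HTML from the question, embedded as a string rather than read from test.html. The `strict_tags` variant is not part of the answer. It is a hypothetical alternative that excludes `<` and `>` inside the tag, assuming the HTML might arrive on a single line.

```python
import re

# Sample HTML from the question, one element per line.
html = """<h1>test</h1>
<h2>test2</h2>
<div id="content"></div>
<p>test3</p>
<div class="test"></div>
<div id="nav"></div>
<p>test3</p>
"""

# Pattern from the answer: "<", lazily up to "=", then lazily up to ">".
# "." does not match a newline, so each match stays within one line.
answer_tags = re.findall(r'<.*?=.*?>', html)

# Hypothetical stricter variant (an assumption, not from the answer):
# forbid "<" and ">" inside the tag, so a match cannot begin at an
# earlier tag when all of the HTML sits on a single line.
single_line = html.replace("\n", "")
strict_tags = re.findall(r'<[^<>]*=[^<>]*>', single_line)

for tag in answer_tags:
    print(tag)
```

The answer's pattern relies on each element being on its own line here. If everything were on one line, the lazy `.*?` could still start at the first `<` and run through earlier tags until it reaches an `=`, which is the case the stricter character class is meant to cover.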

What you had:

    r'<?=.*?>'

means an optional <, followed by an =, followed by the shortest string up to the >. Since the < is optional and could only match directly before the =, it is never part of your matches.
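
To make that behaviour concrete, the following small sketch applies the original pattern to one line of the sample HTML, and then to a made-up string (`'<=x>'`, purely illustrative) in which the `<` does sit directly before the `=`.

```python
import re

line = '<div id="content"></div>'

# "<?" can only consume a "<" that is immediately followed by "=".
# Here the character before "=" is "d", so "<?" matches nothing and
# the match starts at "=".
original = re.findall(r'<?=.*?>', line)
print(original)  # ['="content">']

# Illustrative input only: when "<" is directly before "=", it is included.
adjacent = re.findall(r'<?=.*?>', '<=x>')
print(adjacent)  # ['<=x>']
```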
