Printing only outer tags in HTML code using BeautifulSoup
Problem description
Part of the HTML code looks as follows:
<td class="col2">
<a class="reserve" data-target="#myModal" data-toggle="modal"
href="example.com" rel="nofollow"></a></td>
I found it using:
soup.find_all('td', class_='col2')
However, I would like to extract only the outer tag, not the whole fragment:
<td class="col2"></td>
Is it possible using BeautifulSoup? I know I can do it using strings but I'm just curious.
Solution
You could set the tag's .string attribute to an empty string (''):
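from bs4 import BeautifulSoup

html = """
<td class="col2">
<a class="reserve" data-target="#myModal" data-toggle="modal"
href="example.com" rel="nofollow"></a></td>
"""
# Name the parser explicitly to avoid the "no parser specified" warning.
soup = BeautifulSoup(html, 'html.parser')
x = soup.find('td', class_='col2')
# Replace everything inside the <td> with an empty string.
x.string = ''
print(x)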
Output
<td class="col2"></td>
Here is what the documentation says about it:
If you set a tag's .string attribute, the tag's contents are replaced with the string you give.
Be careful: if the tag contained other tags, they and all their contents will be destroyed.
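Because setting .string destroys the child tags in the parsed tree, it may be preferable to leave the original tag untouched when the inner content is still needed later. The following is a minimal sketch of one alternative, not part of the original answer: it builds a new, empty tag with the same name and attributes using soup.new_tag. The helper name outer_tag_only is hypothetical and introduced here only for illustration.

```python
from bs4 import BeautifulSoup

html = """
<td class="col2">
<a class="reserve" data-target="#myModal" data-toggle="modal"
href="example.com" rel="nofollow"></a></td>
"""


def outer_tag_only(soup, tag):
    """Return a new, empty tag with the same name and attributes as `tag`.

    Hypothetical helper for illustration; the original `tag` is not modified.
    """
    # Copy the attribute dict so later edits to the new tag's attributes
    # do not add or remove keys on the original tag.
    return soup.new_tag(tag.name, attrs=dict(tag.attrs))


soup = BeautifulSoup(html, "html.parser")
td = soup.find("td", class_="col2")

outer = outer_tag_only(soup, td)
print(outer)

# The original <td> still contains its <a> child.
print(td.find("a") is not None)
```

If the inner content is not needed afterwards, calling x.clear() on the tag is another way to empty it in place; like setting .string, it removes the children from the tree.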