Beautiful Soup: find tags based on a partial attribute value
Question
I am trying to identify tags in an HTML document based on part of an attribute value.
For example, if I have a BeautifulSoup object:
import requests
from bs4 import BeautifulSoup

r = requests.get("http:/My_Page")
soup = BeautifulSoup(r.text, "html.parser")
I want tr tags with an id attribute whose values are formatted like this: "news_4343_23255_xxx". I'm interested in any tr tag as long as it has "news" as the first 4 characters of the id attribute value.
I know I can search as follows:
trs = soup.find_all("tr", attrs={"id": True})
which gives me all tr tags with an id attribute.
How do I search based on a substring?
Recommended answer
Use a regex to get the tr tags whose id starts with "news".
For example:
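from bs4 import BeautifulSoup
import re

# html is assumed to hold the page source, e.g. r.text from the question
soup = BeautifulSoup(html, "html.parser")
# the compiled pattern is tested against each tr's id; ^ anchors it to the start
for i in soup.find_all("tr", {'id': re.compile(r'^news')}):
    print(i)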
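To try the idea without fetching a page, here is a minimal self-contained sketch. The sample markup and the variable names (by_regex, by_css, by_func) are made up for illustration and are not from the original question. It also shows two alternatives that should give the same result: a CSS attribute-prefix selector through select(), which assumes a Beautiful Soup version that ships with the soupsieve dependency (4.7 or later), and a plain function as the filter.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for the real page source (r.text).
html = """
<table>
  <tr id="news_4343_23255_xxx"><td>a</td></tr>
  <tr id="news_1_2_yyy"><td>b</td></tr>
  <tr id="other_1"><td>c</td></tr>
  <tr><td>d</td></tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")

# 1. Regex filter: the pattern is searched in the id value, ^ pins it to the start.
by_regex = soup.find_all("tr", id=re.compile(r"^news"))

# 2. CSS attribute-prefix selector (assumes soupsieve is installed with bs4).
by_css = soup.select('tr[id^="news"]')

# 3. Function filter: receives the id value, or None when the tag has no id.
by_func = soup.find_all("tr", id=lambda v: v is not None and v.startswith("news"))

ids = [tr["id"] for tr in by_regex]
print(ids)  # ['news_4343_23255_xxx', 'news_1_2_yyy']
```

Beautiful Soup applies a compiled pattern with search(), so dropping the ^ would match "news" anywhere in the id rather than only at the beginning, which is the form to use for a true substring search.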