How to convert a string into a BeautifulSoup object?
Question
I'm trying to crawl a news website and I need to change one parameter in the HTML. I changed it using replace, with the following code:
while i < len(links):
    conn = urllib.urlopen(links[i])
    html = conn.read()
    soup = BeautifulSoup(html)
    t = html.replace('class="row bigbox container mi-df-local locked-single"', 'class="row bigbox container mi-df-local single-local"')
    # t is a plain str here, so this is not BeautifulSoup's find()
    n = str(t.find("div", attrs={'class':'entry cuerpo-noticias'}))
    print(p)
The problem is that "t" is a string, and find with attributes only works on objects of type <class 'BeautifulSoup.BeautifulSoup'>. Do you know how I can convert "t" to that type?
Recommended answer
Just do the replacement before parsing:
html = html.replace('class="row bigbox container mi-df-local locked-single"', 'class="row bigbox container mi-df-local single-local"')
soup = BeautifulSoup(html, "html.parser")
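To make the order of operations concrete, here is a minimal self-contained sketch of the same idea. The sample markup and the shortened class names are made up for illustration and are not taken from the site in the question; it also assumes the bs4 package is installed.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the page source downloaded in the question.
html = '<div class="row locked-single"><p>story</p></div>'

# str.replace returns another str, so the replacement happens on raw text...
replaced = html.replace('class="row locked-single"', 'class="row single-local"')

# ...and only afterwards is the string parsed into a BeautifulSoup object.
soup = BeautifulSoup(replaced, "html.parser")

# find() with attributes now works, because soup is not a plain string.
box = soup.find("div", attrs={"class": "single-local"})
text = box.get_text()
print(type(soup).__name__, text)
```

The key point is simply that any string can be handed to the BeautifulSoup constructor, so the "conversion" asked about is just parsing the modified string.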
Note that it would also be possible (I would even say preferable) to parse the HTML, locate the element(s), and modify the attributes of a Tag instance, e.g.:
soup = BeautifulSoup(html, "html.parser")
for elm in soup.select(".row.bigbox.container.mi-df-local.locked-single"):
    elm["class"] = ["row", "bigbox", "container", "mi-df-local", "single-local"]
Note that class is a special multi-valued attribute. That's why we are setting the value to a list of individual classes.
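Because the class value is a list, a single class can also be swapped without retyping the others. The following is a small sketch of that variation, assuming the html.parser backend and a one-element sample document invented for illustration:

```python
from bs4 import BeautifulSoup

html = '<div class="row bigbox container mi-df-local locked-single">test</div>'
soup = BeautifulSoup(html, "html.parser")

div = soup.find("div")

# Reading the attribute gives a list of class names, not one string.
classes = div["class"]

# Swap only the class we care about and keep the rest in order.
div["class"] = ["single-local" if c == "locked-single" else c for c in classes]

new_classes = div["class"]
print(new_classes)
```

This avoids hard-coding the full class list, which can help if the other classes differ from page to page.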
Demo:
from bs4 import BeautifulSoup
html = """
<div class="row bigbox container mi-df-local locked-single">test</div>
"""
soup = BeautifulSoup(html, "html.parser")
for elm in soup.select(".row.bigbox.container.mi-df-local.locked-single"):
    elm["class"] = ["row", "bigbox", "container", "mi-df-local", "single-local"]
print(soup.prettify())
Now see how the div element's classes were updated:
<div class="row bigbox container mi-df-local single-local">
 test
</div>