How to add outer tag to BeautifulSoup object
Problem description
I am trying to replace the content of an iframe with a BeautifulSoup object. Let's say this
s = """
<!DOCTYPE html>
<html>
<body>
<iframe src="http://www.w3schools.com">
<p>Your browser does not support iframes.</p>
</iframe>
</body>
</html>
"""
is the original HTML, parsed with
dom = BeautifulSoup(s, 'html.parser')
and I get the iframe with
f = dom.find('iframe')
Now I want to replace only the content of the iframe with another BeautifulSoup object, e.g. the object newBO. If I do
f.replace_with(newBO)
it works, but I lose the hierarchy of the original file because the iframe tag is gone. If instead of a BeautifulSoup object I had just a string, I could do
f.string = 'just a string'
and that would replace the content, but if I do
f.string = newBO
I get
TypeError: 'NoneType' object is not callable
So I am trying to use replace_with, but add an iframe tag to newBO. How can I do that? Can you suggest some other way?
Solution
Extract the content, then insert:
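from bs4 import BeautifulSoup
dom = BeautifulSoup(s, 'html.parser')
f = dom.find('iframe')
# detach every tag nested inside the iframe
for ele in f.find_all():
    ele.extract()
# name the parser explicitly so the result does not depend on which parsers are installed
new = BeautifulSoup("<div>foo</div>", "html.parser").find("div")
f.insert(0, new)
print(dom)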
Which would give you:
<!DOCTYPE html>
<html>
<body>
<iframe src="http://www.w3schools.com"><div>foo</div>
</iframe>
</body>
</html>
To also remove any leftover text, set f.string = "" before inserting:
f = dom.find('iframe')
for ele in f.find_all():
    print(type(ele))
    ele.extract()
# drop whatever text is still inside the iframe
f.string = ""
new = BeautifulSoup("<div>foo</div>", "html.parser").find("div")
f.insert(0, new)
print(dom)
Which would then give you:
<!DOCTYPE html>
<html>
<body>
<iframe src="http://www.w3schools.com"><div>foo</div></iframe>
</body>
</html>
In this case you could also use
f.append(new)
as it is going to be the only element.
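If newBO holds more than one top-level node, the same idea can be wrapped in a small helper. The sketch below is one possible way to do it, not part of the original answer: the helper name replace_children and the sample markup in newBO are made up for illustration. It relies on Tag.clear() to empty the iframe and on append() moving each node out of newBO.

```python
from bs4 import BeautifulSoup

s = """<!DOCTYPE html>
<html>
<body>
<iframe src="http://www.w3schools.com">
<p>Your browser does not support iframes.</p>
</iframe>
</body>
</html>
"""


def replace_children(tag, new_soup):
    """Hypothetical helper: empty `tag`, then move the top-level nodes of `new_soup` into it."""
    tag.clear()  # removes child tags and text nodes alike
    # Copy the list first, because append() detaches each node from new_soup.
    for child in list(new_soup.contents):
        tag.append(child)
    return tag


dom = BeautifulSoup(s, "html.parser")
newBO = BeautifulSoup("<div>foo</div><p>bar</p>", "html.parser")  # sample content (assumption)

f = dom.find("iframe")
replace_children(f, newBO)
print(f)
```

The iframe tag and its src attribute stay in place in dom, so the hierarchy of the original document is kept. Note that newBO is left empty afterwards, since its nodes are moved rather than copied.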