How to add outer tag to BeautifulSoup object


Question


I am trying to replace the content of an iframe with a BeautifulSoup object. Let's say this

 s="""
 <!DOCTYPE html>
 <html>
 <body>

 <iframe src="http://www.w3schools.com">         
   <p>Your browser does not support iframes.</p>
 </iframe>

 </body>
 </html>
 """

is the original HTML, parsed with

dom = BeautifulSoup(s, 'html.parser')

and I get the iframe with f = dom.find('iframe')

Now I want to replace only the content of the iframe with another BeautifulSoup object, e.g. the object newBO. If I do f.replace_with(newBO), it works, but I lose the hierarchy of the original file because the iframe tag is gone. If I had just a string instead of a BeautifulSoup object, I could do f.string = 'just a string' and that would replace the content, but if I do f.string = newBO

I get

TypeError: 'NoneType' object is not callable

So I am trying to use replace_with but wrap newBO in an iframe tag first. How can I do that? Can you suggest some other way?

Solution

Extract the iframe's child elements, then insert the new content:

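from bs4 import BeautifulSoup

dom = BeautifulSoup(s, 'html.parser')

f = dom.find('iframe')
# Detach every child tag of the iframe (find_all() returns tags only, not text).
for ele in f.find_all():
    ele.extract()
# Parse the replacement markup and pull the <div> out of the temporary soup.
new = BeautifulSoup("<div>foo</div>", "html.parser").find("div")
f.insert(0, new)
print(dom)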

Which would give you:

 <!DOCTYPE html>

<html>
<body>
<iframe src="http://www.w3schools.com"><div>foo</div>

</iframe>
</body>
</html>

To also remove any leftover text inside the iframe, set f.string = "":

f = dom.find('iframe')

for ele in f.find_all():
    ele.extract()
# Assigning an empty string drops whatever text is still inside the iframe.
f.string = ""
new = BeautifulSoup("<div>foo</div>", "html.parser").find("div")
f.insert(0, new)
print(dom)

Which would then give you:

<!DOCTYPE html>

<html>
<body>
<iframe src="http://www.w3schools.com"><div>foo</div></iframe>
</body>
</html>
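
The extract loop followed by f.string = "" can probably be collapsed into one call to Tag.clear(), which removes every child of a tag, tags and text alike. The following is a minimal sketch of that variant, not part of the original answer; the variable name html and the shortened markup are assumptions made for illustration.

```python
from bs4 import BeautifulSoup

# Shortened version of the document from the question (illustrative).
html = """<!DOCTYPE html>
<html>
<body>
<iframe src="http://www.w3schools.com">
  <p>Your browser does not support iframes.</p>
</iframe>
</body>
</html>"""

dom = BeautifulSoup(html, "html.parser")
f = dom.find("iframe")

# clear() empties the tag but keeps the tag itself and its attributes.
f.clear()

# Parse the replacement and move its <div> into the now-empty iframe.
new = BeautifulSoup("<div>foo</div>", "html.parser").find("div")
f.append(new)

print(f)  # <iframe src="http://www.w3schools.com"><div>foo</div></iframe>
```

Because the iframe element itself is never removed from the tree, the surrounding html and body hierarchy is left untouched.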

In this case you could also use f.append(new) as it is going to be the only element.
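
For the literal question, adding an outer iframe tag to the new content and then using replace_with, one possible approach is to build the wrapper with new_tag. This is only a sketch under assumptions: the names html and wrapper are hypothetical, and only the src attribute is copied over, so any other attributes on the original iframe would need to be carried across as well.

```python
from bs4 import BeautifulSoup

# Shortened version of the document from the question (illustrative).
html = """<html>
<body>
<iframe src="http://www.w3schools.com">
  <p>Your browser does not support iframes.</p>
</iframe>
</body>
</html>"""

dom = BeautifulSoup(html, "html.parser")
f = dom.find("iframe")

# The new content, standing in for the newBO object from the question.
new = BeautifulSoup("<div>foo</div>", "html.parser").find("div")

# Build a fresh <iframe> that reuses the original src (assumption: src is
# the only attribute worth keeping), and put the new content inside it.
wrapper = dom.new_tag("iframe", src=f["src"])
wrapper.append(new)

# Swap the old iframe for the wrapped content; the parent <body> is unchanged.
f.replace_with(wrapper)

print(dom.find("iframe"))
```

Compared with emptying the existing tag in place, this costs an extra tag construction, so it is mainly useful when the replacement has to be assembled separately before it is attached to the document.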
