How to scrape text and images together?

Views: 124 Published: 2020/9/20 7:54:59 Tags: python-2.7 web-scraping beautifulsoup


Question

I'm working on a web page scraper with BeautifulSoup4. I want to get the text and images of an article, but I have run into a problem. The HTML looks something like this:

<div>
 some texts1
 <br />
 <img src="imgpic.jpg" />
 <br />
 some texts2
</div>

I get the whole text with:

post_soup.get_text()

and save all the images in the div with urllib2 as usual. Finally I write them to an HTML page, but all the text ends up at the top and the images at the end. I want to save them in a new HTML page in the same order as on the page I scraped: first some texts1, then the image, then some texts2.

Any suggestions?

Recommended answer

This is not the best or most correct way, but it should work:

from bs4 import BeautifulSoup

html = "<div>\
 some texts1\
 <br />\
 <img src=\"imgpic.jpg\" />\
 <br />\
 some texts2\
</div>"

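# Name the parser explicitly so bs4 does not have to guess one.
soup = BeautifulSoup(html, "html.parser")
# stripped_strings yields each non-blank text node, stripped, in document order.
text = list(soup.stripped_strings)

# Parenthesized print with one argument works on both Python 2.7 and Python 3.
print(text[0])
print(soup.find("img")['src'])
print(text[1])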

Output:

some texts1
imgpic.jpg
some texts2
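
The answer above hardcodes the order (first text, image, second text). If the div can hold any mix of text and images, one possible generalization is to walk the div's nodes in document order and record what is found. The following is only a sketch under some assumptions: it targets Python 3 (the question is tagged python-2.7), the helper names ordered_parts and build_page are made up for illustration, and wrapping each text run in a <p> tag is my own choice, not something the question asks for.

```python
from html import escape

from bs4 import BeautifulSoup, Comment, NavigableString, Tag

source_html = """<div>
 some texts1
 <br />
 <img src="imgpic.jpg" />
 <br />
 some texts2
</div>"""


def ordered_parts(container):
    """Collect ("text", str) and ("img", src) pairs in document order.

    Hypothetical helper: it walks every descendant of `container`, so
    nested tags are handled as well as direct children.
    """
    parts = []
    for node in container.descendants:
        if isinstance(node, Tag):
            # Only images are kept; other tags (e.g. <br>) are skipped.
            if node.name == "img" and node.get("src"):
                parts.append(("img", node["src"]))
        elif isinstance(node, Comment):
            # HTML comments are NavigableString subclasses; ignore them.
            continue
        elif isinstance(node, NavigableString):
            stripped = node.strip()
            if stripped:  # drop whitespace-only text nodes
                parts.append(("text", stripped))
    return parts


def build_page(parts):
    """Render the collected parts as a minimal HTML page, order preserved."""
    body = []
    for kind, value in parts:
        if kind == "img":
            body.append('<img src="%s" />' % escape(value, quote=True))
        else:
            body.append("<p>%s</p>" % escape(value))
    return "<html><body>\n%s\n</body></html>" % "\n".join(body)


soup = BeautifulSoup(source_html, "html.parser")
parts = ordered_parts(soup.div)
print(build_page(parts))
```

If the images are downloaded to local files first, the src values in parts would need to be replaced with the local paths before calling build_page; that step is left out here.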
