python: get image link from html


Question

From an HTML/RSS snippet like this:

[...]<div class="..." style="..."></div><p><a href="...">
<img alt="" height="" src="http://link.to/image.jpg"
width="" /></a><span style="">[...]

I want to get the image src link "http://link.to/image.jpg". How can I do this in Python? Thanks.

Recommended answer

lxml is the tool for the job.

Scraping all the image links from a webpage is as simple as this:

import lxml.html

# Fetch and parse the page, then collect the src attribute of every <img> element.
tree = lxml.html.parse("http://example.com")
images = tree.xpath("//img/@src")

print(images)

Giving:

['/_img/iana-logo-pageheader.png', '/_img/icann-logo-micro.png']
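
When the HTML is already in hand as a string, as in the question, the same XPath query can be run without fetching a URL. The following is a minimal sketch, assuming a filled-in version of the question's snippet: the class, style, and href values are placeholders invented for illustration, and `snippet` and `image_links` are hypothetical names.

```python
import lxml.html

# Hypothetical fragment modelled on the snippet in the question;
# the attribute values are placeholders, not taken from a real page.
snippet = (
    '<div class="c" style="s"></div>'
    '<p><a href="http://link.to/page">'
    '<img alt="" height="" src="http://link.to/image.jpg" width="" /></a>'
    '<span style="">text</span></p>'
)

# fromstring parses an HTML string instead of fetching a document.
fragment = lxml.html.fromstring(snippet)

# Collect the src attribute of every <img> below the parsed fragment.
image_links = fragment.xpath(".//img/@src")
print(image_links)
```

The result is a list, so a snippet with several images yields one entry per `<img>` element that has a src attribute.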

If it were an RSS feed, you'd want to parse it with lxml.etree.
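
One way this might look for a feed is sketched below. It assumes the image markup sits as HTML inside each item's description element, which is common but not guaranteed; the feed content is invented for illustration, and `rss` and `feed_images` are hypothetical names.

```python
import lxml.etree
import lxml.html

# Hypothetical feed: the item description carries HTML inside CDATA.
rss = b"""<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <item>
      <title>Post</title>
      <description><![CDATA[<p><a href="http://link.to/page"><img alt="" src="http://link.to/image.jpg" /></a></p>]]></description>
    </item>
  </channel>
</rss>"""

# Parse the feed itself as XML.
root = lxml.etree.fromstring(rss)

feed_images = []
for description in root.xpath("//item/description/text()"):
    # Skip empty descriptions, which the HTML parser would reject.
    if not description.strip():
        continue
    # The description text is HTML, so hand it to the HTML parser.
    html_fragment = lxml.html.fromstring(description)
    feed_images.extend(html_fragment.xpath(".//img/@src"))

print(feed_images)
```

Feeds that reference images through other elements, such as enclosure tags, would need a different XPath expression.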
