Python BeautifulSoup img tag parsing

Problem description

I am using BeautifulSoup to parse all the img tags present on 'www.youtube.com'.

The code is:

import urllib2
from BeautifulSoup import BeautifulSoup
page = urllib2.urlopen('http://www.youtube.com/')
soup = BeautifulSoup(page)
tags = soup.findAll('img')

But I am not getting all the img tags, and the ones I do get look invalid.

The img tags I get after parsing differ from the img tags in the page source: some attributes are missing.

I need to get all the video img tags on youtube.com.

Please help.
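
One way to check which attributes actually reach a parser is to dump every attribute of each img tag in the HTML as served, then compare that with what the browser's page source shows. The following is only a minimal sketch in Python 3, and it uses the standard-library html.parser module rather than BeautifulSoup. The class name, the helper function, the sample markup, and the data-thumb attribute are all made up for illustration; nothing here is taken from youtube.com.

```python
from html.parser import HTMLParser


class ImgAttrCollector(HTMLParser):
    """Collect the attributes of every <img> tag as a list of dicts."""

    def __init__(self):
        super().__init__()
        self.images = []

    def handle_starttag(self, tag, attrs):
        # Tag names arrive lowercased; attrs is a list of (name, value) pairs.
        # Self-closing tags such as <img ... /> also end up here by default.
        if tag == "img":
            self.images.append(dict(attrs))


def collect_img_attrs(html):
    """Return one dict of attributes per <img> tag found in html."""
    parser = ImgAttrCollector()
    parser.feed(html)
    parser.close()
    return parser.images


# Hypothetical sample markup (an assumption, not real YouTube HTML).
SAMPLE_HTML = """
<div>
  <img src="//example.com/pixel.gif" data-thumb="//example.com/real.jpg" alt="Video">
  <img src="/static/logo.png" width="120" />
</div>
"""

images = collect_img_attrs(SAMPLE_HTML)
for attrs in images:
    print(attrs)
```

Printing the full attribute dict, instead of only src, makes it easier to spot cases where a lazily loaded image may keep its real URL in some attribute other than src.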

Recommended answer

It seems to work when I try it here:

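# Python 2 with BeautifulSoup 3 (the old `BeautifulSoup` package)
import urllib2
from BeautifulSoup import BeautifulSoup

# Download the page and parse it
page = urllib2.urlopen('http://www.youtube.com/')
soup = BeautifulSoup(page)

# Find every img tag and print each distinct src value once
tags = soup.findAll('img')
print "\n".join(set(tag['src'] for tag in tags))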

This produces the following, which looks OK to me:

http://i1.ytimg.com/vi/D9Zg67r9q9g/market_thumb.jpg?v=723c8e
http://s.ytimg.com/yt/img/pixel-vfl3z5WfW.gif
//s.ytimg.com/yt/img/pixel-vfl3z5WfW.gif
/gen_204?a=fvhr&v=mha7pAOfqt4&nocache=1337083207.97
http://i3.ytimg.com/vi/fNs8mf2OdkU/market_thumb.jpg?v=4f85544b
http://i4.ytimg.com/vi/CkQFjyZCq4M/market_thumb.jpg?v=4f95762c
http://i3.ytimg.com/vi/fzD5gAecqdM/market_thumb.jpg?v=b0cabf
http://i3.ytimg.com/vi/2M3pb2_R2Ng/market_thumb.jpg?v=4f0d95fa
//i2.ytimg.com/vi/mha7pAOfqt4/hqdefault.jpg
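
The output above mixes absolute URLs, protocol-relative URLs (starting with `//`), and a root-relative path (starting with `/`), which may be part of why some of the tags look invalid at first glance. As a rough sketch, and assuming Python 3 rather than the Python 2 used above, the values can be resolved against the page URL with the standard-library `urllib.parse.urljoin`. The helper name `absolutize` and the choice of which sample values to include are illustrative assumptions.

```python
from urllib.parse import urljoin

# The page the src values were scraped from.
BASE_URL = "http://www.youtube.com/"

# A few src values copied from the output above.
raw_srcs = [
    "http://s.ytimg.com/yt/img/pixel-vfl3z5WfW.gif",
    "//s.ytimg.com/yt/img/pixel-vfl3z5WfW.gif",
    "/gen_204?a=fvhr&v=mha7pAOfqt4&nocache=1337083207.97",
    "//i2.ytimg.com/vi/mha7pAOfqt4/hqdefault.jpg",
]


def absolutize(srcs, base_url):
    """Resolve each src against base_url and drop duplicates, keeping order."""
    seen = set()
    result = []
    for src in srcs:
        absolute = urljoin(base_url, src)
        if absolute not in seen:
            seen.add(absolute)
            result.append(absolute)
    return result


absolute_srcs = absolutize(raw_srcs, BASE_URL)
print("\n".join(absolute_srcs))
```

After resolution, the first two entries turn out to be the same image, so deduplicating on the absolute URL removes a repeat that the original `set()` of raw strings would keep.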
