Beautiful Soup: get picture size from html


Question

I want to extract the pictures' widths and heights using Beautiful Soup. All pictures have the same code format:

<img src="http://somelink.com/somepic.jpg" width="200" height="100">

I can easily extract the links:

for pic in soup.find_all('img'):
    print (pic['src'])

But

for pic in soup.find_all('img'):
    print (pic['width'])

is not working for extracting sizes. What am I missing?

Edit: One of the pictures on the page does not have the width and height in its HTML code. I did not notice this at the time of the initial post, so any solution must take this into account.

Recommended answer

The dictionary-like attribute access should work for width and height as well, if they are specified. You might encounter images that don't have these attributes explicitly set; your current code would raise a KeyError in that case. You can use get() and provide a default value instead:

for pic in soup.find_all('img'):
    print(pic.get('width', 'n/a'))
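
To see the difference on a page where one image lacks the attributes, here is a minimal self-contained sketch. The sample HTML and the second image URL are made up for illustration; it assumes the bs4 package is installed and uses Python's built-in html.parser.

```python
from bs4 import BeautifulSoup

# Hypothetical sample page: the second image has no width/height attributes.
html = """
<img src="http://somelink.com/somepic.jpg" width="200" height="100">
<img src="http://somelink.com/otherpic.jpg">
"""

soup = BeautifulSoup(html, "html.parser")

widths = []
for pic in soup.find_all("img"):
    # pic["width"] would raise KeyError for the second image;
    # get() returns the fallback value instead.
    widths.append(pic.get("width", "n/a"))

print(widths)
```

Note that attribute values come back as strings, so a width of 200 is returned as '200'.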

Or, you can find only the img elements that have both width and height specified:

for pic in soup.find_all('img', width=True, height=True):
    print(pic['width'], pic['height'])
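
If the sizes are needed as numbers rather than strings, the filtered search can be wrapped in a small helper. This is only a sketch under stated assumptions: image_sizes is a hypothetical name, the sample HTML is invented, and skipping non-integer values such as "50%" is one possible choice, not the only reasonable one.

```python
from bs4 import BeautifulSoup


def image_sizes(html):
    """Return (src, width, height) for every <img> that declares both sizes."""
    soup = BeautifulSoup(html, "html.parser")
    sizes = []
    # width=True / height=True match only tags where the attribute is present.
    for pic in soup.find_all("img", width=True, height=True):
        try:
            size = (pic.get("src"), int(pic["width"]), int(pic["height"]))
        except ValueError:
            # Skip values that are not plain integers, such as "50%".
            continue
        sizes.append(size)
    return sizes


# Hypothetical sample page for illustration.
sample = """
<img src="http://somelink.com/somepic.jpg" width="200" height="100">
<img src="http://somelink.com/otherpic.jpg">
<img src="http://somelink.com/thirdpic.jpg" width="50%" height="40">
"""

print(image_sizes(sample))
```

Images without explicit dimensions are left out entirely here; getting their real size would require downloading the image file, which is outside what the HTML alone can tell you.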
