Difference between attrMap and attrs in BeautifulSoup

Question

I would like to know what the difference is between attrMap and attrs in BeautifulSoup. More specifically, which tags have attrs and which have attrMap?

>>> soup = BeautifulSoup.BeautifulSoup(source)
>>> tag = soup.find(name='input')
>>> dict(tag.attrs)['type']
u'text'
>>> tag.attrMap['type']
Traceback (most recent call last):
  File "<interactive input>", line 1, in <module>
TypeError: 'NoneType' object is not subscriptable

Solution

The attrMap field is an internal field of the Tag class, so you should not use it in your code. Instead, use:

value = tag[key]
tag[key] = value

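This maps internally to tag.attrMap[key], but only after __getitem__ and __setitem__ have made sure that self.attrMap is initialized. That happens in _getAttrMap, which is nothing but a complicated dict(self.attrs) call. Until one of those methods has run, attrMap is still None, which is why indexing it directly raises the TypeError you saw. So for your code you'll use: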

>>> import urllib
>>> import BeautifulSoup
>>> url = "http://stackoverflow.com/questions/8842224/"
>>> soup = BeautifulSoup.BeautifulSoup(urllib.urlopen(url).read())
>>> tag = soup.find(name='input')
>>> tag['type']
u'text'
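
To make the lazy initialization concrete, here is a minimal sketch of the pattern. MiniTag is a hypothetical stand-in written for this answer, not BeautifulSoup's actual Tag class; it only assumes that attrs is a list of (key, value) pairs and that attrMap starts out as None, as the traceback in the question suggests.

```python
class MiniTag(object):
    """Toy stand-in for a Tag; not the library's actual code."""

    def __init__(self, name, attrs):
        self.name = name
        self.attrs = list(attrs)   # list of (key, value) pairs, as parsed
        self.attrMap = None        # stays None until the first keyed access

    def _getAttrMap(self):
        # Build the dict view of attrs on first use only.
        if self.attrMap is None:
            self.attrMap = dict(self.attrs)
        return self.attrMap

    def __getitem__(self, key):
        return self._getAttrMap()[key]

    def __setitem__(self, key, value):
        self._getAttrMap()[key] = value
        # Update the first matching pair in the list, or append a new one.
        for i, (k, _) in enumerate(self.attrs):
            if k == key:
                self.attrs[i] = (key, value)
                break
        else:
            self.attrs.append((key, value))


tag = MiniTag('input', [('type', 'text'), ('name', 'q')])
print(tag.attrMap)   # None: nothing has initialized the map yet
print(tag['type'])   # text
print(tag.attrMap)   # now a populated dict
```

Reading tag.attrMap before any tag[key] access gives None, which matches the error in the question; going through tag[key] always works because it builds the map first.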

If you want to check for the existence of a given attribute, use either

try:
    tag[key]
    # found key
except KeyError:
    # key not present
    pass
or

if key in dict(tag.attrs):
    # found key
    pass
else:
    # key not present
    pass
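
If you do these checks often, the two idioms can be wrapped in small helpers. This is only a sketch: has_attribute, get_attribute and FakeTag are hypothetical names made up for illustration, and the code assumes nothing about the tag beyond an attrs list of (key, value) pairs.

```python
from collections import namedtuple

# Hypothetical stand-in that exposes only a name and the attrs pair list.
FakeTag = namedtuple('FakeTag', ['name', 'attrs'])


def has_attribute(tag, key):
    """Return True if any (name, value) pair in tag.attrs has this name."""
    return any(name == key for name, _ in tag.attrs)


def get_attribute(tag, key, default=None):
    """Return the value stored under key in dict(tag.attrs), else default."""
    return dict(tag.attrs).get(key, default)


tag = FakeTag('input', [('type', 'text'), ('name', 'q')])
print(has_attribute(tag, 'type'))         # True
print(has_attribute(tag, 'value'))        # False
print(get_attribute(tag, 'type'))         # text
print(get_attribute(tag, 'value', 'n/a')) # n/a
```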

As Adam pointed out, this is because the __contains__ method on Tag searches the tag's contents, not its attributes, so the more familiar key in tag doesn't do what you would expect. This complexity arises because BeautifulSoup handles HTML tags with repeated attributes: a normal map (dictionary) isn't quite enough, since the keys can be duplicated. But if you only want to check whether there is any key with a given name, then key in dict(tag.attrs) will do the right thing.
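
A small toy model shows both effects. ContentTag below is a hypothetical class written for this answer, not the real Tag; it assumes only that membership tests look at the contents and that attrs is a list of pairs in which a name may repeat.

```python
class ContentTag(object):
    """Toy stand-in; real Tag objects carry far more behaviour."""

    def __init__(self, attrs, contents):
        self.attrs = list(attrs)        # (key, value) pairs, duplicates allowed
        self.contents = list(contents)  # child nodes or strings

    def __contains__(self, item):
        # Membership looks at the child contents, not at attribute names.
        return item in self.contents


tag = ContentTag([('class', 'a'), ('class', 'b'), ('type', 'text')], ['hello'])

print('type' in tag)               # False: 'type' is not among the contents
print('hello' in tag)              # True
print('type' in dict(tag.attrs))   # True: checks attribute names

# dict() keeps only the last value for a repeated name...
print(dict(tag.attrs)['class'])                     # b
# ...while the pair list still holds every occurrence.
print([v for k, v in tag.attrs if k == 'class'])    # ['a', 'b']
```

So dict(tag.attrs) is fine for asking whether a name occurs at all, but if you need every value of a repeated attribute, iterate over the pairs instead.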
