Find All Elements Given Namespaced Attribute

Problem description

If I have something like this:

<p>blah</p>
<p foo:bar="something">blah</p>
<p foo:xxx="something">blah</p>

How would I get BeautifulSoup to select elements that have an attribute in the foo namespace?

E.g., I would like the 2nd and 3rd p elements returned.

Recommended answer

BeautifulSoup (both version 3 and 4) does not appear to treat the namespace prefix as anything special. It simply treats the prefix and the namespaced attribute together as an attribute that happens to have a colon in its name.

So to find all <p> elements with attributes in the foo namespace, you just have to loop through the attribute keys and check whether attr.startswith('foo:') (the colon is included so that an unrelated attribute such as foobar does not match):

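# BeautifulSoup 4: find_all() and dict-style attrs are bs4 APIs
from bs4 import BeautifulSoup

content = '''\
<p>blah</p>
<p foo:bar="something">blah</p>
<p foo:xxx="something">blah</p>'''

soup = BeautifulSoup(content, 'html.parser')
for p in soup.find_all('p'):
    # Attribute names keep their prefix, e.g. 'foo:bar'
    for attr in p.attrs.keys():
        if attr.startswith('foo:'):
            print(p)
            break  # print each matching element only once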

yields

<p foo:bar="something">blah</p>
<p foo:xxx="something">blah</p>
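
The same check can also be written as a filter function passed to find_all, which BeautifulSoup 4 calls once per tag. This is only a sketch: the helper name has_prefixed_attr and its prefix parameter are made up for illustration, and it assumes the bs4 package with the built-in html.parser.

```python
from bs4 import BeautifulSoup

content = '''\
<p>blah</p>
<p foo:bar="something">blah</p>
<p foo:xxx="something">blah</p>'''


def has_prefixed_attr(tag, prefix='foo'):
    """Hypothetical helper: True for <p> tags with an attribute named '<prefix>:...'."""
    return tag.name == 'p' and any(
        name.startswith(prefix + ':') for name in tag.attrs
    )


soup = BeautifulSoup(content, 'html.parser')
# find_all accepts a function and keeps the tags for which it returns True
matches = soup.find_all(has_prefixed_attr)
for p in matches:
    print(p)
```

Note that this still matches on the literal prefix text rather than on a real namespace, because BeautifulSoup sees foo:bar as an ordinary attribute name.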


With lxml you can search by XPath, which does have syntax support for searching for attributes by namespace:

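import lxml.etree as ET

content = '''\
<root xmlns:foo="bar">
<p>blah</p>
<p foo:bar="something">blah</p>
<p foo:xxx="something">blah</p></root>'''

root = ET.XML(content)
# @foo:* matches any attribute in the namespace bound to the foo prefix
for p in root.xpath('p[@foo:*]', namespaces={'foo': 'bar'}):
    # encoding='unicode' returns str rather than bytes on Python 3;
    # with_tail=False omits the whitespace that follows each element
    print(ET.tostring(p, encoding='unicode', with_tail=False))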

yields

<p xmlns:foo="bar" foo:bar="something">blah</p>
<p xmlns:foo="bar" foo:xxx="something">blah</p>
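
If lxml is not available, a similar namespace-aware check can be sketched with the standard library's xml.etree.ElementTree. Its XPath subset has no @foo:* wildcard, so this version inspects attribute names directly; ElementTree stores them in the form {uri}localname. The helper name elements_with_ns_attr and the constant FOO_NS are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

content = '''\
<root xmlns:foo="bar">
<p>blah</p>
<p foo:bar="something">blah</p>
<p foo:xxx="something">blah</p></root>'''

FOO_NS = 'bar'  # the URI bound to the foo prefix in the sample document


def elements_with_ns_attr(root, tag, ns_uri):
    """Hypothetical helper: descendants named `tag` with an attribute in `ns_uri`."""
    marker = '{' + ns_uri + '}'  # namespaced names look like '{bar}xxx'
    return [el for el in root.iter(tag)
            if any(name.startswith(marker) for name in el.attrib)]


root = ET.fromstring(content)
found = elements_with_ns_attr(root, 'p', FOO_NS)
for p in found:
    print(sorted(p.attrib))
```

Because the comparison uses the namespace URI rather than the prefix, it keeps working if the document binds the same URI to a different prefix.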
