How to get the path of all elements in lxml with an attribute
Question
I have the following code:
from lxml import etree

tree = etree.ElementTree(new_xml)
for e in new_xml.iter():
    print(tree.getpath(e), e.text)
This gives me something like the following:
/Item/Purchases
/Item/Purchases/Purchase[1]
/Item/Purchases/Purchase[1]/URL http://tvgo.xfinity.com/watch/x/6091165185315991112/movies
/Item/Purchases/Purchase[1]/Rating R
/Item/Purchases/Purchase[2]
/Item/Purchases/Purchase[2]/URL http://tvgo.xfinity.com/watch/x/6091165185315991112/movies
/Item/Purchases/Purchase[2]/Rating R
However, I need each path to identify the list element by its attribute rather than by its position. Here is what the XML looks like:
<Item>
  <Purchases>
    <Purchase Country="US">
      <URL>http://tvgo.xfinity.com/watch/x/6091165US</URL>
      <Rating>R</Rating>
    </Purchase>
    <Purchase Country="CA">
      <URL>http://tvgo.xfinity.com/watch/x/6091165CA</URL>
      <Rating>R</Rating>
    </Purchase>
  </Purchases>
</Item>
How would I get the following paths instead?
/Item/Purchases
/Item/Purchases/Purchase[@Country="US"]
/Item/Purchases/Purchase[@Country="US"]/URL http://tvgo.xfinity.com/watch/x/6091165185315991112/movies
/Item/Purchases/Purchase[@Country="US"]/Rating R
/Item/Purchases/Purchase[@Country="CA"]
/Item/Purchases/Purchase[@Country="CA"]/URL http://tvgo.xfinity.com/watch/x/6091165185315991112/movies
/Item/Purchases/Purchase[@Country="CA"]/Rating R
Recommended answer
Not pretty, but it does the job.
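import re

replacements = {}
for e in tree.iter():
    path = tree.getpath(e)
    # A Purchase element: swap its positional index for a Country predicate.
    if re.search(r'/Purchase\[\d+\]$', path):
        new_predicate = '[@Country="' + e.attrib['Country'] + '"]'
        new_path = re.sub(r'\[\d+\]$', new_predicate, path)
        replacements[path] = new_path
    # Apply every replacement recorded so far; parents precede children in document order.
    for key, replacement in replacements.items():
        path = path.replace(key, replacement)
    # e.text is None for empty elements, so fall back to an empty string.
    print(path, (e.text or '').strip())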
This prints the following for me:
/Item
/Item/Purchases
/Item/Purchases/Purchase[@Country="US"]
/Item/Purchases/Purchase[@Country="US"]/URL http://tvgo.xfinity.com/watch/x/6091165US
/Item/Purchases/Purchase[@Country="US"]/Rating R
/Item/Purchases/Purchase[@Country="CA"]
/Item/Purchases/Purchase[@Country="CA"]/URL http://tvgo.xfinity.com/watch/x/6091165CA
/Item/Purchases/Purchase[@Country="CA"]/Rating R
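An alternative that avoids regular expressions is to build each path by walking up through the element's ancestors with getparent(). The following is only a minimal sketch, not the answer author's method: attribute_path and its key_attr parameter are hypothetical names introduced here, and the sample document is the one from the question, written with its closing </Purchases> tag.

```python
from lxml import etree

XML = """\
<Item>
  <Purchases>
    <Purchase Country="US">
      <URL>http://tvgo.xfinity.com/watch/x/6091165US</URL>
      <Rating>R</Rating>
    </Purchase>
    <Purchase Country="CA">
      <URL>http://tvgo.xfinity.com/watch/x/6091165CA</URL>
      <Rating>R</Rating>
    </Purchase>
  </Purchases>
</Item>
"""


def attribute_path(element, key_attr="Country"):
    """Hypothetical helper: build a root-to-element path, adding an
    [@key_attr="value"] predicate to every step that has that attribute."""
    steps = []
    node = element
    while node is not None:
        step = node.tag
        if key_attr in node.attrib:
            step += '[@{}="{}"]'.format(key_attr, node.get(key_attr))
        steps.append(step)
        node = node.getparent()  # returns None once we are past the root
    return "/" + "/".join(reversed(steps))


root = etree.fromstring(XML)
# Restrict iteration to real elements (no comments or processing instructions).
elements = list(root.iter(etree.Element))
paths = [attribute_path(e) for e in elements]
for e, path in zip(elements, paths):
    # e.text may be None, so fall back to an empty string before stripping.
    print(path, (e.text or "").strip())
```

This sketch assumes that the key attribute's values are unique among siblings and contain no double quotes, and that the document uses no namespaces. Steps without the attribute keep a bare tag name, so repeated siblings lacking it would share the same path.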