lxml: get element with a particular child element?

Question

Working in lxml, I want to get the href attribute of all links with an img child that has title="Go to next page".

So in the following snippet:

<a class="noborder" href="StdResults.aspx">
<img src="arrowr.gif" title="Go to next page"></img>
</a>

I want to get back StdResults.aspx.

I've got this far:

next_link = doc.xpath("//a/img[@title='Go to next page']")
print(next_link[0].attrib['href'])

But next_link is the img, not the a tag - how can I get the a tag?

Thanks.

Recommended answer

Just change a/img... to a[img...] (the brackets form an XPath predicate, which roughly means "such that"):

import lxml.html as lh

content='''<a class="noborder" href="StdResults.aspx">
<img src="arrowr.gif" title="Go to next page"></img>
</a>'''

doc=lh.fromstring(content)
for elt in doc.xpath("//a[img[@title='Go to next page']]"):
    print(elt.attrib['href'])

# StdResults.aspx
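
To make the difference between the two expressions concrete, here is a small sketch that runs both against the same snippet. It also shows getparent(), which is another way to reach the a tag if you already hold the img element. The variable names (imgs, links, parent_hrefs) are only illustrative.

```python
import lxml.html as lh

content = '''<a class="noborder" href="StdResults.aspx">
<img src="arrowr.gif" title="Go to next page"></img>
</a>'''

doc = lh.fromstring(content)

# Path step: the last step is img, so the img elements are returned.
imgs = doc.xpath("//a/img[@title='Go to next page']")

# Predicate: img only filters the a elements, so the a elements are returned.
links = doc.xpath("//a[img[@title='Go to next page']]")

img_tags = [e.tag for e in imgs]
link_tags = [e.tag for e in links]

# Alternative: start from each img and step up to its parent element.
parent_hrefs = [img.getparent().get('href') for img in imgs]

print(img_tags, link_tags, parent_hrefs)
```

The predicate form is usually the tidier choice, since the query itself states which element you want; getparent() is handy when the img has already been selected for some other reason.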

Or, you could go even farther and use

"//a[img[@title='Go to next page']]/@href"

to retrieve the values of the href attributes.
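
As a rough sketch of how that might look in practice: with /@href at the end, xpath() returns a list of strings rather than elements. The second link (Other.aspx, with a "Home" image) is a made-up addition, included only to show that the predicate filters it out.

```python
import lxml.html as lh

# The first link is hypothetical, added to show the filtering.
content = '''<div>
<a href="Other.aspx"><img src="logo.gif" title="Home"></a>
<a class="noborder" href="StdResults.aspx"><img src="arrowr.gif" title="Go to next page"></a>
</div>'''

doc = lh.fromstring(content)

# Selects the href attribute values of matching links directly.
hrefs = doc.xpath("//a[img[@title='Go to next page']]/@href")

# Guard against an empty result, e.g. on the last page of results.
next_href = hrefs[0] if hrefs else None
print(next_href)
```

Checking for an empty list before indexing avoids an IndexError when no matching link is present.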
