How to use regular expressions in lxml XPath?


Question

I'm using a construction like this:

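# Assumption: parse() is lxml's HTML parser; the original post does not show the import.
from lxml.html import parse

doc = parse(url).getroot()
links = doc.xpath("//a[text()='some text']")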

But I need to select all links whose text begins with "some text", so I'm wondering whether there is any way to use a regexp here. I didn't find anything in the lxml documentation.

Recommended answer

You can do this (although you don't need regular expressions for this example; XPath's starts-with() is enough for a prefix match). lxml supports regular expressions through the EXSLT extension functions. (See the lxml docs for the XPath class; the same applies to the xpath() method.)

# "^" anchors the pattern so only text beginning with "some text" matches
doc.xpath("//a[re:match(text(), '^some text')]",
          namespaces={"re": "http://exslt.org/regular-expressions"})

Note that you need to pass the namespace mapping so that lxml knows what the "re" prefix in the XPath expression stands for.
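
To make the points above concrete, here is a minimal sketch that compares the plain XPath starts-with() approach with the EXSLT regular-expression functions. The sample markup, the href values, and the variable names are hypothetical and exist only for illustration. It uses re:test (the EXSLT function that returns a boolean) rather than re:match, and assumes the goal is the prefix match asked about in the question.

```python
from lxml import etree, html

# Hypothetical sample markup, standing in for the page parsed in the question.
SAMPLE = """
<html><body>
  <a href="/1">some text here</a>
  <a href="/2">not some text</a>
  <a href="/3">some text</a>
  <a href="/4">Some Text in caps</a>
</body></html>
"""

# Binds the "re" prefix to the EXSLT regular-expressions namespace.
EXSLT_RE = {"re": "http://exslt.org/regular-expressions"}

doc = html.fromstring(SAMPLE)

# Plain XPath 1.0: a prefix match needs no regular expression.
by_prefix = doc.xpath("//a[starts-with(text(), 'some text')]")

# EXSLT re:test returns a boolean; "^" anchors the pattern at the start.
by_regex = doc.xpath("//a[re:test(text(), '^some text')]",
                     namespaces=EXSLT_RE)

# The optional third argument carries flags; "i" ignores case.
by_regex_ci = doc.xpath("//a[re:test(text(), '^some text', 'i')]",
                        namespaces=EXSLT_RE)

# The same expression precompiled with the XPath class, for repeated use.
find_links = etree.XPath("//a[re:test(text(), '^some text')]",
                         namespaces=EXSLT_RE)
by_compiled = find_links(doc)

hrefs_prefix = [a.get("href") for a in by_prefix]
hrefs_regex = [a.get("href") for a in by_regex]
hrefs_regex_ci = [a.get("href") for a in by_regex_ci]
hrefs_compiled = [a.get("href") for a in by_compiled]
```

Without the "^" anchor the pattern would also match links such as "not some text", because lxml searches for the pattern anywhere in the string. For a fixed prefix, starts-with() is the simpler choice; the regular-expression form pays off when the pattern needs alternation, character classes, or case-insensitive matching.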
