Get all intersections of two sets on the page using XPath

Question

This is a follow-up to the question "Xpath. How to select all text between two tags?"

I can get the text for one such intersection like this:

response.xpath('//pre[preceding-sibling::a[@name="dst100030"] and following-sibling::a[@name="dst100031"]]//text()')

The page has a list of such intersections, and I need to get the text from between each pair. Is there a way to do this with XPath?

Or should I create a list of all the @name values and substitute them into the preceding-sibling and following-sibling conditions?

Solution

I don't think "intersections of sets" is an accurate way of characterizing this problem. I would describe it as "partitioning a sequence".

You don't say what kind of result you are looking for, but on the face of it, it's a sequence of sequences, and that immediately signals a problem, which is that there is no such thing as a sequence of sequences in the XPath data model - at least not until XPath 3.1, when arrays are introduced.

You don't say what version of XPath you are interested in, but the fact that you've tagged the question "Python" hints that it might be XPath 1.0. If that's the case then I think the best solution is almost certainly to pull the whole input sequence into Python and do the partitioning there.
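As a rough illustration of doing the partitioning in Python, here is a minimal sketch. It assumes the a and pre elements are siblings under a common parent, as the XPath in the question implies. The function name partition_by_anchor and the sample markup are made up for this example, and the standard library's xml.etree.ElementTree stands in for whatever parser the page is actually loaded with.

```python
import xml.etree.ElementTree as ET


def partition_by_anchor(parent):
    """Group the text of <pre> children under the nearest preceding <a name=...>."""
    groups = {}
    current = None  # name of the most recent <a name="..."> seen so far
    for child in parent:  # children are visited in document order
        if child.tag == "a" and child.get("name"):
            current = child.get("name")
            groups.setdefault(current, [])
        elif child.tag == "pre" and current is not None:
            # itertext() also collects text nested inside inline children
            groups[current].append("".join(child.itertext()))
    return groups


# Hypothetical sample markup, only to exercise the function.
sample = ET.fromstring(
    "<div>"
    '<a name="dst100030"/><pre>first</pre><pre>second <b>bold</b></pre>'
    '<a name="dst100031"/><pre>third</pre>'
    '<a name="dst100032"/>'
    "</div>"
)
groups = partition_by_anchor(sample)
print(groups)
```

This is a single pass over the siblings, so the cost grows linearly with their number. lxml elements expose the same .tag, .get() and .itertext() interface, so the same loop should carry over to a tree parsed with lxml, though that is untested here.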

FWIW, in XPath 3.1 you can create a map that maps a key such as dst100003 to the pre elements that immediately follow the relevant a element with:

map:merge(for $a in child::a 
          return map{ $a!@name : 
            $a!following-sibling::pre[preceding-sibling::a[1] is $a] })

It's likely to have O(n^2) performance, however, and a solution using XQuery 3.1 group-by (or XSLT for-each-group) would almost certainly perform better.
