XPath: "Exclude" tag in "InnerHtml" (<a href="">InnerHtml<span>excludeme</span></a>)

Question

I am using XPath to query HTML sites, which has worked pretty well so far, but now I've hit a (brick) wall and can't find a solution :-)

The HTML looks like this:

<ul>
<li><a href="">Text1<span>AnotherText1</span></a></li>
<li><a href="">Text2<span>AnotherText2</span></a></li>
<li><a href="">Text3<span>AnotherText3</span></a></li>
</ul>

I want to select the "TextX" part, but NOT the "AnotherTextX" part inside the <span></span>. So far I couldn't come up with a pure XPath solution to do that (and unfortunately, in my setup, I need a pure XPath solution).

This selects roughly what I want, but it results in "TextXAnotherTextX" and I only need "TextX":

/ul/li/a

Any hints? :-)

Recommended answer

This gets you the first direct text node child of <a>:

/ul/li/a/text()[1]

And this gets you every direct text node child of <a>, each as a separate node:

/ul/li/a/text()

Both of the above return "TextX" for your sample, but if you had:

<li><a href="">Text4<span>AnotherText3</span>TrailingText</a></li>

then the latter would return ["Text4", "TrailingText"], while the former would return only "Text4".
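The difference between the two expressions can be sketched in Python. This is only an illustration under stated assumptions: the standard library's xml.etree.ElementTree supports a limited XPath subset without the text() node test, so the hypothetical helper direct_text_nodes below emulates it by collecting an element's own leading text plus the tail text after each child. The sample markup combines one item from the question with the Text4 example.

```python
import xml.etree.ElementTree as ET

# Sample markup: one <a> from the question plus the Text4 variant.
HTML = """<ul>
<li><a href="">Text1<span>AnotherText1</span></a></li>
<li><a href="">Text4<span>AnotherText3</span>TrailingText</a></li>
</ul>"""


def direct_text_nodes(element):
    """Emulate XPath text(): the element's direct text children, in order.

    Hypothetical helper, not part of ElementTree.
    """
    nodes = []
    if element.text:
        nodes.append(element.text)      # text before the first child
    for child in element:
        if child.tail:
            nodes.append(child.tail)    # text following each child element
    return nodes


root = ET.fromstring(HTML)              # root is the <ul> element
anchors = root.findall("./li/a")

# Counterpart of /ul/li/a/text(): every direct text node, per <a>.
all_text = [direct_text_nodes(a) for a in anchors]

# Counterpart of /ul/li/a/text()[1]: only the first direct text node, per <a>.
first_text = [nodes[0] for nodes in all_text if nodes]

print(all_text)
print(first_text)
```

With an engine that implements full XPath 1.0, the two expressions from the answer can be evaluated directly and no such helper is needed.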

Your expression /ul/li/a selects the <a> elements themselves. What you see is the string-value of each <a>, which XPath defines as the concatenation of all of its descendant text nodes in document order, so you get "TextXAnotherTextX".
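To see why the whole-element result includes the <span> text, here is a minimal sketch using Python's standard library, assuming that joining ElementTree's itertext() is an acceptable stand-in for the XPath string-value of an element. The variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

# One <a> element taken from the question's markup.
a = ET.fromstring('<a href="">Text1<span>AnotherText1</span></a>')

# itertext() walks all text inside the element in document order,
# including text nested in <span>, which mirrors the string-value of <a>.
string_value = "".join(a.itertext())

# .text holds only the text that precedes the first child element,
# which here matches what text()[1] selects.
first_text_node = a.text

print(string_value)
print(first_text_node)
```

The first value contains the nested <span> text, while the second is limited to the text that belongs directly to <a>.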
