XPath - How to extract a specific part of the text from one text node

Question

I would like to extract only part of the text from a td, for example "FLAC". How can this be done using XPath?

I've tried //text()[contains(., 'FLAC')], but it returns the whole text.

    <tr>
        <td class="left">Format plików</td>
        <td>
            AVI, FLV, RM, RMVB, FLAC, APE, AAC, MP3, WMA, OGG, BMP, GIF, TXT, JPEG, MOV, MKV, DAT, DivX, XviD, MP4, VOB
        </td>
    </tr>

Answer

You'll first have to specify where in your tree to look, and since you have multiple <td> elements, you first want to find the node containing the text.

substring(//tr/td[contains(@class, 'left')]/following-sibling::text()[1], startIndex, length)

substring(//tr/td[@class='left']/following-sibling::text()[1], startIndex, length)

Update based on comments:

T/F contains(//tr/td[@class='left']/following-sibling::text()[1], 'FLAC')

This returns true or false (T/F) depending on whether the text node that follows the matched <td> contains the word "FLAC". You could use substring() to grab a subset of that string, but that only works in static cases, where the start index and length are known in advance. I'd suggest using a different method, such as XSLT, to alter or split the string. Hope this helps!
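
As one illustration of doing the splitting outside XPath, here is a minimal Python sketch. It is an assumption-laden example and not part of the original answer: it uses the standard library's xml.etree.ElementTree, treats the row from the question as well-formed XML, and assumes the value sits in the <td> immediately after the one with class "left". The helper name formats_after_label is hypothetical.

```python
import xml.etree.ElementTree as ET

# The row from the question, assumed here to be well-formed XML.
ROW = """<tr>
    <td class="left">Format plików</td>
    <td>
        AVI, FLV, RM, RMVB, FLAC, APE, AAC, MP3, WMA, OGG, BMP, GIF, TXT, JPEG, MOV, MKV, DAT, DivX, XviD, MP4, VOB
    </td>
</tr>"""


def formats_after_label(row_xml, label_class="left"):
    """Return the comma-separated items from the cell after the label cell.

    Hypothetical helper: it assumes the value lives in the <td> that
    directly follows the <td> whose class equals label_class.
    """
    row = ET.fromstring(row_xml)
    cells = list(row)  # direct children of <tr>, in document order
    for i, cell in enumerate(cells[:-1]):
        if cell.get("class") == label_class:
            value_text = cells[i + 1].text or ""
            # Split on commas and drop surrounding whitespace.
            return [part.strip() for part in value_text.split(",") if part.strip()]
    return []


formats = formats_after_label(ROW)
print("FLAC" in formats)
```

Once the cell text is a list, picking out a single entry such as FLAC is an ordinary membership test or list lookup, with no need to know its character position.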

Update 2

substring('FLAC',1,4*contains(//tr/td[@class='left']/following-sibling::text()[1], 'FLAC'))

This returns FLAC if FLAC is present in the node you're inspecting, and an empty string if not.

Step-by-step breakdown:

  1. //tr/td[@class='left'] - This returns all <td> nodes that have the attribute "class" set to "left".

  2. /following-sibling::text() - This returns all text nodes that follow the node above as siblings.

  3. Adding [1] returns the first node from the list above.

  4. Wrapping this in contains(aboveValue, 'FLAC') returns true (1 when used as a number, as in this example) if 'FLAC' is present in the text, and false (0) if it is not.

  5. Wrapping all of this in substring('FLAC',1,4*aboveValue) is the equivalent of an If/Then/Else in XPath 1.0, since there isn't a built-in function to do so. If 'FLAC' is present, it pulls the substring 1,4*(true=1)=4, which is the whole string. If 'FLAC' is not present, it pulls the substring 1,4*(false=0)=0, which is none of the string.
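
The arithmetic behind the substring() trick can be sketched in Python. This is an illustration only, not an XPath engine: xpath_substring is a hypothetical helper that mimics the XPath 1.0 rule (1-based positions, both arguments rounded) for finite numbers and ignores the NaN and infinity cases, and keep_if_contains is a hypothetical name for the If/Then/Else pattern.

```python
import math


def xpath_round(x):
    # XPath 1.0 round(): nearest integer, with halves rounded up.
    return math.floor(x + 0.5)


def xpath_substring(s, start, length):
    """Mimic XPath 1.0 substring(s, start, length) for finite numbers.

    Keeps the characters whose 1-based position p satisfies
    round(start) <= p < round(start) + round(length).
    """
    first = xpath_round(start)
    end = first + xpath_round(length)  # exclusive upper bound
    return "".join(ch for pos, ch in enumerate(s, start=1) if first <= pos < end)


def keep_if_contains(text, needle):
    """Return needle if it occurs in text, otherwise an empty string."""
    found = needle in text  # plays the role of contains(text, needle)
    # A boolean multiplies as 1 or 0, so the length is len(needle) or 0.
    return xpath_substring(needle, 1, len(needle) * found)


print(repr(keep_if_contains("RMVB, FLAC, APE", "FLAC")))
print(repr(keep_if_contains("MP3, WMA, OGG", "FLAC")))
```

The first call asks for 4 characters starting at position 1 and gets the whole needle back; the second asks for 0 characters and gets an empty string, which is the "else" branch.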

Another thing to note: contains() is case-sensitive, so if this field can contain "flac", it will return false. To check for all case mixes of FLAC, use translate().
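
In XPath 1.0 the usual form of this is contains(translate(., 'abcdefghijklmnopqrstuvwxyz', 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'), 'FLAC'), which upper-cases the ASCII letters before comparing. The Python sketch below mirrors that idea with str.translate; the function name contains_ignoring_ascii_case is hypothetical, and only ASCII letters are folded, as with the XPath expression.

```python
import string

# Character-by-character mapping, like the second and third translate() arguments.
TO_UPPER = str.maketrans(string.ascii_lowercase, string.ascii_uppercase)


def contains_ignoring_ascii_case(text, needle):
    """Case-insensitive containment check for ASCII letters only."""
    return needle.translate(TO_UPPER) in text.translate(TO_UPPER)


print(contains_ignoring_ascii_case("mp3, flac, ape", "FLAC"))
print(contains_ignoring_ascii_case("MP3, WMA, OGG", "FLAC"))
```

Both sides are folded to upper case, so the check succeeds for "flac", "Flac" and "FLAC" alike.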
