XPath to get data that starts with a specific character or string


Problem description


I need to extract certain text elements from the following code.

<div class="inhalt-links">
    <h2>
        Deutsche Verkehrswacht
        <br>
        Verkehrswacht Dortmund e. V.
        <br>
    </h2>
    <h3>
        Standnummer:&nbsp;
            <span style="font-weight: normal;">4.E08</span>
    </h3>
    <div class="clear"></div>
    <br>
    Benediktinerstraße 82
    <br>
    44287&nbsp;Dortmund
    <br>
    Deutschland
    <br>
    <br>
    Tel.:+49 231 447687
    <br>
    Fax:+49 231 447136
    <br>
    E-Mail:info@verkehrswacht-dortmund.de
    <br>
    <a href="http://www.verkehrswacht-dortmund.de" class="url" target="_blank">www.verkehrswacht-dortmund.de</a>
    <br>
    <div class="social"></div>
    <br>
</div>

To extract Tel.:+49 231 447687, I can use div[@class='inhalt-links']/text()[4]. For other details such as fax, e-mail, and website, I only need to change the position number of the text() node. However, these text nodes sometimes appear in a different order, as in the following code:

<div class="inhalt-links">
    <h2>
        DEW21
        <br>
    </h2>
    <h3>
        Standnummer:&nbsp;
            <span style="font-weight: normal;">4.B56</span>
    </h3>
    <div class="clear"></div>
    <br>
    Günter-Samtlebe-Platz 1
    <br>
    44135&nbsp;Dortmund
    <br>
    Postfach:104141
    <br>
    44041&nbsp;Dortmund
    <br>
    Deutschland
    <br>
    <br>
    Tel.:+49 231 544-0
    <br>
    Fax:+49 231 544-1130
    <br>
    E-Mail:vertrieb@dew21.de
    <br>
    <a href="http://www.dew21.de" class="url" target="_blank">www.dew21.de</a>
    <br>
    <div class="social"></div>
    <br>
</div>

Here the XPath div[@class='inhalt-links']/text()[4] selects the text "44041 Dortmund" instead of Tel.:+49 231 544-0. Is there any xpath like "div[@class='inhalt-links']/text[starts with "Tel.:"]" to select the Tel.: element?

Solution

"Is there any xpath like "//div[@class='inhalt-links']/text[starts with "Tel.:"]" to select the Tel.: element?"

Sure, try this:

//div[@class='inhalt-links']/text()[starts-with(normalize-space(), 'Tel.:')]

The XPath returns the text node (rather than an element) that, once leading and trailing whitespace has been removed*, starts with the keyword Tel.:.
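As a rough illustration, the expression can be tried against a shortened copy of the second snippet from the question. This is only a sketch: it assumes the third-party lxml library is installed, and the helper name `extract_field` and the XPath variable `$prefix` are hypothetical names introduced here so the same query can be reused for Fax: and E-Mail:.

```python
from lxml import html  # third-party dependency (assumption): pip install lxml

# Shortened copy of the second snippet from the question.
SNIPPET = """
<div class="inhalt-links">
    <h2>DEW21<br></h2>
    <div class="clear"></div>
    <br>
    Günter-Samtlebe-Platz 1
    <br>
    44135&nbsp;Dortmund
    <br>
    Postfach:104141
    <br>
    44041&nbsp;Dortmund
    <br>
    Deutschland
    <br>
    <br>
    Tel.:+49 231 544-0
    <br>
    Fax:+49 231 544-1130
    <br>
    E-Mail:vertrieb@dew21.de
    <br>
</div>
"""


def extract_field(page_source, prefix):
    """Return the text that follows `prefix` in the first matching text node, or None."""
    root = html.document_fromstring(page_source)
    # $prefix is an XPath variable; lxml binds it from the keyword argument.
    nodes = root.xpath(
        "//div[@class='inhalt-links']/text()"
        "[starts-with(normalize-space(), $prefix)]",
        prefix=prefix,
    )
    if not nodes:
        return None
    # The text node still carries its surrounding whitespace, so strip it
    # before cutting off the prefix.
    return nodes[0].strip()[len(prefix):].strip()


phone = extract_field(SNIPPET, "Tel.:")
fax = extract_field(SNIPPET, "Fax:")
print(phone)
print(fax)
```

Because the match depends on the content of the text node rather than its position, the extra Postfach lines in this snippet no longer shift the result.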


*) For reference, this is what normalize-space() does more precisely:

The normalize-space function strips leading and trailing white-space from a string, replaces sequences of whitespace characters by a single space, and returns the resulting string. [Mozilla Developer Network]
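The effect of normalize-space() can be checked in isolation. The sketch below again assumes lxml is available. The `dummy` element only serves as a context node, and the variable name `$s` is an arbitrary choice. It also shows one caveat: XPath treats only space, tab, carriage return, and line feed as whitespace, so a non-breaking space (such as the &nbsp; in the address lines) is left untouched.

```python
from lxml import etree  # third-party dependency (assumption): pip install lxml

# Any element works as a context node; the string is passed in as a variable.
context = etree.Element("dummy")

raw = "\n    Tel.:+49   231\t447687\n    "
normalized = context.xpath("normalize-space($s)", s=raw)
print(repr(normalized))

# A leading non-breaking space (U+00A0) is not XPath whitespace, so it stays.
with_nbsp = context.xpath("normalize-space($s)", s="\u00a0Tel.:+49 231 447687")
print(repr(with_nbsp))
```

If a text node could begin with a non-breaking space, the starts-with() test would fail for it, and the value would need to be cleaned up outside XPath or matched with contains() instead.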
