XPath - How to select related cousin data


Question

<html>
    <table border="1">
        <tbody>
            <tr>
                <td>
                    <table border="1">
                        <tbody>
                            <tr>
                                <th>aaa</th>
                                <th>bbb</th>
                                <th>ccc</th>
                                <th>ddd</th>
                                <th>eee</th>
                                <th>fff</th>
                            </tr>
                            <tr>
                                <td>111</td>
                                <td>222</td>
                                <td>333</td>
                                <td>444</td>
                                <td>555</td>
                                <td>666</td>
                            </tr>
                        </tbody>
                    </table>
                </td>
            </tr>
        </tbody>
    </table>
</html>

How can I select specific related cousin data using XPath? The desired output would be:

<th>aaa</th>
<th>ccc</th>
<th>fff</th>
<td>111</td>
<td>333</td>
<td>666</td>

The most important requirement is that I can include or exclude certain <th> tags together with their corresponding <td> tags.

Based on the answers so far, the closest I have is:

//th[not(contains(text(), "ddd"))] | //tr[2]/td[not(position()=4)]

Is there any way to avoid hard-coding position()=4 and instead reference the corresponding <th> tag?

Recommended answer

I'm not sure this is the best solution, but you might try the following, which uses index-of() and therefore needs XPath 2.0:

//th[not(.="bbb") and not(.="ddd") and not(.="eee")] | //tr[2]/td[not(position()=index-of(//th, "bbb")) and not(position()=index-of(//th, "ddd")) and not(position()=index-of(//th, "eee"))]

or a shorter version:

//th[not(.=("bbb", "ddd", "eee"))]| //tr[2]/td[not(position()=(index-of(//th, "bbb"), index-of(//th, "ddd"),index-of(//th, "eee")))]

You can also avoid complicated XPath expressions altogether. Try using Python + Selenium features instead:

from selenium.webdriver.common.by import By

# `driver` is assumed to be an already initialised WebDriver with the page loaded
# Get list of th elements
th_elements = driver.find_elements(By.XPATH, '//th')
# Get list of td elements
td_elements = driver.find_elements(By.XPATH, '//tr[2]/td')
# Get indexes of required th elements - [0, 2, 5]
ok_index = [i for i, th in enumerate(th_elements) if th.text not in ('bbb', 'ddd', 'eee')]
# Print the kept headers, then the cells in the same columns
for i in ok_index:
    print(th_elements[i].text)
for i in ok_index:
    print(td_elements[i].text)

The output is:

aaa
ccc
fff
111
333
666
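The same index-matching idea can be checked without a browser. The following is only a minimal sketch using Python's standard library, not part of the original answer: the helper name select_columns and the trimmed copy of the table are assumptions made for illustration, and it assumes the markup is well-formed and has a single data row.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the inner table from the question (well-formed, so an XML parser accepts it).
TABLE = """
<table>
  <tbody>
    <tr><th>aaa</th><th>bbb</th><th>ccc</th><th>ddd</th><th>eee</th><th>fff</th></tr>
    <tr><td>111</td><td>222</td><td>333</td><td>444</td><td>555</td><td>666</td></tr>
  </tbody>
</table>
"""


def select_columns(markup, excluded):
    """Hypothetical helper: return (headers, cells) for every column
    whose header text is not in `excluded`. Assumes one data row."""
    root = ET.fromstring(markup.strip())
    headers = [th.text for th in root.iter("th")]
    cells = [td.text for td in root.iter("td")]
    # Pair each header with the cell in the same column, then filter by header text.
    kept = [(h, c) for h, c in zip(headers, cells) if h not in excluded]
    return [h for h, _ in kept], [c for _, c in kept]


headers, cells = select_columns(TABLE, {"bbb", "ddd", "eee"})
print(headers)  # ['aaa', 'ccc', 'fff']
print(cells)    # ['111', '333', '666']
```

Pairing headers and cells with zip keeps each <th> tied to its <td>, so excluding a header by name drops the matching cell without any hard-coded position.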

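If you need an XPath 1.0 solution (no sequences and no index-of(); each header's position is computed as the number of preceding th siblings plus one):

//th[not(.="bbb" or .="ddd" or .="eee")] | //tr[2]/td[not(position()=count(//th[.="bbb"]/preceding-sibling::th)+1 or position()=count(//th[.="ddd"]/preceding-sibling::th)+1 or position()=count(//th[.="eee"]/preceding-sibling::th)+1)]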
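To see what count(.../preceding-sibling::th)+1 evaluates to, here is a rough standard-library Python sketch of the same lookup. The helper column_position and the trimmed table are hypothetical names introduced for illustration. ElementTree supports only a small XPath subset, so the sibling count is emulated with enumerate and only the final td[n] step is a real path expression.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the inner table from the question.
TABLE = """<table><tbody>
<tr><th>aaa</th><th>bbb</th><th>ccc</th><th>ddd</th><th>eee</th><th>fff</th></tr>
<tr><td>111</td><td>222</td><td>333</td><td>444</td><td>555</td><td>666</td></tr>
</tbody></table>"""


def column_position(header_row, name):
    """Hypothetical helper: 1-based position of the <th> whose text equals
    `name`, i.e. the number of preceding <th> siblings plus one."""
    for preceding, th in enumerate(header_row.findall("th")):
        if th.text == name:
            return preceding + 1
    raise ValueError(f"no header named {name!r}")


root = ET.fromstring(TABLE)
header_row, data_row = root.findall("./tbody/tr")

pos = column_position(header_row, "ddd")
cell = data_row.find(f"td[{pos}]")  # positional predicate, 1-based as in XPath
print(pos, cell.text)  # 4 444
```

The header "ddd" has three preceding siblings, so its position is 4, which is the value the question originally hard-coded as position()=4.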
