XPATH partial match tr id with Python, Selenium


Question

Could I please have the correct XPATH to extract the tr id="review_" elements as well? I managed to get the elements, but I'm missing out on the IDs because they are only a partial match.

<table class="admin">
<thead>"snip"</thead>
<tbody>
    <tr id="review_984669" class="">
    <td>weird_wild_and_wonderful_mammals</td>
    <td>1</td>
    <td><input type="checkbox" name="book_review[approved]" id="approved" value="1" class="attribute_toggle"></td>
    <td><input type="checkbox" name="book_review[rejected]" id="rejected" value="1" class="attribute_toggle"></td>
    <td>February 27, 2019 03:56</td>
    <td><a href="/admin/new_book_reviews/984669?page=2">Show</a></td>
    <td>
        <span class="rest-in-place" data-attribute="review" data-object="book_review" data-url="/admin/new_book_reviews/984669">
bad
        </span>
    </td>
    </tr>
    <tr id="review_984670" class="striped">

I used Selenium with Chrome to extract the only table on the page.

Table_Selenium_Elements = driver.find_element_by_xpath('//*[@id="admin"]/table')

Then I was using the below to get the data from each row.

for Pri_Key, element in enumerate(Table_Selenium_Elements.find_elements_by_xpath('.//tr')):
    # Create an empty secondary dict for each new Pri Key
    sec = {}
    # Secondary dictionary needs a Key. Keys are items in column_headers list
    for counter, Sec_Key in enumerate(column_headers):
        # Secondary dictionary needs Values for each key.
        # Values are individual items in each sub-list of column_data list
        # Slice the sub list with the counter to get each item
        sec[Sec_Key] = element.get_attribute('innerHTML')[counter]
    pri[Pri_Key] = sec

This is only showing the cell data in each row, i.e. "weird_wild_and_wonderful_mammals", "1".

BUT I actually need the tr id="review_xxx" as well, and I don't know how to do this. The id number changes, so maybe an XPath 'contains' expression or an XPath 'starts-with' expression would work.

Since I'm a noob, I think I have captured the review_ID but am not extracting it correctly via my for loop.

Could someone please show me the correct XPATH to extract the parent tr and child tds... and then I will tweak my for loop. Thank you, Sam

Answer

Based on your HTML, you can get all the rows with any of the selectors below:

admin_table_rows = driver.find_elements_by_css_selector(".admin tbody > tr")
admin_table_rows = driver.find_elements_by_css_selector(".admin tr[id^='review_']")
admin_table_rows = driver.find_elements_by_xpath("//table[@class='admin']//tr[starts-with(@id,'review_')]")

To get the id attribute, you can use the element.get_attribute("id") method.

Here is an example of how you can scrape the data:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.Chrome()
wait = WebDriverWait(driver, 10)

admin_table_rows = wait.until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, ".admin tr[id^='review_']")))

for row in admin_table_rows:
    row_id = row.get_attribute("id").replace("review_", "")
    label = row.find_element_by_css_selector("td:nth-child(1)").text
    num = row.find_element_by_css_selector("td:nth-child(2)").text
    date = row.find_element_by_css_selector("td:nth-child(5)").text  # the date is in the fifth cell in your HTML
    href = row.find_element_by_css_selector("a").get_attribute("href")
