XPath partial match on tr id with Python and Selenium
Question
Could I please have the correct XPath to extract the tr id="review_" elements as well? I managed to get the elements, but I am out of luck on the IDs because they are only a partial match.
<table class="admin">
<thead>"snip"</thead>
<tbody>
<tr id="review_984669" class="">
<td>weird_wild_and_wonderful_mammals</td>
<td>1</td>
<td><input type="checkbox" name="book_review[approved]" id="approved" value="1" class="attribute_toggle"></td>
<td><input type="checkbox" name="book_review[rejected]" id="rejected" value="1" class="attribute_toggle"></td>
<td>February 27, 2019 03:56</td>
<td><a href="/admin/new_book_reviews/984669?page=2">Show</a></td>
<td>
<span class="rest-in-place" data-attribute="review" data-object="book_review" data-url="/admin/new_book_reviews/984669">
bad
</span>
</td>
</tr>
<tr id="review_984670" class="striped">
I used Selenium with Chrome to extract the only table on the page.
Table_Selenium_Elements = driver.find_element_by_xpath('//*[@id="admin"]/table')
Then I used the code below to get the data from each row.
for Pri_Key, element in enumerate(Table_Selenium_Elements.find_elements_by_xpath('.//tr')):
    # Create an empty secondary dict for each new Pri Key
    sec = {}
    # Secondary dictionary needs a Key. Keys are items in column_headers list
    for counter, Sec_Key in enumerate(column_headers):
        # Secondary dictionary needs Values for each key.
        # Values are individual items in each sub-list of column_data list
        # Slice the sub list with the counter to get each item
        sec[Sec_Key] = element.get_attribute('innerHTML')[counter]
    pri[Pri_Key] = sec
This only shows the cell data for each row, i.e. "weird_wild_and_wonderful_mammals", "1".
But I actually need the tr id=review_xxx as well, and I don't know how to get it. The id number changes, so maybe I need an XPath 'contains' expression or an XPath 'begins_with' expression.
Since I'm a noob, I think I have captured the review ID but am not extracting it correctly in my for loop.
Could someone please show me the correct XPath to extract the parent tr and its child tds? Then I will tweak my for loop. Thank you, Sam
Recommended answer
Based on your HTML, any of the selectors below will get all the rows:
admin_table_rows = driver.find_elements_by_css_selector(".admin tbody > tr")
admin_table_rows = driver.find_elements_by_css_selector(".admin tr[id^='review_']")
admin_table_rows = driver.find_elements_by_xpath("//table[@class='admin']//tr[starts-with(@id,'review_')]")
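The question asks whether a 'contains' or a 'begins with' match is the right one. The sketch below is an offline illustration only, not Selenium code: it parses a simplified, well-formed stand-in for the table with Python's standard library and filters the ids in plain Python, mirroring what XPath starts-with(@id,'review_') and contains(@id,'review_') would select. The "header" and "old_review_1" rows and the placeholder cell texts are hypothetical, added just to show where the two matches differ.

```python
import xml.etree.ElementTree as ET

# Simplified stand-in for the table in the question. The "header" and
# "old_review_1" rows are hypothetical and exist only for the comparison.
SAMPLE = """
<table class="admin">
  <tbody>
    <tr id="header"><td>placeholder</td></tr>
    <tr id="review_984669"><td>weird_wild_and_wonderful_mammals</td></tr>
    <tr id="review_984670"><td>placeholder</td></tr>
    <tr id="old_review_1"><td>placeholder</td></tr>
  </tbody>
</table>
"""


def row_ids(markup):
    """Return the id attribute of every tr element, in document order."""
    root = ET.fromstring(markup)
    return [tr.get("id", "") for tr in root.iter("tr")]


def starts_with(ids, prefix):
    """Python equivalent of XPath starts-with(@id, prefix)."""
    return [i for i in ids if i.startswith(prefix)]


def contains(ids, fragment):
    """Python equivalent of XPath contains(@id, fragment)."""
    return [i for i in ids if fragment in i]


ids = row_ids(SAMPLE)
prefix_matches = starts_with(ids, "review_")
substring_matches = contains(ids, "review_")
print(prefix_matches)
print(substring_matches)
```

Because the ids in the question always begin with the fixed text review_, the prefix match is the tighter choice; a substring match would also pick up any id that merely has review_ somewhere inside it.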
To get the id attribute, use the element.get_attribute("id") method.
Here is an example of how you can scrape the data:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# Assumes `driver` is an already-started webdriver with the admin page loaded.
wait = WebDriverWait(driver, 10)
admin_table_rows = wait.until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, ".admin tr[id^='review_']")))
for row in admin_table_rows:
    # Strip the fixed prefix to keep only the numeric part of the id
    row_id = row.get_attribute("id").replace("review_", "")
    # find_element(By..., ...) works in both Selenium 3 and Selenium 4
    label = row.find_element(By.CSS_SELECTOR, "td:nth-child(1)").text
    num = row.find_element(By.CSS_SELECTOR, "td:nth-child(2)").text
    # The date is the fifth cell; cells 3 and 4 hold the approve/reject checkboxes
    date = row.find_element(By.CSS_SELECTOR, "td:nth-child(5)").text
    href = row.find_element(By.CSS_SELECTOR, "a").get_attribute("href")
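The question builds a nested dictionary (pri and sec) from the rows. A minimal sketch of that step, keyed by the review id instead of the loop counter, might look like the following. The helper name rows_to_dict and the header names "title" and "count" are hypothetical (the real headers were snipped from the question), and FakeRow and FakeCell are stand-ins so the example runs without a browser; with Selenium you would pass in the list of row elements instead.

```python
# Locator strategy string; this is the same value as selenium's By.TAG_NAME.
TAG_NAME = "tag name"


def rows_to_dict(rows, column_headers):
    """Map each row's numeric review id to a {header: cell text} dict.

    `rows` can be any objects offering get_attribute() and find_elements(),
    such as Selenium WebElements for the matched tr rows.
    """
    table = {}
    for row in rows:
        # "review_984669" -> "984669"
        review_id = row.get_attribute("id").replace("review_", "", 1)
        cells = row.find_elements(TAG_NAME, "td")
        # zip stops at the shorter sequence, so extra cells are ignored
        table[review_id] = {
            header: cell.text for header, cell in zip(column_headers, cells)
        }
    return table


class FakeCell:
    """Stand-in for a td element; only the .text attribute is needed."""

    def __init__(self, text):
        self.text = text


class FakeRow:
    """Stand-in for a tr element, used only to run the sketch offline."""

    def __init__(self, row_id, texts):
        self._id = row_id
        self._cells = [FakeCell(t) for t in texts]

    def get_attribute(self, name):
        return self._id if name == "id" else None

    def find_elements(self, by, value):
        return self._cells if (by, value) == (TAG_NAME, "td") else []


demo_rows = [FakeRow("review_984669", ["weird_wild_and_wonderful_mammals", "1"])]
result = rows_to_dict(demo_rows, ["title", "count"])
print(result)
```

Keying the outer dictionary by the review id means each entry can be traced back to its row on the page even if the row order changes.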