Python: Selenium, NoSuchElementException on generic XPATH
Problem description
I have code which allows me to return all searched parts from a specific website, given a keyword.
When the search term "HL4RPV-50" is used, I can get back all returned values as expected.
When I use the search term "FSJ4-50B", I get a NoSuchElementException for the line:
---> 53 price = product.find_element_by_xpath(".//div[@class='price']").text.split('\n')[1]
The direct XPATH is:
//*[@id="search"]/div[3]/div[2]/div[2]/div[2]/div[6]/div[2]/div[1]/div[1]/div/div[4]/div/add-product-to-cart/div[1]
This direct XPATH is NOT the same for both part IDs. Furthermore, each part ID has a slightly different XPATH depending on where the part appears in its search results.
I was under the impression that I could use a relative XPATH to resolve this issue.
The site I am trying to scrape is Tessco.com, and a generic username/password is specified in the code below.
Identifying the XPATH ID:
To make a generic XPATH, I was under the impression that I should use . to select the current node, and // to select nodes in the document from the current node that match the selection no matter where they are. I then specified the element type, here div, and then @class='price'.
For "HL4RPV-50" this gives me what I want; for "FSJ4-50B" it does not.
I believe I have the wrong XPATH, but I am unsure how to generalize it.
Any suggestions?
The Code:
import time
#Need Selenium for interacting with web elements
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
#Need numpy/pandas to interact with large datasets
import numpy as np
import pandas as pd
chrome_path = r"C:\Users\James\Documents\Python Scripts\jupyterNoteBooks\ScrapingData\chromedriver_win32\chromedriver.exe"
driver = webdriver.Chrome(chrome_path)
driver.get("https://www.tessco.com/login")
userName = "FirstName.SurName321123@gmail.com"
password = "PasswordForThis123"
#Set a wait, for elements to load into the DOM
wait10 = WebDriverWait(driver, 10)
wait20 = WebDriverWait(driver, 20)
wait30 = WebDriverWait(driver, 30)
elem = wait10.until(EC.element_to_be_clickable((By.ID, "userID")))
elem.send_keys(userName)
elem = wait10.until(EC.element_to_be_clickable((By.ID, "password")))
elem.send_keys(password)
#Press the login button
driver.find_element_by_xpath("/html/body/account-login/div/div[1]/form/div[6]/div/button").click()
#Expand the search bar
searchIcon = wait10.until(EC.element_to_be_clickable((By.XPATH, "/html/body/header/div[2]/div/div/ul/li[2]/i")))
searchIcon.click()
searchBar = wait10.until(EC.element_to_be_clickable((By.XPATH, '/html/body/header/div[3]/input')))
searchBar.click()
#load in manufacture part number from a collection of components, via an Excel file
#Enter information into the search bar
searchBar.send_keys("FSJ4-50B" + '\n')
# wait for the products information to be loaded
products = wait30.until(EC.presence_of_all_elements_located((By.XPATH,"//div[@class='CoveoResult']")))
# create a dictionary to store product and price
productInfo = {}
# iterate through all products in the search result and add details to dictionary
for product in products:
    # get product name
    productName = product.find_element_by_xpath(".//a[@class='productName CoveoResultLink hidden-xs']").text
    # get price
    price = product.find_element_by_xpath(".//div[@class='price']").text.split('\n')[1]
    # add details to dictionary
    productInfo[productName] = price
# print products information
print(productInfo)
#time.sleep(5)
driver.close()
This is the working code. I disabled images because my Internet connection was slow and the website was taking a long time to load the page. I used a CSS selector instead of XPath for the price, and it is fully working:
import time
#Need Selenium for interacting with web elements
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.keys import Keys
#Need numpy/pandas to interact with large datasets
import numpy as np
import pandas as pd
chrome_path = r".\web_driver\chromedriver.exe"
chrome_options = webdriver.ChromeOptions()
prefs = {"profile.managed_default_content_settings.images": 2}
chrome_options.add_experimental_option("prefs", prefs)
driver = webdriver.Chrome(chrome_path, chrome_options=chrome_options)
driver.maximize_window()
driver.get("https://www.tessco.com/login")
userName = "FirstName.SurName321123@gmail.com"
password = "PasswordForThis123"
#Set a wait, for elements to load into the DOM
wait10 = WebDriverWait(driver, 10)
wait20 = WebDriverWait(driver, 20)
wait30 = WebDriverWait(driver, 30)
elem = wait10.until(EC.element_to_be_clickable((By.ID, "userID")))
elem.send_keys(userName)
elem = wait10.until(EC.element_to_be_clickable((By.ID, "password")))
elem.send_keys(password)
#Press the login button
driver.find_element_by_xpath("/html/body/account-login/div/div[1]/form/div[6]/div/button").click()
#Expand the search bar
# searchIcon = wait10.until(EC.element_to_be_clickable((By.XPATH, "")))
# searchIcon.click()
searchBar = wait10.until(EC.element_to_be_clickable((By.CSS_SELECTOR, "#searchBar input")))
#Enter information into the search bar
searchBar.send_keys("FSJ4-50B")
driver.find_element_by_css_selector('a.inputButton').click()
# crude fixed pause to give the search results time to load
time.sleep(5)
# collect every product container in the search results
products = driver.find_elements_by_xpath("//div[@class='CoveoResult']")
# create a dictionary to store product and price
productInfo = {}
# iterate through all products in the search result and add details to dictionary
for product in products:
    # get product name (the leading '.' keeps the search inside this product;
    # a bare '//' would search the whole page and return the first match every time)
    productName = product.find_element_by_xpath(".//a[@class='productName CoveoResultLink hidden-xs']").text
    # get price
    price = product.find_element_by_css_selector("div.price").text.split('\n')[1]
    # add details to dictionary
    productInfo[productName] = price
# print products information
print(productInfo)
#time.sleep(5)
driver.close()
Output:
{"8' Jumper-FSJ4-50B NM/NM": '$147.55'}
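One possible reason the CSS selector succeeds where the XPath did not can be sketched offline with the standard library: the XPath test @class='price' compares the whole attribute string, while div.price only requires price to be one of the element's class tokens. The markup below is hypothetical (the real Tessco DOM is not reproduced here), and the extra "discounted" class on the second result is an assumption made purely for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical search-result markup, NOT copied from the real site.
DOC = """
<div id="search">
  <div class="CoveoResult">
    <a class="productName">Part A</a>
    <div class="price">$10.00</div>
  </div>
  <div class="CoveoResult">
    <a class="productName">Part B</a>
    <div class="price discounted">$20.00</div>
  </div>
</div>
""".strip()


def has_class(element, name):
    """True if `name` is one of the space-separated class tokens (CSS `.name` semantics)."""
    return name in element.get("class", "").split()


root = ET.fromstring(DOC)
results = root.findall(".//div[@class='CoveoResult']")

exact = {}      # what an exact @class='price' comparison finds
by_token = {}   # what a token-based match (like div.price) finds
for result in results:
    name = result.find(".//a").text

    # Exact string comparison: misses class="price discounted".
    node = result.find(".//div[@class='price']")
    exact[name] = node.text if node is not None else None

    # Token comparison: matches any div whose class list contains "price".
    matches = [d for d in result.iter("div") if has_class(d, "price")]
    by_token[name] = matches[0].text if matches else None

print(exact)
print(by_token)
```

In this sketch the exact comparison finds no price for "Part B", while the token-based match finds both. If the failing part's price element carries an additional class on the real page, that would explain the NoSuchElementException, but this needs to be confirmed by inspecting the DOM.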
Edited:
How to choose selector
As you can see in the screenshot above, hovering over the searchBar shows that it has an ID, and since an ID is supposed to be unique on a web page, we can also use:
driver.find_element_by_id("searchBar")
but to reach the input field I prefer a css_selector, and then send keys.
Finding the a.inputButton CSS selector:
Inspect the searchButton and you will see the following HTML in the DOM:
<a class="CoveoSearchButton inputButton button"><span class="coveo-icon">Search</span><i class="fa fa-search" aria-hidden="true"></i></a>
We know <a> is the anchor tag, and from the HTML above we can deduce that one possible css_selector is:
a.inputButton
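As a rough illustration of that deduction, the snippet below parses the same HTML with the standard library and lists every tag.class candidate it contains. The ClassCollector helper is hypothetical and only meant to show where a.inputButton comes from; it does not check whether a candidate is unique on the live page.

```python
from html.parser import HTMLParser


class ClassCollector(HTMLParser):
    """Collects a 'tag.class' candidate for every class token on every start tag."""

    def __init__(self):
        super().__init__()
        self.selectors = []

    def handle_starttag(self, tag, attrs):
        classes = dict(attrs).get("class") or ""
        for token in classes.split():
            self.selectors.append(f"{tag}.{token}")


# The search button markup quoted above.
SNIPPET = (
    '<a class="CoveoSearchButton inputButton button">'
    '<span class="coveo-icon">Search</span>'
    '<i class="fa fa-search" aria-hidden="true"></i></a>'
)

collector = ClassCollector()
collector.feed(SNIPPET)
print(collector.selectors)
```

Any of the three candidates for the anchor (a.CoveoSearchButton, a.inputButton, a.button) would match this element; which one to use depends on which is least likely to match something else on the page.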
NOTE
This class is unique on this page, but sometimes the same class name is used by several different elements on the same page. In that case you have to start from a higher-level node to reach the child element. For example, a.inputButton can also be reached with another css_selector for the searchButton:
div.divCoveoSearchbox > a.inputButton
because the div is the parent element of our inputButton's anchor tag.
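If you would rather stay with XPath, the same parent > child selector can be written with the usual class-token idiom. This is a minimal sketch: class_test and child_xpath are hypothetical helper names, and the generated expression only builds a string that you would then pass to your XPath lookup.

```python
def class_test(class_name):
    """XPath predicate that is true when class_name is one of the class tokens."""
    return f"contains(concat(' ', normalize-space(@class), ' '), ' {class_name} ')"


def child_xpath(parent_tag, parent_class, child_tag, child_class):
    """XPath counterpart of the CSS selector 'parent.cls > child.cls'."""
    return (
        f"//{parent_tag}[{class_test(parent_class)}]"
        f"/{child_tag}[{class_test(child_class)}]"
    )


# Counterpart of: div.divCoveoSearchbox > a.inputButton
expression = child_xpath("div", "divCoveoSearchbox", "a", "inputButton")
print(expression)
```

The single / between the two steps plays the role of the CSS > combinator (direct child), and the padded contains(...) test avoids the exact-string comparison that @class='...' performs.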
I hope that clears things up.