How do you use EC.presence_of_element_located((By.ID, "myDynamicElement")) to specify a class instead of an ID?
Question
I am trying to use Python to scrape a website that loads its HTML dynamically: embedded JavaScript files fetch the data and render it into the page. If I use BeautifulSoup alone, I cannot retrieve the data I need, because my program scrapes the page before the JavaScript has loaded it. For this reason I am integrating the selenium library into my code, so that my program waits until a certain element is found before it scrapes the website.
I was originally doing this:
element = WebDriverWait(driver,100).until(EC.presence_of_element_located((By.ID, "tabla_evolucion")))
But I want to specify a class instead, with something like:
element = WebDriverWait(driver,100).until(EC.presence_of_element_located((By.class, "ng-binding ng-scope")))
Here is the rest of my code:
driver_path = 'C:/webDrivers/chromedriver.exe'
driver = webdriver.Chrome(executable_path=driver_path)
driver.header_overrides = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/53.0.2785.143 Safari/537.36'}
url = "myurlthatIamscraping.com"
response = driver.get(url)
html = driver.page_source
characters = len(html)
element = WebDriverWait(driver,100).until(EC.presence_of_element_located((By.class, "ng-binding ng-scope")))
print(html)
print(characters)
time.sleep(10)
driver.quit()
It is not working for me, and I cannot find the right syntax anywhere.
Recommended answer
The relevant HTML would have helped us construct a more canonical answer. However, to start with your first line of code:
element = WebDriverWait(driver,100).until(EC.presence_of_element_located(
(By.ID, "tabla_evolucion")))
is pretty much legitimate, whereas the second line of code:
element = WebDriverWait(driver,100).until(EC.presence_of_element_located(
(By.class, "ng-binding ng-scope")))
would raise an error such as:
Message: invalid selector: Compound class names not permitted
as you can't pass multiple classes through By.CLASS_NAME (By.class itself is not valid Python, because class is a reserved word).
You can find a detailed discussion in "Invalid selector: Compound class names not permitted using find_element_by_class_name with Webdriver and Python".
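Because a compound value such as "ng-binding ng-scope" cannot be passed as a single class name, one option is to turn the class attribute into a CSS selector first. The helper below is a minimal sketch of that conversion. The name classes_to_css is hypothetical and not part of Selenium; the resulting string is meant to be passed together with By.CSS_SELECTOR.

```python
def classes_to_css(class_attr):
    """Turn a space-separated class attribute into a CSS selector.

    Hypothetical helper for illustration; it is not part of Selenium.
    """
    names = class_attr.split()
    if not names:
        raise ValueError("expected at least one class name")
    # Each class gets a leading dot. Chaining them without spaces
    # means "an element that carries all of these classes".
    return "".join("." + name for name in names)


selector = classes_to_css("ng-binding ng-scope")
print(selector)  # .ng-binding.ng-scope

# Intended use with Selenium (not executed here):
# WebDriverWait(driver, 20).until(
#     EC.presence_of_element_located((By.CSS_SELECTOR, selector)))
```

This sketch assumes plain class names; names containing characters that are special in CSS would need escaping.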
Solution
You need to take care of a couple of things, as follows:
- Without any visibility into your use case: functionally, inducing WebDriverWait with the expected condition presence_of_element_located() merely confirms that the element is present in the DOM tree. Presumably you will next either read attributes such as value or innerText, or interact with the element. So instead of presence_of_element_located() you need to use either visibility_of_element_located() or element_to_be_clickable().
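The difference between a presence check and a visibility check can be illustrated with a toy model. The sketch below does not use Selenium at all: FakeElement and wait_until are hypothetical stand-ins, written only to show that a presence-style condition can succeed while the element is still hidden, whereas a visibility-style condition keeps polling until the element is displayed.

```python
import time


class FakeElement:
    """Toy stand-in for a page element that becomes visible after a delay."""

    def __init__(self, visible_after):
        self._visible_at = time.monotonic() + visible_after

    def is_displayed(self):
        return time.monotonic() >= self._visible_at


def wait_until(condition, timeout=2.0, poll=0.05):
    """Call condition() repeatedly until it returns something truthy."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met within %.1f s" % timeout)
        time.sleep(poll)


# The element is already "in the DOM" but hidden for the first 0.3 s.
element = FakeElement(visible_after=0.3)

# Presence-style check: succeeds at once, while the element is still hidden.
present = wait_until(lambda: element)
visible_when_found = present.is_displayed()

# Visibility-style check: keeps polling until the element is displayed.
shown = wait_until(lambda: element if element.is_displayed() else False)

print(visible_when_found, shown.is_displayed())  # False True
```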
- For an optimum result, you can combine the ID and CLASS attributes and use either of the following Locator Strategies:
- Using CSS_SELECTOR:
element = WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, ".ng-binding.ng-scope#tabla_evolucion")))
- Using XPATH:
element = WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//*[@class='ng-binding ng-scope' and @id='tabla_evolucion']")))
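One further detail in the question's script: driver.page_source is read before the WebDriverWait call, so the HTML it captures may not yet contain the dynamically loaded data. A minimal sketch of waiting first and reading the page afterwards might look like this. The function name fetch_rendered_html and its parameters are hypothetical, and the Selenium imports are placed inside the function only so that the sketch can be loaded without Selenium installed.

```python
def fetch_rendered_html(driver, css_selector, timeout=20):
    """Wait for an element to become visible, then return the page source.

    Hypothetical helper; assumes `driver` is a Selenium WebDriver that has
    already navigated to the target page with driver.get(url).
    """
    # Imported lazily so this sketch can be loaded without Selenium installed.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    # Block until the element is visible (raises TimeoutException otherwise).
    WebDriverWait(driver, timeout).until(
        EC.visibility_of_element_located((By.CSS_SELECTOR, css_selector))
    )
    # Read the HTML only after the wait, so the rendered data is included.
    return driver.page_source


# Intended use (not executed here):
# driver.get(url)
# html = fetch_rendered_html(driver, ".ng-binding.ng-scope#tabla_evolucion")
# print(len(html))
```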