How do you use EC.presence_of_element_located((By.ID, "myDynamicElement")) to specify a class instead of an ID?


Problem description

I am trying to use Python to scrape a website that builds its HTML dynamically: embedded JavaScript files fetch the data and render the response into the page. If I use BeautifulSoup alone, I cannot retrieve the data I need, because my program scrapes the page before the JavaScript has loaded it. For this reason I am integrating the selenium library into my code, so that my program waits until a certain element is found before it scrapes the website.

I was originally doing this:

element = WebDriverWait(driver,100).until(EC.presence_of_element_located((By.ID, "tabla_evolucion")))

But I want to specify a class instead, by doing something like:

element = WebDriverWait(driver,100).until(EC.presence_of_element_located((By.class, "ng-binding ng-scope")))  

Here is the rest of my code:

driver_path = 'C:/webDrivers/chromedriver.exe'
driver = webdriver.Chrome(executable_path=driver_path)
driver.header_overrides = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/53.0.2785.143 Safari/537.36'}
url = "myurlthatIamscraping.com" 
response = driver.get(url)
html = driver.page_source
characters = len(html)
element = WebDriverWait(driver,100).until(EC.presence_of_element_located((By.class, "ng-binding ng-scope")))

print(html)
print(characters)
time.sleep(10)
driver.quit()

It is not working for me, and I cannot find the right syntax anywhere.

Recommended answer

The relevant HTML would have helped us construct a more canonical answer. However, to start with, your first line of code:

element = WebDriverWait(driver,100).until(EC.presence_of_element_located(
  (By.ID, "tabla_evolucion")))

is pretty much legitimate, whereas the second line of code:

element = WebDriverWait(driver,100).until(EC.presence_of_element_located(
  (By.class, "ng-binding ng-scope")))

will raise an error like:

Message: invalid selector: Compound class names not permitted

as you can't pass multiple classes through By.class (spelled By.CLASS_NAME in the Python bindings, where class is a reserved word).

You can find a detailed discussion in "Invalid selector: Compound class names not permitted using find_element_by_class_name with Webdriver and Python".
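To make the compound-class point concrete, here is a minimal sketch of turning a space-separated class value into a CSS-selector locator. The helper name `class_locator` is hypothetical (it is not part of Selenium), and the sketch assumes plain class names that need no CSS escaping. The string "css selector" is the value of `By.CSS_SELECTOR`, so the returned tuple can be passed wherever a `(By.…, value)` locator is expected.

```python
def class_locator(class_attr):
    """Turn a space-separated class attribute value into a CSS-selector locator.

    Hypothetical helper for illustration; assumes simple class names
    (letters, digits, hyphens) that need no CSS escaping.
    """
    # "ng-binding ng-scope" -> ".ng-binding.ng-scope"
    # (matches elements that carry both classes, in any order)
    selector = "".join("." + name for name in class_attr.split())
    # "css selector" is the string value of selenium's By.CSS_SELECTOR
    return ("css selector", selector)


locator = class_locator("ng-binding ng-scope")
print(locator)  # ('css selector', '.ng-binding.ng-scope')
```

The resulting tuple could then be handed to `EC.presence_of_element_located(locator)` in place of the `(By.class, "ng-binding ng-scope")` locator from the question.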


Solution

You need to take care of a couple of things as follows:

  • Without any visibility into your use case: inducing WebDriverWait together with EC.presence_of_element_located() merely confirms that the element is present in the DOM tree. Presumably you will then either read its attributes (e.g. value, innerText) or interact with the element. So instead of presence_of_element_located() you need to use either visibility_of_element_located() or element_to_be_clickable().

    You can find a detailed discussion in "WebDriverWait not working as expected".

  • For an optimum result you can combine the ID and CLASS attributes, and you can use either of the following Locator Strategies:

      • Using CSS_SELECTOR:

        element = WebDriverWait(driver, 20).until(EC.visibility_of_element_located(
          (By.CSS_SELECTOR, ".ng-binding.ng-scope#tabla_evolucion")))
      

      • Using XPATH:

        element = WebDriverWait(driver, 20).until(EC.visibility_of_element_located(
          (By.XPATH, "//*[@class='ng-binding ng-scope' and @id='tabla_evolucion']")))
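
Putting the pieces together: the script in the question reads driver.page_source before the wait runs, so the captured HTML may still lack the dynamically rendered data. The following is a minimal sketch of the order load, wait, then read. The function name `page_source_after_wait` and its parameters are hypothetical, and the driver is assumed to be an already created Selenium WebDriver instance.

```python
def page_source_after_wait(driver, url, locator, timeout=20):
    """Load `url`, wait until `locator` is visible, then return the rendered HTML.

    Hypothetical helper for illustration; `driver` is assumed to be a
    Selenium WebDriver and `locator` a (strategy, value) tuple.
    """
    # Imported here so the helper can be defined where Selenium is not installed.
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver.get(url)
    # Raises TimeoutException if the element is not visible within `timeout` seconds.
    WebDriverWait(driver, timeout).until(EC.visibility_of_element_located(locator))
    # Read page_source only after the wait, so the rendered content is included.
    return driver.page_source


# Possible usage (needs a real browser and driver, so it is left commented out):
# from selenium import webdriver
# from selenium.webdriver.common.by import By
# driver = webdriver.Chrome()
# try:
#     html = page_source_after_wait(
#         driver, "myurlthatIamscraping.com",
#         (By.CSS_SELECTOR, ".ng-binding.ng-scope#tabla_evolucion"))
#     print(len(html))
# finally:
#     driver.quit()
```

The returned HTML can then be handed to BeautifulSoup as before; wrapping the call in try/finally makes sure the browser is closed even when the wait times out.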
        
