How to extract all the texts from <a> tags using Selenium through Python

Views: 193 | Published: 2020/8/10 19:23:22 | Tags: python selenium xpath css-selectors webdriverwait

Question

Here is the link to the website I want to extract data from. I'm trying to get the text of every anchor tag that carries an href attribute. Here is the sample HTML:

<div id="borderForGrid" class="border">
  <h5 class="">
    <a href="/products/product-details/?prod=30AD">A/D TC-55 SEALER</a>
  </h5>

<div id="borderForGrid" class="border">
  <h5 class="">
    <a href="/products/product-details/?prod=P380">Carbocrylic 3356-1</a>
 </h5>

I want to extract all text values like ['A/D TC-55 SEALER','Carbocrylic 3356-1'].
I tried with:

target = driver.find_element_by_class_name('border')
anchorElement = target.find_element_by_tag_name('a')
anchorElement.text

But it returns an empty string ('').

Any suggestions on how this can be achieved?

PS - Under product type

Recommended answer

To extract all the text values within the <a> tags, e.g. ['A/D TC-55 SEALER','Carbocrylic 3356-1'], you need to use WebDriverWait with the expected condition visibility_of_all_elements_located(), so that the links are rendered and visible before you read them. Either of the following locator strategies works:

Using CSS_SELECTOR:

print([my_elem.get_attribute("innerHTML") for my_elem in WebDriverWait(driver, 5).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "li.topLevel[data-types='Acrylics'] h5>a[href^='/products/product-details/?prod=']")))])

Using XPATH:

print([my_elem.get_attribute("innerHTML") for my_elem in WebDriverWait(driver, 5).until(EC.visibility_of_all_elements_located((By.XPATH, "//li[@class='topLevel' and @data-types='Acrylics']//h5[@class]/a[starts-with(@href, '/products/product-details/?prod=')]")))])

Note: You have to add the following imports:

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
