Get links from a certain div using Selenium in Python

Question

I have the following HTML page. I want to get all the links inside a specific div. Here is my HTML code:

<div class="rec_view">
    <a href='www.xyz.com/firstlink.html'>
        <img src='imga.png'>
    </a>
    <a href='www.xyz.com/seclink.html'>
        <img src='imgb.png'>
    </a>
    <a href='www.xyz.com/thrdlink.html'>
        <img src='imgc.png'>
    </a>
</div>

I want to get all the links inside the rec_view div. The links I want are:

www.xyz.com/firstlink.html
www.xyz.com/seclink.html
www.xyz.com/thrdlink.html

Here is the Python code I tried:

from selenium import webdriver;
webpage = r"https://www.testurl.com/page/123/"
driver = webdriver.Chrome(r"C:\chromedriver_win32\chromedriver.exe")
driver.get(webpage)
element = driver.find_element_by_css_selector("div[class='rec_view']>a")
link = element.get_attribute("href")
print(link)

How can I get these links using Selenium in Python?

Recommended answer

Based on the HTML you shared, you can get a list of all the links inside the rec_view div with the following code block:

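from selenium import webdriver

# Start Chrome using the local chromedriver binary
driver = webdriver.Chrome(executable_path=r'C:\chromedriver_win32\chromedriver.exe')
driver.get('https://www.testurl.com/page/123/')
# find_elements_* returns a list of every <a> inside the rec_view div
elements = driver.find_elements_by_css_selector("div.rec_view a")
for element in elements:
    # Print the href attribute of each matched link
    print(element.get_attribute("href"))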

Note: Because you need to collect the href attributes of all the links in the div, use find_elements_* instead of find_element_*, which returns only the first match. Additionally, > selects only immediate <a> children, whereas the descendant selector div.rec_view a matches every <a> nested at any depth inside the div, so that is the css_selector to use.
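
Newer Selenium 4 releases dropped the find_elements_by_* helpers in favor of find_elements(by, value), so the lookup above may need adjusting depending on the installed version. The following is a minimal sketch of the same idea under that API, not a definitive implementation. The helper name collect_hrefs is hypothetical, and the locator strategy is passed as the plain string "css selector", which is assumed here to stand in for Selenium's By.CSS_SELECTOR constant so the helper itself needs no imports.

```python
# Sketch for Selenium 4-style lookups. collect_hrefs is a hypothetical helper,
# not part of Selenium itself.

# Assumption: this is the same string value as selenium's By.CSS_SELECTOR.
CSS_SELECTOR = "css selector"


def collect_hrefs(driver, selector="div.rec_view a"):
    """Return the href attribute of every element matching the CSS selector."""
    # find_elements (plural) returns a list of all matches, possibly empty.
    elements = driver.find_elements(CSS_SELECTOR, selector)
    return [element.get_attribute("href") for element in elements]
```

With a live session, the helper would be called as collect_hrefs(driver) after driver.get(...). Importing By from selenium.webdriver.common.by and passing By.CSS_SELECTOR is the more conventional spelling of the locator strategy.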
