I am unable to scrape each link content for a specific time period from Indeed


Problem Description

I am new to Python and web scraping, and new to programming in general; I am still practicing. I am using Python and Selenium for web scraping. Your help will be appreciated.

I am trying to scrape data from Indeed. The goal is to find all jobs posted in the last 24 hours and, from each job detail page, scrape the external link behind the "Apply on company site" button, along with the heading, company name, location, and job description.

I wrote the following code. It fetches all the links on the page correctly, but when I try to open each link, it only opens the first one. How can I open all the links I fetched, one by one? Thanks in advance; here is my code sample:

import time
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.keys import Keys

Path = r"C:\Program Files (x86)\chromedriver.exe"  # raw string so the backslashes are not treated as escapes
driver = webdriver.Chrome(Path)

driver.get("https://indeed.ae/")
print(driver.title)
search = driver.find_element_by_name("l")
search.send_keys("Dubai")
search.send_keys(Keys.RETURN)

try:
    td = WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.ID, "resultsCol"))
    )
    divs = td.find_elements_by_tag_name("div")

    for div in divs:
        try:
            title = div.find_element_by_class_name("title")
            anchors = title.find_elements_by_tag_name('a')
            links = []
            for anchor in anchors:
                link = anchor.get_attribute('href')
                links.append(link)
                print(links)
                for link in links:
                    url = driver.get(link)
        except:
            continue

finally:
    driver.quit()

Recommended Answer

The problem is that you get an href, navigate to that page, scrape it, and then ask for the next href — but now you can no longer find it, because you are on a different page.

Solution: scrape all the URLs first and put them in a list. Then iterate over that list, visiting and scraping each URL one by one before moving on to the next element of the list.
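The collect-first-then-navigate fix could be sketched as follows. This is a minimal sketch, not a verified implementation: the CSS selector `div.title a` is inferred from the question's `title`-class/anchor structure, the one-second pause is an arbitrary politeness delay, and the Selenium 3 `find_element_by_*` API from the question is kept. The link-collecting helper is deliberately driver-agnostic so it can be exercised without a live browser.

```python
def collect_job_links(results_col):
    """Gather every job-title href from the results column BEFORE navigating away.

    `results_col` is any object exposing find_elements_by_css_selector(),
    so this helper can be tested without a browser or driver installed.
    """
    links = []
    for anchor in results_col.find_elements_by_css_selector("div.title a"):
        href = anchor.get_attribute("href")
        if href:  # skip anchors without an href
            links.append(href)
    return links


def main():
    # Selenium imports are local to main() so collect_job_links() above
    # stays importable without Selenium or a chromedriver present.
    import time
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.common.keys import Keys
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver = webdriver.Chrome(r"C:\Program Files (x86)\chromedriver.exe")
    try:
        driver.get("https://indeed.ae/")
        search = driver.find_element_by_name("l")
        search.send_keys("Dubai")
        search.send_keys(Keys.RETURN)

        results_col = WebDriverWait(driver, 10).until(
            EC.presence_of_element_located((By.ID, "resultsCol"))
        )
        # Step 1: collect every link while still on the results page.
        links = collect_job_links(results_col)

        # Step 2: only now navigate, one link at a time. The plain list of
        # strings survives even though the results-page elements go stale.
        for link in links:
            driver.get(link)
            # ...scrape heading, company, location, and the
            # "Apply on company site" href from the detail page here...
            time.sleep(1)
    finally:
        driver.quit()


if __name__ == "__main__":
    main()
```

The key difference from the question's code is that `driver.get()` is never called while iterating over live page elements: once the hrefs are copied into an ordinary Python list, leaving the results page cannot invalidate them.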
