Python & Selenium: Iterate through list of WebElements Error: StaleElementReferenceException

Question

Good afternoon,

I'm somewhat new to Python and web scraping, so any help would be greatly appreciated! First, the code:

from selenium import webdriver
import time 

chrome_path = r"/Users/ENTER/Desktop/chromedriver"

driver = webdriver.Chrome(chrome_path)

site_url = 'https://www.home-school.com/groups/'

driver.get(site_url)

# get state links from sidebar and store to list
area = driver.find_element_by_xpath("""/html/body/center/table/tbody/tr/td/table[3]/tbody/tr/td[2]/div""")
items = area.find_elements_by_tag_name('a')

# remove unneeded links
del items[:22]
del items[-1:]

# 
for links in items:
    # print(links.text)
    print(links.get_attribute("href"))
    # add link related logic here
    links.click()
    # you have to wait for the next element to display
    time.sleep(4)
    # assign html container with desired data to variable
    element = driver.find_element_by_xpath("""/html/body/center/table/tbody/tr/td/table[3]/tbody/tr/td[4]/div""")
    # Store container text in variable. We skip the first 5 lines of text as they 
    #  are unnecessary.
    orgdata = element.text.split("\n",5)[5]
    orgdata = orgdata.replace(' Edit Remove More', '').replace(' Edit Remove', '')
    # Write data to text file
    filepath = '/Users/ENTER/Documents/STEMBoard/Tiger Team/Lingo/' + links.text + '.txt'
    file_object = open(filepath, 'a')
    file_object.write(orgdata)

The Problem

I am using Selenium to save the names and information of homeschool groups from http://home-school.com/groups/ to individual text files, one per state.

To do this, I save a list of links and iterate through it: click each link, scrape the desired data, clean up the text, and write the result to a separate text file for each state.
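The text cleanup and per-state output described above can be tried without a browser. The following is only a sketch: the helper names `clean_group_text` and `append_state_file` and the sample string are hypothetical, while the five skipped lines and the ' Edit Remove' labels come from the question's code. It also guards against pages with fewer than six lines of text, where `split("\n", 5)[5]` would raise an IndexError.

```python
import tempfile
from pathlib import Path


def clean_group_text(raw, skip_lines=5):
    """Drop the first `skip_lines` lines and strip the edit-control labels."""
    parts = raw.split("\n", skip_lines)
    # With fewer than skip_lines + 1 lines there is no body left to keep.
    body = parts[skip_lines] if len(parts) > skip_lines else ""
    return body.replace(" Edit Remove More", "").replace(" Edit Remove", "")


def append_state_file(out_dir, state, text):
    """Append `text` to <out_dir>/<state>.txt and return the file path."""
    path = Path(out_dir) / f"{state}.txt"
    # The with-block closes the file even if the write fails.
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(text)
    return path


# Hypothetical page text: five header lines followed by two group entries.
sample = "h1\nh2\nh3\nh4\nh5\nGroup A Edit Remove More\nGroup B Edit Remove"
cleaned = clean_group_text(sample)

with tempfile.TemporaryDirectory() as tmp:
    saved = append_state_file(tmp, "Alabama", cleaned)
    print(saved.name, repr(saved.read_text(encoding="utf-8")))
```

Keeping these steps in plain functions means they can be checked on a saved string before any browser automation is involved.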

When the "for" loop runs, I get StaleElementReferenceException: stale element reference: element is not attached to the page document.

I believe the error is raised when it reaches element = driver.find_element_by_xpath("""/html/body/center/table/tbody/tr/td/table[3]/tbody/tr/td[2]/div"""). As far as I can tell, this XPath does not change. I assumed I needed to make the webdriver wait for the page to load, hence the time.sleep(4).

I'm sure this is a simple fix that will make sense once I see it, but at the moment I am stumped. Any help you can offer would be awesome! Thank you!

Recommended answer

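Try this. Clicking a link loads a new page, so the WebElement objects gathered from the first page become stale. Save each link's text and href as plain strings before leaving the page, then open each URL with driver.get():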

from selenium import webdriver
import time 

chrome_path = r"/Users/ENTER/Desktop/chromedriver"

driver = webdriver.Chrome(chrome_path)

site_url = 'https://www.home-school.com/groups/'

driver.get(site_url)

# get state links from sidebar and store to list
area = driver.find_element_by_xpath("/html/body/center/table/tbody/tr/td/table[3]/tbody/tr/td[2]/div")
items = area.find_elements_by_tag_name('a')

# remove unneeded links
del items[:22]
del items[-1:]

text_list = [i.text for i in items]
items = [i.get_attribute("href") for i in items]

for i in range(len(items)):
    driver.get(items[i])
    # you have to wait for the next element to display
    time.sleep(2)
    # assign html container with desired data to variable
    element = driver.find_element_by_xpath("""/html/body/center/table/tbody/tr/td/table[3]/tbody/tr/td[2]/div""")
    # Store container text in variable. We skip the first 5 lines of text as they 
    #  are unnecessary.
    orgdata = element.text.split("\n",5)[5]
    orgdata = orgdata.replace(' Edit Remove More', '').replace(' Edit Remove', '')
    # Write data to text file
    filepath = '/Users/ENTER/Documents/STEMBoard/Tiger Team/Lingo/' + text_list[i] + '.txt'
    # the with-block closes the file even if the write raises an error
    with open(filepath, 'a') as file_object:
        file_object.write(orgdata)
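The same idea can be written against the Selenium 4 API, since the find_element_by_* helpers used above were removed in Selenium 4.3. What follows is a minimal sketch, not a drop-in replacement: the function names `snapshot_links` and `scrape_states`, the `out_dir` parameter, and the 10-second timeout are assumptions made for illustration, while the sidebar XPath and the link trimming are copied from the code above. The container XPath is left as a parameter because the question reads the data from td[4] while the answer reads td[2]; pass whichever matches the page. The fixed time.sleep is swapped for an explicit wait.

```python
from pathlib import Path

SIDEBAR_XPATH = "/html/body/center/table/tbody/tr/td/table[3]/tbody/tr/td[2]/div"


def snapshot_links(elements):
    """Copy (text, href) out of live WebElements as plain strings.

    Strings stay valid after the browser navigates; the elements do not.
    """
    return [(el.text, el.get_attribute("href")) for el in elements]


def scrape_states(driver, site_url, content_xpath, out_dir, timeout=10):
    """Visit each state page by URL and append its text to <state>.txt."""
    # Imported here so snapshot_links can be used without Selenium installed.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver.get(site_url)
    area = driver.find_element(By.XPATH, SIDEBAR_XPATH)
    anchors = area.find_elements(By.TAG_NAME, "a")
    # Same trimming as above: drop the first 22 links and the last one.
    links = snapshot_links(anchors[22:-1])

    for state, href in links:
        driver.get(href)
        # Wait until the container is present instead of sleeping a fixed time.
        element = WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located((By.XPATH, content_xpath))
        )
        # The split/replace cleanup from the code above would go here.
        with open(Path(out_dir) / f"{state}.txt", "a") as fh:
            fh.write(element.text)
```

A call might look like scrape_states(webdriver.Chrome(), site_url, content_xpath, out_dir), which assumes Selenium can locate a chromedriver on the machine. The explicit wait returns as soon as the container appears, so fast pages are not delayed and slow pages get up to the full timeout.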
