Python & Selenium: Iterate through list of WebElements Error: StaleElementReferenceException
Problem description
Good afternoon,
Somewhat new to Python and webscraping, so any help would be greatly appreciated! First:
from selenium import webdriver
import time

chrome_path = r"/Users/ENTER/Desktop/chromedriver"
driver = webdriver.Chrome(chrome_path)
site_url = 'https://www.home-school.com/groups/'
driver.get(site_url)

# get state links from sidebar and store to list
area = driver.find_element_by_xpath("""/html/body/center/table/tbody/tr/td/table[3]/tbody/tr/td[2]/div""")
items = area.find_elements_by_tag_name('a')

# remove unneeded links
del items[:22]
del items[-1:]

for links in items:
    # print(links.text)
    print(links.get_attribute("href"))
    # add link related logic here
    links.click()
    # you have to wait for the next element to display
    time.sleep(4)
    # assign html container with desired data to variable
    element = driver.find_element_by_xpath("""/html/body/center/table/tbody/tr/td/table[3]/tbody/tr/td[4]/div""")
    # Store container text in variable. We skip the first 5 lines of text as they
    # are unnecessary.
    orgdata = element.text.split("\n",5)[5]
    orgdata = orgdata.replace(' Edit Remove More', '').replace(' Edit Remove', '')
    # Write data to text file
    filepath = '/Users/ENTER/Documents/STEMBoard/Tiger Team/Lingo/' + links.text + '.txt'
    file_object = open(filepath, 'a')
    file_object.write(orgdata)
The Problem
I am using Selenium in an attempt to save the names and information of homeschool groups from http://home-school.com/groups/ to individual text files per state.
To do this, I have saved a list of links and would like to iterate through the list to click each link, perform tasks related to scraping the desired data, manipulating the text, and outputting to separate text files per state.
I am getting StaleElementReferenceException: stale element reference: element is not attached to the page document when attempting to perform the "for" loop.
I believe it is giving the error when it gets to element = driver.find_element_by_xpath("""/html/body/center/table/tbody/tr/td/table[3]/tbody/tr/td[2]/div"""). As far as I can tell, this xpath does not change. I assumed I needed to make the webdriver wait for the page to load, hence time.sleep(4).
I'm sure this is a simple fix that will make sense when I see it, but at the moment I am stumped. Any help you all can offer would be awesome! Thank you!
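Recommended answer
Clicking a link loads a new page, so the WebElement references collected from the previous page are no longer attached to the document, and any later use of them raises StaleElementReferenceException. Read each link's text and href into plain strings before navigating, then open each URL with driver.get(). Try this: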
from selenium import webdriver
import time

chrome_path = r"/Users/ENTER/Desktop/chromedriver"
driver = webdriver.Chrome(chrome_path)
site_url = 'https://www.home-school.com/groups/'
driver.get(site_url)

# get state links from sidebar and store to list
area = driver.find_element_by_xpath("/html/body/center/table/tbody/tr/td/table[3]/tbody/tr/td[2]/div")
items = area.find_elements_by_tag_name('a')

# remove unneeded links
del items[:22]
del items[-1:]

# copy the link text and URLs into plain strings before leaving the page
text_list = [i.text for i in items]
items = [i.get_attribute("href") for i in items]

for i in range(len(items)):
    driver.get(items[i])
    # you have to wait for the next element to display
    time.sleep(2)
    # assign html container with desired data to variable
    element = driver.find_element_by_xpath("""/html/body/center/table/tbody/tr/td/table[3]/tbody/tr/td[2]/div""")
    # Store container text in variable. We skip the first 5 lines of text as they
    # are unnecessary.
    orgdata = element.text.split("\n",5)[5]
    orgdata = orgdata.replace(' Edit Remove More', '').replace(' Edit Remove', '')
    # Write data to text file
    filepath = '/Users/ENTER/Documents/STEMBoard/Tiger Team/Lingo/' + text_list[i] + '.txt'
    file_object = open(filepath, 'a')
    file_object.write(orgdata)
    file_object.close()
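
To see why the loop above behaves differently from the one in the question, here is a minimal, browser-free sketch of the idea. It does not use Selenium at all: FakeBrowser, FakeLink, StaleReferenceError, the SITE dictionary and its paths are all hypothetical stand-ins invented for illustration. The only assumption carried over is that an element handle is tied to the page it was found on, while a plain string is not. Which line actually raises in a real Selenium session may differ.

```python
class StaleReferenceError(Exception):
    """Hypothetical stand-in for Selenium's StaleElementReferenceException."""


class FakeLink:
    """Toy element handle that is only valid on the page where it was found."""

    def __init__(self, browser, text, href, page_id):
        self._browser = browser
        self._text = text
        self._href = href
        self._page_id = page_id

    def _check(self):
        # A handle found on an earlier page load is no longer usable.
        if self._page_id != self._browser.page_id:
            raise StaleReferenceError("element is not attached to the page document")

    @property
    def text(self):
        self._check()
        return self._text

    def get_attribute(self, name):
        self._check()
        return self._href if name == "href" else None

    def click(self):
        self._check()
        self._browser.get(self._href)


class FakeBrowser:
    """Toy browser: every navigation replaces the current document."""

    def __init__(self, site):
        self.site = site      # maps a URL to its list of (text, href) links
        self.page_id = 0
        self.url = None

    def get(self, url):
        self.url = url
        self.page_id += 1     # a fresh document invalidates old handles

    def find_links(self):
        return [FakeLink(self, text, href, self.page_id)
                for text, href in self.site[self.url]]


# Hypothetical site layout: the same two sidebar links on every page.
SIDEBAR = [("Alabama", "/groups/AL"), ("Alaska", "/groups/AK")]
SITE = {"/groups/": SIDEBAR, "/groups/AL": SIDEBAR, "/groups/AK": SIDEBAR}


def click_each(browser):
    """Pattern from the question: reuse element handles across navigations."""
    visited = []
    for link in browser.find_links():
        link.click()          # navigates away; the remaining handles go stale
        visited.append(browser.url)
    return visited


def visit_each(browser):
    """Pattern from the answer: copy strings first, then navigate by URL."""
    links = browser.find_links()
    names = [link.text for link in links]
    hrefs = [link.get_attribute("href") for link in links]
    visited = []
    for name, href in zip(names, hrefs):
        browser.get(href)
        visited.append((name, browser.url))
    return visited


if __name__ == "__main__":
    browser = FakeBrowser(SITE)
    browser.get("/groups/")
    try:
        click_each(browser)
    except StaleReferenceError as exc:
        print("click_each failed:", exc)

    browser = FakeBrowser(SITE)
    browser.get("/groups/")
    print("visit_each visited:", visit_each(browser))
```

In this toy model the first pattern fails as soon as a handle from the original page is touched after a navigation, while the second never touches a handle after leaving the page, which is the same reason the answer builds text_list and the href list before the loop.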