Selenium: Unable to write scraped elements to CSV
Problem description
I have successfully scraped the data from the site, but storing it raises an error.
I used "Title1" : pd.Series([ ele for ele.text in elements ]) to store the data in a CSV file, but it raises an error saying that the name "ele" is not defined when I use .text on the element.
When I remove .text, it runs fine, but it stores the element ids, which are not text; that is why I used .text. What is going wrong when I use .text?
Here is my code:
element = WebDriverWait(driver, 5).until(
    EC.presence_of_element_located((By.CSS_SELECTOR, x))
)
elements = driver.find_elements_by_css_selector(x)
element = WebDriverWait(driver, 5).until(
    EC.presence_of_element_located((By.CSS_SELECTOR, y))
)
elements2 = driver.find_elements_by_css_selector(y)
element = WebDriverWait(driver, 5).until(
    EC.presence_of_element_located((By.CSS_SELECTOR, z))
)
elements3 = driver.find_elements_by_css_selector(z)
df = pd.DataFrame({
    "Title1" : pd.Series([ ele for ele.text in elements ]),
    "Title2" : pd.Series([ ele2 for ele2.text in elements2 ]),
    "Title3" : pd.Series([ ele3 for ele3.text in elements3 ]),
})
df.to_csv(csv_file_location,
          index=False, mode='a', encoding='utf-8')
If I just remove .text, it works fine and stores all the data to the CSV, but not as text. Any help would be appreciated.
Recommended answer
find_elements_by_css_selector returns a list of elements, and .text is a property of each individual element (it is not a method, so there is no .text() call). In this line, for example,
elements = driver.find_elements_by_css_selector(x)
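elements is a list. This is why the comprehension pd.Series([ ele for ele.text in elements ]) fails: the loop target after "for" is ele.text, so Python tries to assign each item to an attribute of ele before any name ele exists, which raises the "name 'ele' is not defined" error. Without .text the comprehension is valid, but it collects the element objects themselves rather than their text.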
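The failure can be reproduced without a browser. The sketch below is only an illustration: FakeElement is a hypothetical stand-in for a Selenium element, and the sample strings are made up.

```python
class FakeElement:
    """Hypothetical stand-in for a Selenium element (illustration only)."""

    def __init__(self, text):
        self.text = text


elements = [FakeElement("first"), FakeElement("second")]

# The loop target here is the attribute `ele.text`, so Python must look up
# the name `ele` before it can assign to it, and that name was never bound.
try:
    broken = [ele for ele.text in elements]
    error_name = None
except NameError as exc:
    error_name = type(exc).__name__
    print(error_name, exc)

# With a plain name as the loop target the comprehension is valid, but it
# collects the element objects themselves, not their text.
objects = [item for item in elements]
print(objects[0] is elements[0])
```

The second comprehension mirrors what happens in the question when .text is removed: it runs, but the resulting column holds objects rather than strings.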
So change this
pd.Series([ ele for ele.text in elements ])
to this
pd.Series([ ele.text for ele in elements ])
This means the comprehension first takes each ele from elements, and then reads the text attribute of that ele.
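As a rough end-to-end sketch of the corrected pattern, the example below extracts the text of each element and writes three columns to CSV. It makes several assumptions: FakeElement and texts_of are hypothetical names introduced here, the sample values are invented, and the standard library csv module is used in place of the pandas calls from the question so that the snippet runs without extra dependencies.

```python
import csv
import io
from itertools import zip_longest


class FakeElement:
    """Hypothetical stand-in for a Selenium element (illustration only)."""

    def __init__(self, text):
        self.text = text


def texts_of(elements):
    """Return the .text of every element, in order."""
    return [element.text for element in elements]


# Made-up sample data; in the question these lists would come from
# driver.find_elements_by_css_selector(...).
columns = {
    "Title1": texts_of([FakeElement("a1"), FakeElement("a2")]),
    "Title2": texts_of([FakeElement("b1"), FakeElement("b2")]),
    "Title3": texts_of([FakeElement("c1")]),  # shorter column on purpose
}

buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow(columns.keys())
# zip_longest pads the shorter column with empty strings.
writer.writerows(zip_longest(*columns.values(), fillvalue=""))
csv_text = buffer.getvalue()
print(csv_text)
```

With pandas, the same lists of strings can be passed to pd.Series exactly as in the corrected line above; the only essential change is that .text moves to the front of the comprehension.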