Selenium: Unable to write scraped elements to CSV


Question

I have successfully scraped the data from the site, but storing it returns an error.

I used "Title1" : pd.Series([ ele for ele.text in elements ]) to store the data in a CSV file, but it fails with the error that name 'ele' is not defined when I apply .text to the element.

When I remove .text, it runs fine, but it stores the element IDs rather than their text, which is why I used .text. What is going wrong with the use of .text?

Here is my code:

element = WebDriverWait(driver, 5).until(
    EC.presence_of_element_located((By.CSS_SELECTOR, x))
)
elements = driver.find_elements_by_css_selector(x)

element = WebDriverWait(driver, 5).until(
    EC.presence_of_element_located((By.CSS_SELECTOR, y))
)
elements2 = driver.find_elements_by_css_selector(y)

element = WebDriverWait(driver, 5).until(
    EC.presence_of_element_located((By.CSS_SELECTOR, z))
)
elements3 = driver.find_elements_by_css_selector(z)

df = pd.DataFrame({
    "Title1" : pd.Series([ ele for ele.text in elements ]),
    "Title2" : pd.Series([ ele2 for ele2.text in elements2 ]),
    "Title3" : pd.Series([ ele3 for ele3.text in elements3 ]),
})

df.to_csv(csv_file_location,
          index=False, mode='a', encoding='utf-8')

If you remove .text, it works and stores all the data to the CSV, just not as text. Any help would be appreciated.

Recommended answer

.text is a property of a single element that returns its visible text. It is not something you apply to the list of elements. In this line, for example, elements is a list of elements:

elements = driver.find_elements_by_css_selector(x)

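This is why your loop pd.Series([ ele for ele.text in elements ]) fails: the comprehension uses ele.text as the loop target, so Python has to look up ele before anything has been assigned to it, and raises the NameError. Removing .text makes ele itself the loop variable, which is why that version runs, although it then stores the element objects rather than their text.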

So change this

pd.Series([ ele for ele.text in elements ])

to this

pd.Series([ ele.text for ele in elements ])

This iterates over elements, binding each one to ele, and then reads the text attribute of that ele.
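
The difference between the two comprehensions can be reproduced without a browser. The sketch below is only an illustration: FakeElement is a hypothetical stand-in for a Selenium element that models nothing but the text attribute, and the sample strings are made up.

```python
from dataclasses import dataclass


@dataclass
class FakeElement:
    """Hypothetical stand-in for a Selenium element; only .text is modelled."""
    text: str


# Made-up sample data in place of driver.find_elements_by_css_selector(x)
elements = [FakeElement("first"), FakeElement("second")]

# Original form: the loop target is the attribute `ele.text`, so Python must
# look up `ele` before any value has been bound to that name.
try:
    broken = [ele for ele.text in elements]
except NameError as exc:
    error_message = str(exc)

# Corrected form: `ele` is the loop variable and .text is read from each one.
texts = [ele.text for ele in elements]

print(error_message)
print(texts)
```

With real Selenium elements, the texts list is what would be passed to pd.Series in place of the original comprehension.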
