Writing Python Selenium output to Excel

Question

I have written a script to scrape product information from a website. The goal is to write this information to an Excel file. Because my Python knowledge is limited, the only way I know to export the output is Out-File in PowerShell. The result is that each piece of information for a product is printed on a separate line. I would like one line per product instead.

My desired output is shown in the picture. I would prefer my output to look like the second version, but I can live with the first one.

Here is my code:

from selenium import webdriver
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import NoSuchElementException    

url = "http://www.strem.com/"
cas = ['16940-92-4','29796-57-4','13569-57-8','15635-87-7']

for i in cas:
    driver = webdriver.Firefox()
    driver.get(url)

    driver.find_element_by_id("selectbox_input").click()
    driver.find_element_by_id("selectbox_input_cas").click()

    inputElement = driver.find_element_by_name("keyword")
    inputElement.send_keys(i)
    inputElement.submit()

    # Check if a particular element exists; returns True/False          
    def check_exists_by_xpath(xpath):
        try:
            driver.find_element_by_xpath(xpath)
        except NoSuchElementException:
            return False
        return True

    xpath1 = ".//div[@class = 'error']" # element containing error message
    xpath2 = ".//table[@class = 'product_list tiles']" # element containing table to select product from
    #xpath3 = ".//div[@class = 'catalog_number']" # when selection is needed, returns the first catalog number

    if check_exists_by_xpath(xpath1):
        print("cas# %s is not found on Strem." % i)
        driver.quit()
    else:
        if check_exists_by_xpath(xpath2):
            catNum = driver.find_element_by_xpath(".//div[@class = 'catalog_number']")
            catNum.click()

            country = driver.find_element_by_name("country")
            for option in country.find_elements_by_tag_name('option'):
                if option.text == "USA":
                    option.click()
            country.submit()

            name = driver.find_element_by_id("header_description").text
            prodNum = driver.find_element_by_class_name("catalog_number").text
            print(i)
            print(name.encode("utf-8"))
            print(prodNum)

            skus_by_xpath = WebDriverWait(driver, 10).until(
                lambda driver : driver.find_elements_by_xpath(".//td[@class='size']")
            )

            for output in skus_by_xpath:
                print(output.text)

            prices_by_xpath = WebDriverWait(driver, 10).until(
                lambda driver : driver.find_elements_by_xpath(".//td[@class='price']")
            )

            for result in prices_by_xpath:
                print(result.text[3:])  # [3:] drops the first three characters; use [:-3] to drop the last three

            driver.quit()
        else:
            country = driver.find_element_by_name("country")
            for option in country.find_elements_by_tag_name('option'):
                if option.text == "USA":
                    option.click()
            country.submit()

            name = driver.find_element_by_id("header_description").text
            prodNum = driver.find_element_by_class_name("catalog_number").text
            print(i)
            print(name.encode("utf-8"))
            print(prodNum)

            skus_by_xpath = WebDriverWait(driver, 10).until(
                lambda driver : driver.find_elements_by_xpath(".//td[@class='size']")
            )

            for output in skus_by_xpath:
                print(output.text)

            prices_by_xpath = WebDriverWait(driver, 10).until(
                lambda driver : driver.find_elements_by_xpath(".//td[@class='price']")
            )

            for result in prices_by_xpath:
                print(result.text[3:])  # [3:] drops the first three characters; use [:-3] to drop the last three

            driver.quit()

Recommended answer

https://pythonhosted.org/openpyxl/tutorial.html

This is a tutorial for openpyxl, a Python library for reading and writing Excel files. There are other libraries, but I like using this one.

from openpyxl import Workbook
wb = Workbook()

Then use the methods described in the tutorial to write your data, and finally save the workbook:

wb.save(filename)

It is very easy to get started with.
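As a minimal sketch of the "write your data" step, the snippet below puts each product on one worksheet row with openpyxl. The helper names product_row and save_rows, the header labels, and the sample name, catalog number, sizes, and prices are made up for illustration; only the CAS number comes from the question. The column layout (fixed fields followed by size/price pairs) is an assumption, since the picture of the desired output is not available here.

```python
def product_row(cas, name, catalog_number, sizes, prices):
    """Flatten one product into a single row: fixed fields, then size/price pairs."""
    row = [cas, name, catalog_number]
    for size, price in zip(sizes, prices):
        row.extend([size, price])
    return row


def save_rows(rows, filename):
    """Write each row to its own worksheet line and save the workbook."""
    # Imported here so product_row() stays usable without openpyxl installed.
    from openpyxl import Workbook

    wb = Workbook()
    ws = wb.active  # the worksheet that is created together with the workbook
    ws.append(["CAS", "Name", "Catalog number"])  # header for the fixed columns
    for row in rows:
        ws.append(row)  # append() writes one list as one worksheet row
    wb.save(filename)


# Placeholder values: only the CAS number is taken from the question.
rows = [
    product_row("16940-92-4", "Example product", "00-0000",
                ["1g", "5g"], ["10.00", "40.00"]),
]
print(rows[0])
```

In the scraper from the question, the print calls could be replaced by something like rows.append(product_row(i, name, prodNum, [e.text for e in skus_by_xpath], [e.text[3:] for e in prices_by_xpath])), with rows created once before the loop and save_rows(rows, "products.xlsx") called once after it.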

This is a PDF tutorial for xlwt and xlrd, but I don't really use these modules a lot. http://www.simplistix.co.uk/presentations/python-excel.pdf
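For comparison, here is a rough sketch of writing rows with xlwt, which produces the older .xls format and addresses cells by zero-based row and column index. The helper names iter_cells and save_rows_xls and the sheet name are hypothetical and do not come from the tutorial.

```python
def iter_cells(rows):
    """Yield (row_index, column_index, value) for every value in rows."""
    for r, row in enumerate(rows):
        for c, value in enumerate(row):
            yield r, c, value


def save_rows_xls(rows, filename):
    """Write rows to a legacy .xls workbook, one list per worksheet row."""
    # Imported here so iter_cells() works even where xlwt is not installed.
    import xlwt

    wb = xlwt.Workbook()
    ws = wb.add_sheet("Products")  # sheet name is an arbitrary choice
    for r, c, value in iter_cells(rows):
        ws.write(r, c, value)  # xlwt uses zero-based row/column indices
    wb.save(filename)
```

Unlike openpyxl's append(), xlwt has no row-at-a-time call in this sketch, so every cell is written individually by its coordinates.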
