Choosing pandas over xlsxwriter when working with Excel data


Question

Since pandas uses the xlsxwriter module, why bother using pandas when one can just use xlsxwriter directly?

Maybe a more direct question to answer is: why should one consider replacing xlsxwriter with pandas when working with Excel data?

My goal with this question is to help people decide whether to use xlsxwriter or pandas when working with Excel data.

Recommended answer

One word: convenience. Reading from and writing to Excel spreadsheets is a very common task when dealing with data. As an example, here is how to create a dead-simple Excel file, taken from the xlsxwriter tutorial:

import xlsxwriter

# Create a workbook and add a worksheet.
workbook = xlsxwriter.Workbook('Expenses01.xlsx')
worksheet = workbook.add_worksheet()

# Some data we want to write to the worksheet.
expenses = (
    ['Rent', 1000],
    ['Gas',   100],
    ['Food',  300],
    ['Gym',    50],
)

# Start from the first cell. Rows and columns are zero indexed.
row = 0
col = 0

# Iterate over the data and write it out row by row.
for item, cost in (expenses):
    worksheet.write(row, col,     item)
    worksheet.write(row, col + 1, cost)
    row += 1

# Write a total using a formula.
worksheet.write(row, 0, 'Total')
worksheet.write(row, 1, '=SUM(B1:B4)')

workbook.close()

Compare that with pandas:

import pandas as pd

df = pd.DataFrame({
    'Amount': [1000, 100, 300, 50]
}, index=['Rent', 'Gas', 'Food', 'Gym'])
df.loc['Total', 'Amount'] = df['Amount'].sum()

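# Keep the index so the row labels (Rent, Gas, ..., Total) are written too;
# index=False would drop them and leave only the Amount column.
df.to_excel('Expenses01.xlsx')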

They are not exactly equivalent, of course. The xlsxwriter version writes a formula for the sum, whereas the pandas version writes the already computed value, but the amount of boilerplate code xlsxwriter requires is monstrous. df.to_excel is a single command that dumps the DataFrame to Excel. You have little control over the resulting file, but depending on your requirements you may not need that control.

They are two libraries designed for two totally different purposes. That pandas provides an integration with xlsxwriter does not mean you should always pick one over the other. Use df.to_excel when you need convenience and xlsxwriter when you want fine control.
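
The two can also be combined. The following is a minimal sketch, not part of the original answer, of using the xlsxwriter integration through pd.ExcelWriter: pandas writes the table, and the underlying xlsxwriter worksheet is then used to add a real formula. The file name Expenses02.xlsx, the sheet name Expenses, and the bold format are assumptions made for illustration, and the sketch assumes both pandas and xlsxwriter are installed.

```python
import pandas as pd

df = pd.DataFrame(
    {"Amount": [1000, 100, 300, 50]},
    index=["Rent", "Gas", "Food", "Gym"],
)

# Hypothetical output file name, chosen for this example.
path = "Expenses02.xlsx"

with pd.ExcelWriter(path, engine="xlsxwriter") as writer:
    # pandas writes the header row and the four data rows.
    df.to_excel(writer, sheet_name="Expenses")

    # The underlying xlsxwriter objects are exposed on the writer.
    workbook = writer.book
    worksheet = writer.sheets["Expenses"]

    # Row 0 is the header and rows 1-4 hold the data (zero indexed),
    # so the total goes in row 5, which is Excel row 6.
    total_row = len(df) + 1
    bold = workbook.add_format({"bold": True})
    worksheet.write(total_row, 0, "Total", bold)
    # Sum Excel cells B2..B5, i.e. the four amounts written by pandas.
    worksheet.write_formula(total_row, 1, f"=SUM(B2:B{total_row})", bold)
```

This keeps the one-line convenience of df.to_excel for the data while dropping down to xlsxwriter only for the parts that need fine control, here the formula and the cell format.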
