chunksize keyword of read_excel is not implemented
Question
In version 0.16.1, the chunksize argument was available.
See: http://pandas.pydata.org/pandas-docs/version/0.16.1/generated/pandas.ExcelFile.parse.html
But in the latest version it is not available:
https://pandas.pydata.org/pandas-docs/stable/generated/pandas.ExcelFile.parse.html
Why was it removed?
Also, how should I process an Excel file in chunks in the latest version?
I used to do the following:
import pandas as pd

excel = pd.ExcelFile("test.xlsx")
for sheet in excel.sheet_names:
    reader = excel.parse(sheet, chunksize=1000)
    for chunk in reader:
        pass  # process chunk
Recommended answer
As EdChum explained in the comments, this feature was removed in 0.17.0. Chris gave the following reason for the removal in a comment:
there's no super-compelling reason; the main idea was to match up with api of to_excel, i.e. the "ExcelFileWrapper" (ExcelFile, ExcelWriter) doesn't have any pandas-specific functionality, instead you pass it into the io functions (read_excel, to_excel).
I did update the docs to cover that specific example. edit: although it may be hard to see in the diff - rendered below.
Source: https://github.com/pandas-dev/pandas/pull/11198
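To illustrate the pattern Chris describes, here is a minimal sketch in which ExcelFile only opens the workbook and read_excel does the parsing. The helper name read_all_sheets is hypothetical, and the sheet_name keyword assumes a reasonably recent pandas (older releases spelled it sheetname).

```python
import pandas as pd


def read_all_sheets(path):
    """Return {sheet name: DataFrame} for every sheet in the workbook.

    Hypothetical helper for illustration; not part of pandas.
    """
    frames = {}
    # ExcelFile opens the workbook once; read_excel does the parsing.
    with pd.ExcelFile(path) as excel:
        for sheet in excel.sheet_names:
            frames[sheet] = pd.read_excel(excel, sheet_name=sheet)
    return frames
```

Calling read_all_sheets("test.xlsx") would load every sheet in full, so this shows the API shape only and does nothing to limit memory use.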
I still wonder whether there is an alternative way to read an Excel file in chunks.
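One possible workaround, offered as a sketch rather than an official replacement for chunksize, is to call read_excel repeatedly with skiprows and nrows. The helper name iter_excel_chunks is hypothetical. It assumes the first row of the sheet is the header and that the installed pandas is recent enough for read_excel to accept nrows.

```python
import pandas as pd


def iter_excel_chunks(path, sheet_name=0, chunksize=1000):
    """Yield DataFrames of at most `chunksize` rows from one sheet.

    Hypothetical helper, not a pandas API. Assumes row 0 is the header
    and that read_excel supports the `nrows` argument.
    """
    start = 0  # number of data rows already yielded
    while True:
        chunk = pd.read_excel(
            path,
            sheet_name=sheet_name,
            header=0,
            # Skip the data rows already yielded but keep row 0 (the header).
            skiprows=range(1, start + 1),
            nrows=chunksize,
        )
        if chunk.empty:
            break
        yield chunk
        if len(chunk) < chunksize:
            break  # a short chunk means the sheet is exhausted
        start += chunksize
```

Usage would mirror the original loop: for chunk in iter_excel_chunks("test.xlsx", sheet, 1000). Each call likely re-scans the sheet from the top, so this limits the size of each DataFrame, not the total parsing work.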