chunksize keyword of read_excel is not implemented


Question

In pandas version 0.16.1, the chunksize argument of ExcelFile.parse was available.

See: http://pandas.pydata.org/pandas-docs/version/0.16.1/generated/pandas.ExcelFile.parse.html

But in the latest version it is no longer available.

https://pandas.pydata.org/pandas-docs/stable/generated/pandas.ExcelFile.parse.html

What was the reason for removing it?

Also, how should I process an Excel file in chunks in the latest version?

I used to do the following:

import pandas as pd

excel = pd.ExcelFile("test.xlsx")

for sheet in excel.sheet_names:
    # chunksize was accepted by ExcelFile.parse up to pandas 0.16.1
    reader = excel.parse(sheet, chunksize=1000)
    for chunk in reader:
        pass  # process chunk

Recommended answer

As EdChum explained in the comments, this feature was removed in 0.17.0. Chris gave the following reason in a comment:

there's no super-compelling reason; the main idea was to match up with api of to_excel, i.e. the "ExcelFileWrapper" (ExcelFile, ExcelWriter) doesn't have any pandas-specific functionality, instead you pass it into the io functions (read_excel, to_excel).

I did update the docs to cover that specific example. edit: although it may be hard to see in the diff - rendered below.

Source: https://github.com/pandas-dev/pandas/pull/11198

I still wonder whether there is an alternative way to read an Excel file in chunks.
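One workaround that may help here, offered only as a sketch and not as anything the pandas developers recommended in the thread, is to call read_excel repeatedly with skiprows and nrows. The helper name read_excel_in_chunks is hypothetical, and the sketch assumes a sheet with a single header row and a pandas version whose read_excel accepts the nrows argument.

```python
import pandas as pd


def read_excel_in_chunks(path, sheet_name=0, chunksize=1000):
    """Yield DataFrames of at most `chunksize` data rows from one sheet.

    Hypothetical helper, not part of pandas. Assumes the sheet has a single
    header row and that pd.read_excel supports the `nrows` argument.
    """
    # First call: parse the header row plus the first chunk of data.
    first = pd.read_excel(path, sheet_name=sheet_name, nrows=chunksize)
    if first.empty:
        return
    yield first

    columns = list(first.columns)
    skip = 1 + chunksize  # header row plus the data rows already returned
    while True:
        # Later calls: skip what was already read and reuse the column names.
        chunk = pd.read_excel(
            path,
            sheet_name=sheet_name,
            header=None,
            names=columns,
            skiprows=skip,
            nrows=chunksize,
        )
        if chunk.empty:
            break
        yield chunk
        skip += chunksize
```

Inside the loop over excel.sheet_names from the question, this could be used as: for chunk in read_excel_in_chunks("test.xlsx", sheet_name=sheet, chunksize=1000). Note that every call opens the workbook again and has to walk past the rows it skips, so this mainly limits the size of each DataFrame and is unlikely to speed up parsing. Column dtypes are also inferred separately for each chunk.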
