Is it possible to get an Excel document's row count without loading the entire document into memory?


Question

I'm working on an application that processes huge Excel 2007 files, and I'm using OpenPyXL to do it. OpenPyXL has two different methods of reading an Excel file - one "normal" method where the entire document is loaded into memory at once, and one method where iterators are used to read row-by-row.

The problem is that when I'm using the iterator method, I don't get any document metadata like column widths and row/column count, and I really need this data. I assume this data is stored in the Excel document close to the top, so it shouldn't be necessary to load the whole 10MB file into memory to get access to it.

So, is there a way to get ahold of the row/column count and column widths without loading the entire document into memory first?

Recommended answer

Adding on to what Hubro said, apparently get_highest_row() has been deprecated. Use the max_row and max_column properties instead to get the row and column count. For example:

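    from openpyxl import load_workbook

    # read_only=True reads rows lazily; older OpenPyXL releases spelled this use_iterators=True
    wb = load_workbook(path, read_only=True)
    sheet = wb.worksheets[0]

    # In read-only mode these come from the sheet's stored dimensions and may be None if absent
    row_count = sheet.max_row
    column_count = sheet.max_column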
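
The question's guess that the sheet size sits near the top of the file can also be checked with the standard library alone. An .xlsx file is a zip archive, and a worksheet part usually starts with a `<dimension ref="A1:C10"/>` element before the cell data. The sketch below stream-parses the sheet and stops as soon as it finds that element. The helper names `read_dimension` and `column_index` are made up for illustration, and the part name `xl/worksheets/sheet1.xml` is an assumption: it is the usual location of the first sheet, but a robust version would look it up through the workbook's relationship files. Some writers omit the dimension element or fill it in wrongly, so treat the result as a hint.

```python
import re
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by SpreadsheetML worksheet parts.
NS = "{http://schemas.openxmlformats.org/spreadsheetml/2006/main}"


def column_index(letters):
    """Convert a column label such as 'A' or 'AB' to a 1-based index."""
    index = 0
    for ch in letters.upper():
        index = index * 26 + (ord(ch) - ord("A") + 1)
    return index


def read_dimension(xlsx, sheet_part="xl/worksheets/sheet1.xml"):
    """Return (max_row, max_column) from the <dimension> tag, or None.

    sheet_part is an assumed default, not something guaranteed by the format.
    """
    with zipfile.ZipFile(xlsx) as archive:
        with archive.open(sheet_part) as stream:
            # iterparse reads the stream in chunks, so stopping early
            # avoids decompressing and parsing the cell data.
            for _event, elem in ET.iterparse(stream, events=("start",)):
                if elem.tag == NS + "dimension":
                    # ref looks like "A1:C10", or "A1" for a single cell.
                    last_cell = elem.get("ref", "").split(":")[-1]
                    match = re.fullmatch(r"\$?([A-Za-z]+)\$?(\d+)", last_cell)
                    if match is None:
                        return None
                    return int(match.group(2)), column_index(match.group(1))
                if elem.tag == NS + "sheetData":
                    # Cell data has started; no dimension tag was stored.
                    break
    return None
```

Calling `read_dimension(path)` returns a `(max_row, max_column)` tuple, or `None` when the sheet carries no usable dimension element, in which case falling back to iterating the rows is the safe option. Column widths are stored in a `<cols>` element that also precedes the cell data, so the same early-exit loop could be extended to collect them.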
