Processing large xlsx file in Java

Question

I need to auto-fit all rows in a large (30k+ rows) xlsx file.

The following code, using Apache POI, works on small files but fails with an OutOfMemoryError on large ones:

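import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.ss.usermodel.Workbook;
import org.apache.poi.ss.usermodel.WorkbookFactory;

// Loads the entire workbook into memory, which is what exhausts the heap on large files.
Workbook workbook = WorkbookFactory.create(inputStream);
Sheet sheet = workbook.getSheetAt(0);

// A height of -1 clears the explicit row height so the row falls back to the default.
for (Row row : sheet) {
    row.setHeight((short) -1);
}

workbook.write(outputStream);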

Update: Unfortunately, increasing the heap size is not an option. The OutOfMemoryError appears even at -Xmx1024m, and 30k rows is not an upper limit.

Recommended answer

Try using the event API. See "Event API (HSSF only)" and "XSSF and SAX (Event API)" in the POI documentation for details. A couple of quotes from that page:

HSSF:

The event API is newer than the User API. It is intended for intermediate developers who are willing to learn a little bit of the low-level API structures. It's relatively simple to use, but requires a basic understanding of the parts of an Excel file (or willingness to learn). The advantage provided is that you can read an XLS with a relatively small memory footprint.

XSSF:

If memory footprint is an issue, then for XSSF, you can get at the underlying XML data, and process it yourself. This is intended for intermediate developers who are willing to learn a little bit of the low-level structure of .xlsx files, and who are happy processing XML in Java. It's relatively simple to use, but requires a basic understanding of the file structure. The advantage provided is that you can read an XLSX file with a relatively small memory footprint.
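To make "get at the underlying XML data" concrete, here is a minimal sketch that uses only the JDK (java.util.zip plus SAX) rather than POI's own event classes. It streams one worksheet part out of the xlsx zip and counts its row elements without building an object model. The class name SheetRowScanner and the part name xl/worksheets/sheet1.xml are assumptions for illustration; a real workbook may name its sheet parts differently, and the reliable way to find them is through the workbook relationships.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of POI: scans a worksheet part with SAX.
final class SheetRowScanner {

    /** Counts the row elements of one worksheet part; returns 0 if the part is absent. */
    static long countRows(InputStream xlsx, String sheetPart)
            throws IOException, SAXException, ParserConfigurationException {
        final long[] rows = {0};
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                // Only the current element is visible here; nothing is retained.
                if ("row".equals(localName)) {
                    rows[0]++;
                }
            }
        };
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true); // needed so localName is populated
        try (ZipInputStream zip = new ZipInputStream(xlsx)) {
            for (ZipEntry entry; (entry = zip.getNextEntry()) != null; ) {
                if (entry.getName().equals(sheetPart)) {
                    // The zip stream reports end-of-file at the end of this entry.
                    factory.newSAXParser().parse(new InputSource(zip), handler);
                    break; // the parser may close the stream, so stop here
                }
            }
        }
        return rows[0];
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 1) {
            System.out.println("usage: SheetRowScanner <file.xlsx>");
            return;
        }
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            // Assumed part name for the first sheet.
            System.out.println(countRows(in, "xl/worksheets/sheet1.xml"));
        }
    }
}
```

Memory use stays roughly constant regardless of the number of rows, because the handler sees one element at a time. POI's XSSF event API follows the same idea while also resolving sheet parts and shared strings for you.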

For output, one possible approach is described in the blog post "Streaming xlsx files" (archived at http://web.archive.org/web/20110821054135/http://www.realdevelopers.com/blog/$c$c/excel). Basically, use XSSF to generate a container XML file, then stream the actual content as plain text into the appropriate XML part of the xlsx zip archive.
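Applied to the row-height question, the same streaming idea can be sketched with the JDK alone. This is not the blog post's code. It copies the xlsx zip entry by entry and rewrites each worksheet part with StAX, dropping the ht and customHeight attributes from every row element. That is an assumption about what row.setHeight((short) -1) amounts to in XSSF, and whether the spreadsheet application then auto-fits the rows on open should be verified on a real file. The class name RowHeightStripper is hypothetical.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;
import javax.xml.stream.XMLEventFactory;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLEventWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.Attribute;
import javax.xml.stream.events.StartElement;
import javax.xml.stream.events.XMLEvent;

// Hypothetical helper, not part of POI.
final class RowHeightStripper {

    /** Copies an xlsx archive, rewriting worksheet parts as a stream of XML events. */
    static void stripRowHeights(InputStream xlsxIn, OutputStream xlsxOut)
            throws IOException, XMLStreamException {
        try (ZipInputStream in = new ZipInputStream(xlsxIn);
             ZipOutputStream out = new ZipOutputStream(xlsxOut)) {
            byte[] buffer = new byte[8192];
            for (ZipEntry entry; (entry = in.getNextEntry()) != null; ) {
                String name = entry.getName();
                out.putNextEntry(new ZipEntry(name));
                // Assumed layout: worksheet parts live directly under xl/worksheets/.
                if (name.startsWith("xl/worksheets/") && name.endsWith(".xml")) {
                    rewriteSheet(in, out);
                } else {
                    for (int n; (n = in.read(buffer)) != -1; ) {
                        out.write(buffer, 0, n); // all other parts are copied unchanged
                    }
                }
                out.closeEntry();
            }
        }
    }

    private static void rewriteSheet(InputStream in, OutputStream out) throws XMLStreamException {
        // Shield the zip stream in case the XML reader closes its input at end of document.
        InputStream shielded = new FilterInputStream(in) {
            @Override
            public void close() { /* keep the archive open */ }
        };
        XMLInputFactory inputFactory = XMLInputFactory.newFactory();
        inputFactory.setProperty(XMLInputFactory.SUPPORT_DTD, false);
        XMLEventFactory events = XMLEventFactory.newFactory();
        XMLEventReader reader = inputFactory.createXMLEventReader(shielded);
        XMLEventWriter writer = XMLOutputFactory.newFactory().createXMLEventWriter(out, "UTF-8");
        while (reader.hasNext()) {
            XMLEvent event = reader.nextEvent();
            if (event.isStartElement()
                    && "row".equals(event.asStartElement().getName().getLocalPart())) {
                StartElement row = event.asStartElement();
                List<Attribute> kept = new ArrayList<>();
                for (Iterator<?> it = row.getAttributes(); it.hasNext(); ) {
                    Attribute attribute = (Attribute) it.next();
                    String local = attribute.getName().getLocalPart();
                    if (!local.equals("ht") && !local.equals("customHeight")) {
                        kept.add(attribute);
                    }
                }
                event = events.createStartElement(row.getName(), kept.iterator(), row.getNamespaces());
            }
            writer.add(event);
        }
        writer.flush(); // flush only; closing is left to the zip stream's owner
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            System.out.println("usage: RowHeightStripper <in.xlsx> <out.xlsx>");
            return;
        }
        try (InputStream in = Files.newInputStream(Paths.get(args[0]));
             OutputStream out = Files.newOutputStream(Paths.get(args[1]))) {
            stripRowHeights(in, out);
        }
    }
}
```

Because both the zip copy and the XML rewrite work on one entry and one event at a time, the heap needed does not grow with the number of rows, which is the property the in-memory Workbook approach lacks.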
