Excel sheet POI validation: Out Of Memory Error

Problem description

I am trying to validate an Excel file in Java before loading it into a database.

Here is the code snippet that causes the error.

try {
    fis = new FileInputStream(file);
    wb = new XSSFWorkbook(fis);
    XSSFSheet sh = wb.getSheet("Sheet1");
    for(int i = 0 ; i < 44 ; i++){
        XSSFCell a1 = sh.getRow(1).getCell(i);
        printXSSFCellType(a1);
    }
    
} catch (FileNotFoundException e) {
    // TODO Auto-generated catch block
    e.printStackTrace();
}
catch (IOException e) {
    // TODO Auto-generated catch block
    e.printStackTrace();
}

Here is the error I get:

Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
    at java.util.ArrayList.<init>(Unknown Source)
    at java.util.ArrayList.<init>(Unknown Source)
    at org.apache.xmlbeans.impl.values.NamespaceContext$NamespaceContextStack.<init>(NamespaceContext.java:78)
    at org.apache.xmlbeans.impl.values.NamespaceContext$NamespaceContextStack.<init>(NamespaceContext.java:75)
    at org.apache.xmlbeans.impl.values.NamespaceContext.getNamespaceContextStack(NamespaceContext.java:98)
    at org.apache.xmlbeans.impl.values.NamespaceContext.push(NamespaceContext.java:106)
    at org.apache.xmlbeans.impl.values.XmlObjectBase.check_dated(XmlObjectBase.java:1273)
    at org.apache.xmlbeans.impl.values.XmlObjectBase.stringValue(XmlObjectBase.java:1484)
    at org.apache.xmlbeans.impl.values.XmlObjectBase.getStringValue(XmlObjectBase.java:1492)
    at org.openxmlformats.schemas.spreadsheetml.x2006.main.impl.CTCellImpl.getR(Unknown Source)
    at org.apache.poi.xssf.usermodel.XSSFCell.<init>(XSSFCell.java:105)
    at org.apache.poi.xssf.usermodel.XSSFRow.<init>(XSSFRow.java:70)
    at org.apache.poi.xssf.usermodel.XSSFSheet.initRows(XSSFSheet.java:179)
    at org.apache.poi.xssf.usermodel.XSSFSheet.read(XSSFSheet.java:143)
    at org.apache.poi.xssf.usermodel.XSSFSheet.onDocumentRead(XSSFSheet.java:130)
    at org.apache.poi.xssf.usermodel.XSSFWorkbook.onDocumentRead(XSSFWorkbook.java:286)
    at org.apache.poi.POIXMLDocument.load(POIXMLDocument.java:159)
    at org.apache.poi.xssf.usermodel.XSSFWorkbook.<init>(XSSFWorkbook.java:207)
    at com.xls.validate.ExcelValidator.main(ExcelValidator.java:79)

This works perfectly fine when the .xlsx file is smaller than 1 MB.

I understand this is because my .xlsx file is around 5-10 MB and POI tries to load the entire sheet into JVM memory at once.

What would be a possible workaround?

Recommended answer

There are two options available to you. Option #1 - increase the size of your JVM heap, so that Java has more memory available to it. Processing Excel files in POI using the UserModel code is DOM based, so the whole file (including its parsed form) needs to be buffered into memory. Look for an existing question on how to increase the JVM heap size for advice.
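As a rough illustration of Option #1, the sketch below reports the maximum heap the running JVM is allowed to use, which makes it easy to confirm that a larger `-Xmx` setting actually took effect. The class name `HeapInfo` and the `2g` value are assumptions made for this example, not part of the original question.

```java
// Hypothetical helper: prints the heap ceiling of the current JVM.
final class HeapInfo {

    /** Returns the maximum amount of memory the JVM will attempt to use, in MiB. */
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        // Launch with a larger heap, e.g.:  java -Xmx2g HeapInfo
        // (2g is an arbitrary example value; size it to your files and machine.)
        System.out.println("Max heap: " + maxHeapMiB() + " MiB");
    }
}
```

The same `-Xmx` flag would be passed when launching the validator itself, for example `java -Xmx2g com.xls.validate.ExcelValidator` with the appropriate classpath.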

Option #2, which is more work - switch to event-based (SAX) processing. This only processes part of the file at a time, so it needs much less memory. However, it requires more work from you, which is why you might be better off throwing a few more GB of memory at the problem - memory is cheap while programmers aren't! The POI spreadsheet how-to page has instructions on how to do SAX parsing of .xlsx files, and there are various example files provided by POI that you can look at for advice.
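To give a feel for what event-based processing looks like, here is a minimal sketch that streams one sheet's XML through a SAX handler and records the declared type of each cell in a single row, similar in spirit to the check in the question. It deliberately uses only JDK classes (`ZipFile` and the built-in SAX parser) rather than POI's event API, so it stays dependency-free; with POI you would normally get the sheet stream from `XSSFReader` instead. The class name `RowTypeScanner`, the entry path `xl/worksheets/sheet1.xml`, and the assumptions that elements are unprefixed and that rows carry an `r` attribute are all illustrative and may not hold for every file.

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: scans sheet XML as a stream instead of building a DOM.
final class RowTypeScanner {

    /** Collects "cellRef:type" for each cell in the row with the given 1-based number. */
    static List<String> cellTypesInRow(InputStream sheetXml, int rowNumber)
            throws IOException, SAXException, ParserConfigurationException {
        List<String> found = new ArrayList<>();
        String wanted = Integer.toString(rowNumber);

        DefaultHandler handler = new DefaultHandler() {
            private boolean inWantedRow;

            @Override
            public void startElement(String uri, String local, String qName, Attributes atts) {
                // Assumes unprefixed element names, as Excel normally writes them.
                if ("row".equals(qName)) {
                    inWantedRow = wanted.equals(atts.getValue("r"));
                } else if (inWantedRow && "c".equals(qName)) {
                    // A missing t attribute means the default cell type, numeric ("n").
                    String type = atts.getValue("t");
                    found.add(atts.getValue("r") + ":" + (type == null ? "n" : type));
                }
            }

            @Override
            public void endElement(String uri, String local, String qName) {
                if ("row".equals(qName)) {
                    inWantedRow = false;
                }
            }
        };

        SAXParserFactory.newInstance().newSAXParser().parse(sheetXml, handler);
        return found;
    }

    public static void main(String[] args) throws Exception {
        // An .xlsx file is a zip archive; the entry name below is an assumption.
        try (ZipFile zip = new ZipFile(new File(args[0]))) {
            ZipEntry entry = zip.getEntry("xl/worksheets/sheet1.xml");
            if (entry == null) {
                throw new IOException("sheet1.xml not found in " + args[0]);
            }
            try (InputStream in = zip.getInputStream(entry)) {
                // Row 2 in the file corresponds to sh.getRow(1) in the usermodel API.
                cellTypesInRow(in, 2).forEach(System.out::println);
            }
        }
    }
}
```

Only the current element is held in memory at any time, which is why this style scales to large sheets. The cost is that shared strings, styles, and sheet-name lookup are no longer handled for you, which is the extra work the answer refers to.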

Also, another thing - you seem to be loading a File via a stream, which is bad as it means even more data needs buffering into memory. See the POI documentation for more on this, including instructions on how to work with the File directly.
