poi 读取现有的 excel 并使用大数据进行编辑 [英] poi read existing excel and edit with large data

查看:33
本文介绍了poi 读取现有的 excel 并使用大数据进行编辑的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我使用 apache poi 来读取和写入 excel.我可以毫无问题地做到这一点.但是在现有 excel 上编写大型 excel 时我面临内存问题.下面有更详细的解释我有模板,其中列标题被着色,有些被合并(我说的是一张纸中有 100 到 150 列),我大约有 10 张纸.现在我从数据库中读取数据并将数据填充到这个 excel 中.现在对于小数据集,我可以做,但对于大数据集,它会抛出内存不足.尽管我从 db 中以小块的形式获取数据,例如 1000 行数据,但一次处理 1000 行,然后再次获得另一组数据.每张数据的最大行数为 50,000 行,每张纸中的最小行数为一行

I using apache poi to read and write excel .I am able to do that without any problem.But i am facing memory issues when writing large excels on existing excel.To Explained in more detail below I have template where column headers are colored some are merged (i am talking about 100 to 150 columns in one sheet) this i have around 10 sheets approx. Now i read data from database and populate the data into this excel. Now for small sets of data i am able to do but for large data sets it throws out of memory. Although i get the data from db in small chunks like 1000 rows of data at once proccess that 1000 and again get another set. Maxiumn rows of data will be 50,000 rows per sheet and minumun rows will be one row in each sheet

我已经阅读了 SXSSF,但问题是它不能在现有的 excel 上进行编辑.

I have read about SXSSF but the issue is it does not edit on existing excels.

请告诉我您对如何解决此问题的想法.

Please let me know your thoughts on how this can be solved.

推荐答案

有几个选项可用,但没有一个是银弹"的简单答案.不幸的是,Excel 文件格式确实需要相当多的内存才能使用.

There are a few options available, but none that are a "silver bullet" simple answer. Unfortunately, the Excel file formats do require a fair bit of memory to work with.

  1. 增加分配给 Java 的堆大小,以便它一次可以容纳更多内存
  2. 为您的服务器购买更多内存,然后增加堆大小.内存现在很便宜,而且不需要一个像样的程序员的很多小时工资来支付成本
  3. 使用不同的文件格式 - 例如 CSV 在内存占用方面非常轻量
  4. 使用低级 SAX API 读取 .xlsx文件,然后使用 SXSSF 再次将其写出,无需太多缓冲.虽然需要大量自定义编码,但比仅使用 HSSF/XSSF 需要更多的工作
  5. 贡献/赞助内存改进对 Apache POI 的贡献.POI 是开源的、免费提供的,并且完全由志愿者维护,因此如果有什么事情对您很重要,您通常最好提供帮助!
  1. Increase the heap size given to Java, so it can hold more in memory at a time
  2. Buy some more memory for your server, then increase the heap size. Memory is pretty cheap these days, and it doesn't take very many hours wage of a decent programmer to cover the cost
  3. Use a different file format - CSV for example is very lightweight on the memory footprint
  4. Use the low level SAX API to read in the .xlsx file, then use SXSSF to write it out again without much buffering. Needs quite a lot of custom coding though, much more work for you than just using HSSF/XSSF
  5. Contribute / sponsor the contribution of memory improvements to Apache POI. POI is open source, freely available, and maintained entirely by volunteers, so if something matters to you you're often best off helping!

这篇关于poi 读取现有的 excel 并使用大数据进行编辑的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆