Write large text file data into Excel


Problem description

I am reading a text file separated with some delimiters.

Sample content of my text file:

Avc def efg jksjd
1 2 3 5
3 4 6 0

I read it line by line and hold it in memory using a HashMap, with line numbers (Integer) as keys and each line of the text file as a List object.

Consider, my map would store information like this:

Integer    List
1          [Avc, def, efg, jksjd]
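The reading step described above can be sketched in plain Java. This is a minimal sketch, not the asker's actual code: the class name, the single-space delimiter, and the use of a StringReader for the sample input are assumptions.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class DelimitedReader {

    // Reads delimiter-separated lines into a map keyed by 1-based line number.
    public static Map<Integer, List<Object>> readLines(BufferedReader reader, String delimiter)
            throws IOException {
        Map<Integer, List<Object>> holder = new LinkedHashMap<>();
        String line;
        int lineNumber = 1;
        while ((line = reader.readLine()) != null) {
            holder.put(lineNumber++, new ArrayList<Object>(Arrays.asList(line.split(delimiter))));
        }
        return holder;
    }

    public static void main(String[] args) throws IOException {
        String sample = "Avc def efg jksjd\n1 2 3 5\n3 4 6 0";
        Map<Integer, List<Object>> map =
                readLines(new BufferedReader(new StringReader(sample)), " ");
        System.out.println(map.get(1)); // [Avc, def, efg, jksjd]
    }
}
```

Note that this keeps the entire file in memory, which is exactly the property that becomes a problem for very large inputs.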

I am using Apache POI to write into Excel. When writing into Excel using Apache POI, I am following this approach; here is my code snippet:

import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

import org.apache.poi.hssf.usermodel.HSSFSheet;
import org.apache.poi.hssf.usermodel.HSSFWorkbook;
import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.Row;

HSSFWorkbook workbook = new HSSFWorkbook();
HSSFSheet sheet = workbook.createSheet("Sample sheet");
// excelDataHolder is my map, populated from the text file
Map<Integer, List<Object>> excelDataHolder = new LinkedHashMap<Integer, List<Object>>();
int rownum = 0;
for (Integer key : excelDataHolder.keySet()) {
    Row row = sheet.createRow(rownum++);
    List<Object> objList = excelDataHolder.get(key);
    int cellnum = 0;
    for (Object obj : objList) {
        Cell cell = row.createCell(cellnum++);
        cell.setCellValue(obj.toString()); // the values here are strings, not dates
    }
}

This works quite well if the number of lines/records to be written into Excel is small. But imagine the records number in the billions, or the text file has 100,000 or more lines. I think my approach fails then, because createRow and createCell create more than 100,000 objects on the heap. Whatever the Java-to-Excel API, I think writing into Excel is based on the same approach, i.e. iterating over a collection as shown above. I tried some examples with Aspose as well, and as a result I guess Aspose has the same problem.


  • Do createRow and createCell create new objects each time they are called?

  • If yes, what is the alternative? How do I write large data into Excel with better performance?

Recommended answer

A recent version of apache-poi has sxssf. Shameless copy from website:

SXSSF (package: org.apache.poi.xssf.streaming) is an API-compatible streaming extension of XSSF to be used when very large spreadsheets have to be produced, and heap space is limited. SXSSF achieves its low memory footprint by limiting access to the rows that are within a sliding window, while XSSF gives access to all rows in the document. Older rows that are no longer in the window become inaccessible, as they are written to the disk.

I have used it for creating a spreadsheet with 1.5 million rows.
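A minimal SXSSF sketch of the approach, assuming the poi-ooxml dependency is on the classpath; the output file name, row counts, and window size of 100 are illustrative assumptions, not values from the answer:

```java
import java.io.FileOutputStream;
import java.io.IOException;

import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.xssf.streaming.SXSSFWorkbook;

public class StreamingWriter {
    public static void main(String[] args) throws IOException {
        // Keep at most 100 rows in memory; older rows are flushed to a temp file on disk.
        SXSSFWorkbook workbook = new SXSSFWorkbook(100);
        try {
            Sheet sheet = workbook.createSheet("Sample sheet");
            for (int rownum = 0; rownum < 100_000; rownum++) {
                Row row = sheet.createRow(rownum);
                for (int cellnum = 0; cellnum < 4; cellnum++) {
                    Cell cell = row.createCell(cellnum);
                    cell.setCellValue("r" + rownum + "c" + cellnum);
                }
            }
            try (FileOutputStream out = new FileOutputStream("large.xlsx")) {
                workbook.write(out);
            }
        } finally {
            workbook.dispose(); // delete the temporary files backing the stream
            workbook.close();
        }
    }
}
```

Because rows outside the sliding window are already flushed to disk, heap use stays roughly constant regardless of how many rows are written, which addresses the createRow/createCell object-count concern above.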
