Extract data from large Excel files


Question

I'm using Pentaho Data Integration to build a transformation that loads xlsx files into MySQL, but I can't import data from large files with the Excel 2007 XLSX (Apache POI Streaming) spreadsheet type. It fails with out-of-memory errors.

Recommended answer

I recommend increasing the JVM memory allocation before running the transformation. By default, Pentaho Data Integration (also known as Kettle) ships with a low memory allocation, which causes problems when running ETL jobs on large files. Raise the -Xmx value in spoon.bat so that it specifies a larger maximum heap size.

If you are using Spoon on Windows, edit the following line in spoon.bat:

if "%PENTAHO_DI_JAVA_OPTIONS%"=="" set PENTAHO_DI_JAVA_OPTIONS="-Xmx512m" "-XX:MaxPermSize=256m"

If you are using Kitchen or Pan, make the same edit in kitchen.bat or pan.bat. On Linux, change the corresponding .sh files instead.
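
As a rough sketch of the Linux case: the spoon.bat line above applies its default only when PENTAHO_DI_JAVA_OPTIONS is empty, so, assuming the .sh launchers check the same variable in the same way, the heap limit can also be raised by exporting that variable before launching instead of editing the script. The 2048m figure and the transformation path in the comment are illustrative assumptions, not values from the original answer.

```shell
# Assumption: the .sh launchers only fall back to their built-in default when
# this variable is unset, mirroring the spoon.bat line above.
# 2048m is an illustrative value; size it to the machine's RAM and the workbook.
export PENTAHO_DI_JAVA_OPTIONS="-Xmx2048m"

# Then start the tool from the same shell, for example (hypothetical path):
#   ./pan.sh -file=/path/to/transformation.ktr
```

Setting the variable in the environment keeps the shipped scripts unmodified, which can be convenient when the same installation runs jobs with different memory needs.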
