XLConnect in R: use of JVM memory


Problem description

I'm running into a problem with JVM memory using XLConnect (Mirai Solutions) in R.

Data loads into R just fine using loadWorkbook or readWorksheetFromFile, but larger data (data frames of about 3 MB) gets stuck while being written to the JVM during export with any of the export functions (writeNamedRegion, writeWorksheetToFile, etc.), and R stops responding.

I've reset the Java parameters using options(java.parameters = "-Xmx1500m"). This increased the size of the data frames I was able to export to Excel, but R still slows down at around 1 MB and stops working at around 3 MB.

I'm on a 64-bit Windows 7 system with 32-bit Office and 32-bit Java, on a machine with 8 GB of RAM. 3 MB doesn't seem very big compared to the ~750 MB of free memory that the JVM supposedly has at the beginning of the export (checked with xlcMemoryReport).

Any ideas?

Recommended answer

Given your reference value of 3 MB, I conclude that you are trying to write a data.frame with numeric variables of about 10 columns x 40k rows (or comparable); the object.size of such a data.frame is approximately 3.2 MB.
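The size estimate above can be reproduced in base R. This is a minimal sketch; the variable names are illustrative and do not come from the question.

```r
# 40,000 rows x 10 numeric columns = 400,000 doubles at 8 bytes each
tmp <- as.data.frame(matrix(rnorm(4e5), ncol = 10))

# In-memory size of the data.frame as reported by R
size_bytes <- as.numeric(object.size(tmp))
print(format(object.size(tmp), units = "MB"))

# The raw numeric payload alone accounts for 3.2 million bytes
payload_bytes <- nrow(tmp) * ncol(tmp) * 8
```

Note that this measures the object inside R only; the memory needed on the JVM side during export can be much larger, as discussed next.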

Depending on whether you are trying to write xls (BIFF8) or xlsx (OOXML) files, memory requirements can be quite different. The reason is that xlsx documents are actually compressed XML files, and Apache POI (the underlying Java API used by XLConnect) uses xmlbeans to manipulate them, which can be quite memory intensive. BIFF8, on the other hand, is a binary data format and requires less memory.
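If the older format is acceptable, one way to act on this is to write to an .xls file instead. The sketch below assumes that XLConnect selects the format from the file extension, and the output path is hypothetical. Keep in mind that BIFF8 sheets are limited to 65,536 rows and 256 columns.

```r
# Request the heap size before any JVM is started in this R session
options(java.parameters = "-Xmx1024m")

tmp <- as.data.frame(matrix(rnorm(4e5), ncol = 10))
out_file <- file.path(tempdir(), "test.xls")  # hypothetical output path

# Guarded so the sketch is skipped when XLConnect (and Java) are unavailable
if (requireNamespace("XLConnect", quietly = TRUE)) {
  # Assumption: the .xls extension selects binary BIFF8 rather than OOXML
  XLConnect::writeWorksheetToFile(file = out_file, data = tmp, sheet = "test")
}
```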

You should be able to write a data.frame of the dimensions mentioned above to an xlsx document with a maximum heap size of 1024m; the following worked fine for me:

# Set the JVM heap size; this must happen BEFORE any JVM is initialized in R
options(java.parameters = "-Xmx1024m")
library(XLConnect)
# 40,000 rows x 10 numeric columns (about 3.2 MB in R)
tmp <- as.data.frame(matrix(rnorm(4e5), ncol = 10))
writeWorksheetToFile(tmp, file = "test.xlsx", sheet = "test")

... using 32-bit R 2.15.1 with RStudio, XLConnect 0.2-0 and JRE 1.6.0_25 (running on 32-bit Windows XP with 4 GB of RAM).
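Because the heap option only has an effect when it is set before the JVM starts, it can help to check what the JVM actually received. The following is a hedged sketch that queries the JVM through rJava (the package XLConnect uses to talk to Java); the variable names are illustrative.

```r
# Must run before any package has started a JVM in this R session
options(java.parameters = "-Xmx1024m")

if (requireNamespace("rJava", quietly = TRUE)) {
  # Starts the JVM if none is running, using getOption("java.parameters")
  rJava::.jinit()
  # Equivalent to Runtime.getRuntime().maxMemory() in Java
  runtime <- rJava::.jcall("java/lang/Runtime", "Ljava/lang/Runtime;", "getRuntime")
  max_heap_mb <- rJava::.jcall(runtime, "J", "maxMemory") / 1024^2
  print(max_heap_mb)
}
```

If the printed value is far below the requested size, the JVM was most likely already running when the option was set; restarting R and setting the option first is the usual remedy.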

For those interested in a more in-depth discussion of memory usage on the Apache POI side, see: http://apache-poi.1045710.n5.nabble.com/HSSF-and-XSSF-memory-usage-some-numbers-td4312784.html
