Workaround to R memory leak with XML package


Question

I am trying to run a simple program that extracts tables from HTML code. However, readHTMLTable in the XML package seems to have a memory issue. Is there an easy way to work around it, such as somehow assigning dedicated memory to this command and then freeing it manually?

I have tried wrapping the call in a function, calling gc(), and using different versions of R and of this package, but nothing seems to work. I am getting desperate.

Example code. How can I run this without memory usage exploding?

library(XML)
a = readLines("http://en.wikipedia.org/wiki/2014_FIFA_World_Cup")
while(TRUE) {
    b = readHTMLTable(a)
    #do something with b
}

Something like this still uses up all of my memory:

library(XML)
a = readLines("http://en.wikipedia.org/wiki/2014_FIFA_World_Cup")
f <- function(x) {
    b = readHTMLTable(x)
    rm(x)
    gc()
    return(b)
}

for(i in 1:100) {
    d = f(a)
    rm(d)
    gc()
}
rm(list=ls())
gc()

I am using Windows 7 and have tried both 32-bit and 64-bit.

Recommended answer

As of XML 3.98-1.4 and R 3.1 on Windows 7, this problem can be solved by calling free() on the parsed document. However, this does not work with readHTMLTable(). The following code runs without the memory growth:

library(XML)
a = readLines("http://en.wikipedia.org/wiki/2014_FIFA_World_Cup")
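while(TRUE){
   # parse the page text into an internal (C-level) document
   b = xmlParse(paste(a, collapse = ""))
   #do something with b
   # explicitly release the C-level memory held by the document
   free(b)
}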
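
If the goal is still to get the tables out, one possible variation is to parse the page yourself, hand the parsed document to readHTMLTable(), and then free that document. This is only a sketch under assumptions: the helper name read_tables_and_free and the small inline HTML string are hypothetical, and the answer above does not confirm that this pattern avoids the leak on every version of R and XML.

```r
library(XML)

# Hypothetical helper (not part of the XML package): parse the HTML text,
# extract the tables, and free the C-level document when the function exits.
read_tables_and_free <- function(html_text) {
  doc <- htmlParse(html_text, asText = TRUE)
  on.exit(free(doc), add = TRUE)
  # readHTMLTable() also accepts an already parsed document
  readHTMLTable(doc, stringsAsFactors = FALSE)
}

# Small stand-in page so the sketch runs without network access
html_text <- paste0(
  "<html><body><table>",
  "<tr><th>name</th><th>value</th></tr>",
  "<tr><td>A</td><td>1</td></tr>",
  "</table></body></html>"
)

for (i in 1:100) {
  tables <- read_tables_and_free(html_text)
  # do something with tables
}
```

The data frames returned by readHTMLTable() are ordinary R objects, so they remain usable after the document itself has been freed.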

The xml2 package has similar issues; there the memory can be released by calling xml_remove() followed by gc().
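
A rough sketch of the xml2 variant, assuming the function meant is xml2's xml_remove() and that its free = TRUE argument is wanted here. The inline HTML string is a hypothetical stand-in, and how much memory this actually releases is not confirmed by the answer.

```r
library(xml2)

# Small stand-in page so the sketch runs without network access
html_text <- "<html><body><table><tr><td>A</td><td>1</td></tr></table></body></html>"

doc <- read_html(html_text)
cells <- xml_find_all(doc, ".//td")

# Copy what is needed into plain R objects before removing anything
cell_text <- xml_text(cells)

# Remove the nodes and free their memory; the nodes must not be used afterwards
xml_remove(cells, free = TRUE)

# Drop the remaining references and let the garbage collector run
rm(cells, doc)
invisible(gc())
```

Extracting the text first matters because nodes removed with free = TRUE are no longer safe to touch.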
