R memory leak using C external pointers
Question
I'm trying to use external pointers in a package, but I ran into an issue where it seemed like the finalizer was not being called and memory leaked.
Below is an extremely contrived example of the issue:
#include <Rcpp.h>
using namespace Rcpp;

// Finalizer: called by R when the external pointer is garbage-collected.
void finalize(SEXP xp) {
    delete static_cast<std::vector<double> *>(R_ExternalPtrAddr(xp));
}

// [[Rcpp::export]]
SEXP ext_ref_ex() {
    // About 8 MB allocated on the C++ heap, outside R's memory accounting.
    std::vector<double> *x = new std::vector<double>(1000000);
    SEXP xp = PROTECT(R_MakeExternalPtr(x, R_NilValue, R_NilValue));
    R_RegisterCFinalizer(xp, finalize);
    UNPROTECT(1);
    return xp;
}
On the R side:
library(Rcpp)
sourceCpp("tests.cpp")

# breaks and/or crashes
for (i in 1:10000) {
  z <- ext_ref_ex()
}

# no issue
for (i in 1:10000) {
  z <- ext_ref_ex()
  rm(z)
  gc()
}
Running the first loop, R eventually segfaults after enough iterations (issue #1), whereas the expected behavior is that the data should be cleaned up and there should be no segfault.
Issue #2 is that if you interrupt the process and call gc(), the memory is sometimes freed, but usually it is not. According to htop, R still uses 60-70% of the memory, even after rm(list=ls()) and gc().
The second loop experiences no apparent memory issues.
Am I doing something wrong on the C side? Am I running into a bug?
(R version 3.5.2, Ubuntu 18.04 on Windows.)
Recommended answer
I can reproduce the issue even when using Rcpp instead of the C API for creating the external pointer and registering the finalizer:
#include <Rcpp.h>

// [[Rcpp::export]]
Rcpp::XPtr< std::vector<double> > ext_ref_ex() {
    std::vector<double> *x = new std::vector<double>(1000000);
    // The second argument (true) asks Rcpp to register a finalizer that deletes x.
    Rcpp::XPtr< std::vector<double> > xp(x, true);
    return xp;
}
For me, just including gc() in the loop is enough to fix the issue:
for (i in 1:10000) {
  z <- ext_ref_ex()
  gc() # crash without this line
}
So the issue seems to be not "finalizer not running" but "garbage collection not running". My interpretation: you are allocating a lot of memory for the vector and only a little for the external pointer. R knows only about the external pointer, so when that pointer becomes unreachable, R sees no reason to run the garbage collector.
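To make that interpretation concrete, here is a toy model in plain C++. It is a sketch under stated assumptions, not R's actual collector: the `ToyHeap` type, the 1 MiB trigger threshold, and the 56-byte handle size are all hypothetical values chosen for illustration. Only the payload size (1,000,000 doubles, i.e. 8,000,000 bytes) comes from the example above. The model triggers a collection solely from the bytes it tracks itself, which is the behavior the explanation attributes to R.

```cpp
#include <cstdint>

// Toy model only: this is NOT how R's collector is implemented.
// The heap decides when to collect purely from the bytes it tracks itself.
struct ToyHeap {
    std::uint64_t threshold;          // collect once tracked bytes exceed this
    std::uint64_t tracked = 0;        // bytes the collector can see (the handles)
    std::uint64_t external = 0;       // bytes allocated behind its back (the payloads)
    std::uint64_t peak_external = 0;  // high-water mark of external bytes
    int collections = 0;              // number of collections performed

    explicit ToyHeap(std::uint64_t t) : threshold(t) {}

    // Allocate one small handle that owns a large external payload.
    // Earlier handles are assumed unreachable, like `z` being overwritten.
    void allocate(std::uint64_t handle_bytes, std::uint64_t payload_bytes) {
        tracked += handle_bytes;
        external += payload_bytes;
        if (external > peak_external) peak_external = external;
        if (tracked > threshold) collect();  // payload size plays no role here
    }

    // Simplification: every handle is treated as garbage, and its
    // "finalizer" releases the payload along with the handle.
    void collect() {
        tracked = 0;
        external = 0;
        ++collections;
    }
};

// Peak external memory for a loop like the one in the question.
// explicit_gc mimics calling gc() in every iteration.
std::uint64_t peak_external_bytes(int iterations, bool explicit_gc) {
    const std::uint64_t kThreshold = 1u << 20;     // hypothetical 1 MiB trigger
    const std::uint64_t kHandleBytes = 56;         // hypothetical handle size
    const std::uint64_t kPayloadBytes = 8000000;   // 1,000,000 doubles
    ToyHeap heap(kThreshold);
    for (int i = 0; i < iterations; ++i) {
        heap.allocate(kHandleBytes, kPayloadBytes);
        if (explicit_gc) heap.collect();
    }
    return heap.peak_external;
}
```

Under these assumptions, `peak_external_bytes(10000, false)` never reaches the trigger, because the handles add up to only 560,000 tracked bytes, so the external payloads pile up to 80,000,000,000 bytes. `peak_external_bytes(10000, true)` stays at 8,000,000 bytes, which mirrors the difference between the two R loops.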