Passing a data.frame by reference and updating it with Rcpp

Question

Looking at the Rcpp documentation and the Rcpp::DataFrame examples in the gallery, I realized that I didn't know how to modify a DataFrame by reference. Googling a bit, I found this post on SO and this post on the archive. Nothing obvious came up, so I suspect I am missing something big, such as "it is already the case because ..." or "it does not make sense because ...".

I tried the following, which compiled, but the data.frame object passed to updateDFByRef stayed untouched in R:

#include <Rcpp.h>
using namespace Rcpp;

// [[Rcpp::export]]
void updateDFByRef(DataFrame& df) {
    int N = df.nrows();
    NumericVector newCol(N,1.);
    df["newCol"] = newCol;
    return;
}

Recommended answer

The way DataFrame::operator[] is implemented does indeed lead to a copy when you do this:

df["newCol"] = newCol;

To do what you want, you need to consider what a data frame is: a list of vectors with certain attributes. You can then grab the data from the original by copying the vectors (the pointers, not their contents) into a new list.

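Something like this does it. It is a little more work, but not that hard. Note that the function returns the new data frame instead of modifying df in place, so the caller has to use the returned value.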

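#include <Rcpp.h>
using namespace Rcpp;

// Returns a new data frame that reuses the columns of df and adds one more.
// [[Rcpp::export]]
List updateDFByRef(DataFrame& df, std::string name) {
    int nr = df.nrows(), nc = df.size();
    NumericVector newCol(nr, 1.);          // new column filled with 1.0
    List out(nc + 1);
    CharacterVector onames = df.attr("names");
    CharacterVector names(nc + 1);
    // Copy the column pointers (not the data) and the column names.
    for (int i = 0; i < nc; i++) {
        out[i] = df[i];
        names[i] = onames[i];
    }
    out[nc] = newCol;
    names[nc] = name;
    // Carry over the attributes that make the list a data.frame.
    out.attr("class") = df.attr("class");
    out.attr("row.names") = df.attr("row.names");
    out.attr("names") = names;
    return out;
}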

This approach has its pitfalls. The original data frame and the one you create share the same vectors, so modifying a column in place through one of them also changes the other. Use it only if you know what you are doing.
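The sharing hazard can be illustrated without R at all. The following is a minimal sketch in plain C++, not Rcpp code: `Frame`, `Column`, `append_shallow` and `deep_copy` are hypothetical names invented for this illustration, and `std::shared_ptr` is only an assumed stand-in for the way an R list holds pointers to its vectors.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an R data frame: a list of named columns.
// Each column is held through a shared pointer, mimicking a list that
// stores pointers to its vectors rather than the vectors themselves.
using Column = std::shared_ptr<std::vector<double>>;

struct Frame {
    std::vector<std::string> names;
    std::vector<Column> columns;
};

// Shallow append: copies the column pointers (not the data) into a new
// frame and adds one extra column, like the loop `out[i] = df[i]`.
Frame append_shallow(const Frame& df, const std::string& name, Column col) {
    // The new column must have as many rows as the existing ones.
    assert(df.columns.empty() || df.columns[0]->size() == col->size());
    Frame out = df;                 // copies names and pointers only
    out.names.push_back(name);
    out.columns.push_back(std::move(col));
    return out;
}

// Deep copy: duplicates every column so nothing is shared afterwards.
Frame deep_copy(const Frame& df) {
    Frame out;
    out.names = df.names;
    for (const Column& c : df.columns)
        out.columns.push_back(std::make_shared<std::vector<double>>(*c));
    return out;
}
```

In this model, the frame returned by `append_shallow` has one more column than `df`, while `df` itself keeps its original column count. Writing to an existing column through either frame is visible through the other, because both hold the same pointer. A frame produced by `deep_copy` is independent. In Rcpp itself, `Rcpp::clone` is the usual way to obtain a deep copy of an object before modifying it in place.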
