Don't want original data.table to be modified when passed to a function


Problem description

I am a fan of data.table, as well as of writing reusable functions for all current and future needs.

Here's a challenge I ran into while working on the answer to this question: Best way to plot automatically all data.table columns using ggplot2

We pass a data.table to a function for plotting, and the original data.table gets modified, even though inside the function we assigned it to a new variable, intending that as a copy to prevent exactly this.

Here's a simple example to illustrate:

plotYofX <- function(.dt,x,y) {
  dt <- .dt
  dt[, (c(x,y)) := lapply(.SD, function(x) {as.numeric(x)}), .SDcols = c(x,y)]
  ggplot(dt) + geom_step(aes(x=get(names(dt)[x]), y=get(names(dt)[y]))) + labs(x=names(dt)[x], y=names(dt)[y])
}


> dtDiamonds <- data.table(ggplot2::diamonds[2:5,1:3]); 
> dtDiamonds
   carat     cut color
   <num>   <ord> <ord>
1:  0.21 Premium     E
2:  0.23    Good     E
3:  0.29 Premium     I
4:  0.31    Good     J

> plotYofX(dtDiamonds,1,2); 
> dtDiamonds
    carat   cut color
    <num> <num> <ord>
1:  0.21     4     E
2:  0.23     2     E
3:  0.29     4     I
4:  0.31     2     J

I've seen many posts on various issues related to using := inside a function, but could not find any that helped me resolve this seemingly very simple issue. (Of course, I don't want to convert it back to a data.frame to achieve the desired outcome.)

Solution

Thanks to the comments/answers above: this would be the easiest solution for this particular function (i.e. no need to introduce any additional .dt variable at all):

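plotYofX <- function(dt,x,y) {
  # No := here: this builds a new two-column data.table and rebinds the local
  # name dt to it, so the caller's data.table is left untouched
  dt <- dt[, lapply(.SD, as.numeric), .SDcols = c(x,y)]
  # The new table holds only the selected columns, in positions 1 and 2
  ggplot(dt) + geom_step(aes(x=get(names(dt)[1]), y=get(names(dt)[2]))) + labs(x=names(dt)[1], y=names(dt)[2])
}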

However, it was also important to learn that, when working with data.table, one should be particularly careful not to make "copies" of it with the regular <- operator. Assigning with <- only creates a second name for the same data.table, so a later := through either name modifies the original. Use copy(dt) instead, so as not to corrupt the original data.table!
This is discussed in further detail here: Understanding exactly when a data.table is a reference to (vs a copy of) another data.table
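
To make the difference between <- and copy() concrete, here is a minimal sketch. The toy table and the names `original`, `alias`, `independent`, and `toNumericCopy` are made up for illustration and are not from the post above; only `data.table()`, `copy()`, and `:=` come from the data.table package.

```r
library(data.table)

# Hypothetical toy table standing in for dtDiamonds
original <- data.table(carat = c(0.21, 0.23),
                       cut   = factor(c("Premium", "Good")))

# Plain assignment: `alias` is just a second name for the same data.table,
# so := through `alias` also changes what `original` refers to
alias <- original
alias[, cut := as.numeric(cut)]
changed_via_alias <- class(original$cut)

# Rebuild the table, then use copy() to get an independent data.table
original <- data.table(carat = c(0.21, 0.23),
                       cut   = factor(c("Premium", "Good")))
independent <- copy(original)
independent[, cut := as.numeric(cut)]
kept_by_copy <- class(original$cut)

# The same idea inside a function (toNumericCopy is a hypothetical helper)
toNumericCopy <- function(.dt, cols) {
  dt <- copy(.dt)  # work on a private copy of the argument
  dt[, (cols) := lapply(.SD, as.numeric), .SDcols = cols]
  dt[]
}
converted <- toNumericCopy(original, "cut")
```

With this setup, `changed_via_alias` should report a numeric column while `kept_by_copy` still reports a factor, and calling `toNumericCopy()` leaves `original` as it was. The trade-off is that copy() duplicates the whole table, which may matter for very large data.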
