Don't want original data.table to be modified when passed to a function
Problem description
I am a fan of data.table, and of writing reusable functions for all current and future needs.
Here's a challenge I ran into while working on the answer to this question: "Best way to plot automatically all data.table columns using ggplot2".
We pass a data.table to a function for plotting, and then the original data.table gets modified, even though we made a copy of it to prevent that.
Here is some simple code to illustrate:
    library(data.table)
    library(ggplot2)

    plotYofX <- function(.dt, x, y) {
      dt <- .dt
      dt[, (c(x, y)) := lapply(.SD, function(x) {as.numeric(x)}), .SDcols = c(x, y)]
      ggplot(dt) + geom_step(aes(x = get(names(dt)[x]), y = get(names(dt)[y]))) + labs(x = names(dt)[x], y = names(dt)[y])
    }
    > dtDiamonds <- data.table(ggplot2::diamonds[2:5, 1:3])
    > dtDiamonds
       carat     cut color
       <num>   <ord> <ord>
    1:  0.21 Premium     E
    2:  0.23    Good     E
    3:  0.29 Premium     I
    4:  0.31    Good     J
    > plotYofX(dtDiamonds, 1, 2)
    > dtDiamonds
       carat   cut color
       <num> <num> <ord>
    1:  0.21     4     E
    2:  0.23     2     E
    3:  0.29     4     I
    4:  0.31     2     J
I've seen many posts on various issues related to using := inside a function, but could not find any that helped me resolve this seemingly very easy issue. (Of course, I don't want to convert it back to a data.frame to achieve the desired outcome.)
Solution
Thanks to the comments/answers above: this would be the easiest solution for this particular function (i.e. there is no need to introduce any additional .dt variable at all):
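    plotYofX <- function(dt, x, y) {
      # No := here, so dt is not modified by reference. Note that the converted
      # columns are returned by this call rather than stored back into dt.
      dt[, lapply(.SD, function(x) {as.numeric(x)}), .SDcols = c(x, y)]
      ggplot(dt) + geom_step(aes(x = get(names(dt)[x]), y = get(names(dt)[y]))) + labs(x = names(dt)[x], y = names(dt)[y])
    }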
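If the numeric versions of the columns are actually needed afterwards, one option is to keep the result of the non-`:=` call in a new variable. The following is only a sketch: the helper name `numericCols` and the toy table are assumptions made for illustration and are not part of the original post.

```r
library(data.table)

# Hypothetical helper: returns a NEW data.table holding only columns x and y,
# converted to numeric. No := is used, so the input table is left untouched.
numericCols <- function(dt, x, y) {
  dt[, lapply(.SD, as.numeric), .SDcols = c(x, y)]
}

# Toy data invented for this sketch
dtOrig <- data.table(size = c(0.21, 0.23), grade = factor(c("Premium", "Good")))
dtNum  <- numericCols(dtOrig, 1, 2)

is.factor(dtOrig$grade)   # checks that the original column is still a factor
is.numeric(dtNum$grade)   # checks that the converted column lives in the new table
```

The returned table contains only the selected columns, so any plotting code would refer to them by name or by their new positions rather than by the indices of the original table.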
However, it was also important to learn that when working with data.table, one should be particularly careful not to make "copies" of it with the regular <- operator, and to use copy(dt) instead, so as not to corrupt the original data.table!
This is further discussed in detail here: Understanding exactly when a data.table is a reference to (vs a copy of) another data.table
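To make the difference between `<-` and `copy()` concrete, here is a minimal sketch. The toy table and the helper name `toNumericOnCopy` are assumptions for illustration only; `copy()` and `address()` are functions from the data.table package.

```r
library(data.table)

# Toy data invented for this sketch
dtOrig <- data.table(a = 1:3, g = factor(c("lo", "hi", "lo")))

# Plain assignment binds a second name to the same object; nothing is copied.
dtAlias <- dtOrig
sameObject <- address(dtAlias) == address(dtOrig)  # compares the two memory addresses
dtAlias[, a := a * 10L]        # := works in place, so dtOrig$a changes too

# Hypothetical helper: copy() first, so := only touches the private copy.
toNumericOnCopy <- function(.dt, cols) {
  dt <- copy(.dt)
  dt[, (cols) := lapply(.SD, as.numeric), .SDcols = cols]
  dt[]
}

dtNum <- toNumericOnCopy(dtOrig, c("a", "g"))
is.factor(dtOrig$g)    # checks that the original keeps its factor column
is.numeric(dtNum$g)    # checks that only the copy was converted
```

The trade-off is memory: `copy()` duplicates the whole table, which may matter for large data, whereas the non-`:=` form shown earlier only materialises the selected columns.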