data.table 1.8.1: "DT1 = DT2" is not the same as DT1 = copy(DT2)?


Problem description


I've noticed some behaviour in data.table that seems inconsistent (to me, at least) when using different assignment operators. I have to admit I never quite got the difference between "=" and copy(), so maybe we can shed some light here. If you use "=" or "<-" instead of copy() below, then changing the copied data.table changes the original data.table as well.

Run the following commands and you will see what I mean:

library(data.table)
example(data.table)

DT
   x y  v
1: a 1 42
2: a 3 42
3: a 6 42
4: b 1  4
5: b 3  5
6: b 6  6
7: c 1  7
8: c 3  8
9: c 6  9

DT2 = DT

Now I'll change the v column of DT2:

DT2[ ,v:=3L]
   x y  v
1: a 1  3
2: a 3  3
3: a 6  3
4: b 1  3
5: b 3  3
6: b 6  3
7: c 1  3
8: c 3  3
9: c 6  3

But look what happened to DT:

DT
   x y  v
1: a 1  3
2: a 3  3
3: a 6  3
4: b 1  3
5: b 3  3
6: b 6  3
7: c 1  3
8: c 3  3
9: c 6  3

It changed as well. So changing DT2 changed the original DT. That is not the case if I use copy():

example(data.table)  # reset DT
DT3 <- copy(DT)
DT3[, v:= 3L]
   x y  v
1: a 1  3
2: a 3  3
3: a 6  3
4: b 1  3
5: b 3  3
6: b 6  3
7: c 1  3
8: c 3  3
9: c 6  3

DT
   x y  v
1: a 1 42
2: a 3 42
3: a 6 42
4: b 1  4
5: b 3  5
6: b 6  6
7: c 1  7
8: c 3  8
9: c 6  9

Is this behaviour expected?

Solution

Yes. This is expected behaviour, and well documented.

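data.table modifies the original object in place, by reference, instead of copying it, which is why it is very fast. After DT2 = DT, both names refer to the same object, so a change made with := through one name is visible through the other.

For this reason, if you really want to copy the data, you need to use copy(DT).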
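
A minimal sketch of the difference, assuming a small made-up table rather than the one from example(data.table). It uses data.table's address() to check whether two names point at the same object:

```r
library(data.table)

# Small illustrative table (not the one produced by example(data.table)).
DT <- data.table(x = c("a", "b", "c"), v = c(42L, 4L, 5L))

DT2 <- DT        # plain assignment: a second name for the same object
DT3 <- copy(DT)  # explicit copy: a separate object

same_as_DT2 <- address(DT) == address(DT2)  # TRUE: same object
same_as_DT3 <- address(DT) == address(DT3)  # FALSE: different object

DT2[, v := 3L]   # modifies the shared object in place

DT$v    # changed too, because DT and DT2 are the same object
DT3$v   # untouched, because DT3 is an independent copy
```

The plain assignment is cheap because nothing is copied; copy() pays the cost of duplicating the data in exchange for independence.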


From the documentation for ?copy:

The data.table is modified by reference, and returned (invisibly) so it can be used in compound statements; e.g., setkey(DT,a)[J("foo")]. If you require a copy, take a copy first (using DT2=copy(DT)). copy() may also sometimes be useful before := is used to subassign to a column by reference. See ?copy.
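
The advice quoted above can be sketched as follows. The table and the helper name with_v_reset are hypothetical, made up for illustration; the helper takes a copy first so that := inside it does not touch the caller's table, and the last line chains setkey() on a copy so the original stays unkeyed:

```r
library(data.table)

# Small illustrative table (hypothetical data).
DT <- data.table(x = c("b", "a", "c"), v = c(4L, 42L, 7L))

# Hypothetical helper: copy first, then modify the copy by reference.
with_v_reset <- function(dt, value = 3L) {
  out <- copy(dt)      # private copy; the caller's table is left alone
  out[, v := value]    # := changes only the copy
  out[]                # return the modified copy visibly
}

DT_new <- with_v_reset(DT)
DT$v       # original values are unchanged
DT_new$v   # all set to 3

# setkey() returns its modified input invisibly, so it can be chained.
# Keying a copy leaves DT itself unkeyed and in its original row order.
res <- setkey(copy(DT), x)[J("a")]
```

Without the copy() inside the helper, the := call would change the table passed in by the caller, which is the same effect the question observed with DT2 = DT.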

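See also this question: "Understanding exactly when a data.table is a reference to vs a copy of another" (http://stackoverflow.com/questions/10225098/understanding-exactly-when-a-data-table-is-a-reference-to-vs-a-copy-of-another).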
