Assign value to specific data.table columns and rows


Question

I'm still getting to know this great package. Could anyone please explain the reason for this error? Thanks!

library(data.table)

DT <- data.table(id   = LETTERS,
                 var1 = rnorm(26),
                 var2 = rnorm(26))

> DT[2, list(var1, var2)]
            var1          var2
1: -0.8628479332 -0.2367492928
> DT[2, c(var1, var2)]
[1] -0.8628479332 -0.2367492928
> 
> DT[2, list(var1, var2)] <- DT[8, list(var1, var2)]
Error in `[<-.data.table`(`*tmp*`, 2, list(var1, var2), value = list(var1 = -0.394006912428776,  : 
  object 'var1' not found
> DT[2, c(var1, var2)] <- DT[8, c(var1, var2)]
Error in `[<-.data.table`(`*tmp*`, 2, c(var1, var2), value = c(-0.394006912428776,  : 
  object 'var1' not found


Recommended answer

First, it is recommended to use := instead of [<- for efficiency; [<- is mostly provided for backward consistency. So I'll first illustrate how to use := efficiently to get what you're after. := assigns by reference: it updates a data.table without copying the data, and is therefore extremely fast.

require(data.table)
DT <- data.table(x = 1:5, y = 6:10, z = 11:15)

Suppose you want to set the 2nd row of "y" to the value in the 5th row of "y":

DT[2, y := DT[5, y]] 

or equivalently

DT[2, `:=`(y = DT[5, y])]

Suppose you want to set the 2nd row of both "y" and "z" to the corresponding entries in row 5:

DT[2, c("y", "z") := as.list(DT[5, c(y, z)])]

or equivalently

DT[2, `:=`(y = DT[5, y], z = DT[5, z])]
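
As a rough check of the two forms above, the sketch below runs each on its own copy of the toy table and compares the results. It also compares the table's memory address before and after the update, which is one way to see "by reference" at work. The names dt_a, dt_b and addr_before are illustrative and not part of the original answer; address() and copy() are data.table helpers.

```r
library(data.table)

# dt_a and dt_b are illustrative names, not from the original answer
dt_a <- data.table(x = 1:5, y = 6:10, z = 11:15)
dt_b <- copy(dt_a)             # independent copy so the two forms can be compared

addr_before <- address(dt_a)   # memory address before the update

# Form 1: character vector of column names on the LHS of :=
dt_a[2, c("y", "z") := as.list(dt_a[5, c(y, z)])]

# Form 2: functional form of :=
dt_b[2, `:=`(y = dt_b[5, y], z = dt_b[5, z])]

# Both forms should leave the same values in y and z
same_values  <- identical(dt_a$y, dt_b$y) && identical(dt_a$z, dt_b$z)
# The table itself should still live at the same address
same_address <- identical(addr_before, address(dt_a))
print(c(same_values = same_values, same_address = same_address))
```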






Now, just to show you how to assign using [<- (although it is clearly not recommended), it can be done as follows:

DT <- data.table(x = 1:5, y = 6:10, z = 11:15)
DT[1, c("y", "z")] <- as.list(DT[5, c(y, z)])

or equivalently, you can also pass the column numbers:

DT[1, 2:3] <- as.list(DT[5, c(y, z)])
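
A small sketch, under the same toy data, to confirm that addressing the columns by name and by position through [<- ends up with the same table. The names by_name and by_pos are made up for this illustration.

```r
library(data.table)

# by_name and by_pos are illustrative names, not from the original answer
by_name <- data.table(x = 1:5, y = 6:10, z = 11:15)
by_pos  <- copy(by_name)

# Columns given as a character vector
by_name[1, c("y", "z")] <- as.list(by_name[5, c(y, z)])

# Same columns given by position (y is column 2, z is column 3)
by_pos[1, 2:3] <- as.list(by_pos[5, c(y, z)])

# Compare the updated columns of the two tables
same_result <- identical(by_name$y, by_pos$y) && identical(by_name$z, by_pos$z)
print(same_result)
```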

Now, as for why your original code gives an error:

First, the RHS has to be a list for [<-.data.table if more than one column is being assigned to.

Second, the j argument on the left of <- is not evaluated within the environment of your data.table, so it needs to know what the values for j are. Since you provide var1 and var2 without the double quotes that would make them a character vector, they are taken to be variables. It does not "see" the columns of your data.table as variables (as it normally does when you subset on the RHS of <-), so it looks for var1 and var2 in its parent environment, which is the global environment, does not find them there, and you get the error. For example, try this:

y <- "y"
z <- "z"
# And now try your second case: 
DT[2, c(y, z)] <- as.list(DT[5, c(y, z)])
# the left side takes values from the assignments you made above
# the right side y and z are evaluated within the environment of your data.table
# and so it sees the columns y and z as variables and their values are picked accordingly
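
Applied to the table from the question, the first two points can be sketched as follows. This assumes no objects named var1 or var2 exist in the calling environment; outcome and rows_match are names introduced only for this illustration.

```r
library(data.table)

DT <- data.table(id   = LETTERS,
                 var1 = rnorm(26),
                 var2 = rnorm(26))

# Unquoted names in j on the LHS of <- are looked up outside DT, so this fails
# (assuming var1 and var2 are not defined in the calling environment).
outcome <- tryCatch({
  DT[2, c(var1, var2)] <- DT[8, c(var1, var2)]
  "assigned"
}, error = function(e) "error")
print(outcome)

# Quoted column names on the LHS and a list on the RHS work.
DT[2, c("var1", "var2")] <- as.list(DT[8, c(var1, var2)])

# Row 2 should now hold the same var1 and var2 values as row 8
rows_match <- identical(DT[2, c(var1, var2)], DT[8, c(var1, var2)])
print(rows_match)
```

The by-reference form recommended earlier, DT[2, c("var1", "var2") := as.list(DT[8, c(var1, var2)])], should give the same values without copying the table.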

Third, the [<-.data.table function accepts only atomic (vector) types for the j argument. So your first assignment, DT[2, list(var1, var2)] <- DT[8, list(var1, var2)], would still give an error even if you supplied the column names the right way, that is:

y <- "y"
z <- "z"
DT[2, list(y, z)] <- as.list(DT[5, c(y, z)])

# Error in `[<-.data.table`(`*tmp*`, 2, list(y, z), value = list(10L, 15L)) : 
#   j must be atomic vector, see ?is.atomic

Hope this helps.

To see the difference in copying between := and [<-, compare the output of tracemem():

DT <- data.table(x = 1:5, y = 6:10, z = 11:15)
tracemem(DT)
# [1] "<0x7fbefb89b580>"

DT[1, c("y", "z") := list(100L, 110L)]
tracemem(DT)
# [1] "<0x7fbefb89b580>"

DT[2, c("y", "z")] <- list(200L, 201L)
# tracemem[0x7fbefacc4fa0 -> 0x7fbefd297838]: # copied, inefficient
