Removing multiple columns from R data.table with parameter for columns to remove


Problem description


I'm trying to manipulate a number of data.tables in similar ways, and would like to write a function to accomplish this. I would like to pass in a parameter containing the columns on which the operations should be performed. This works fine when the vector of columns is written out directly on the left-hand side of the := operator, but not if it is declared earlier (or passed into the function). The following code shows the issue.

library(data.table)

dt = data.table(a = letters, b = 1:2, c = 1:13)
colsToDelete = c('b', 'c')
dt[, colsToDelete := NULL]  # doesn't work, but I don't understand why not
dt[, c('b', 'c') := NULL]   # works fine, but doesn't allow passing in the columns

The error is "Adding new column 'colsToDelete' then assigning NULL (deleting it)." So clearly, it's interpreting 'colsToDelete' as a new column name.

The same issue occurs when doing something along these lines:

dt[, colNames := lapply(.SD, adjustValue, y=factor), .SDcols = colNames]

I'm new to R, but rather more experienced with some other languages, so this may be a silly question.

Solution

It's basically because we allow symbols on the LHS of := to add new columns, for convenience, e.g. DT[, col := val]. So, in order to distinguish col itself being the column name from whatever is stored in col being the column names, we check whether the LHS is a name or an expression.

If it's a name, a column with that literal name is added; if it's an expression, it gets evaluated first and the result is used as the column names.

DT[, col := val] # col is the column name.

DT[, (col) := val]  # col gets evaluated and replaced with its value
DT[, c(col) := val] # same as above
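
The difference between the two forms can be seen on a small table. This is only an illustrative sketch: the table contents and the values assigned are made up for the example and are not from the question.

```r
library(data.table)

# Illustrative data (an assumption for this sketch)
DT  <- data.table(a = 1:3, b = 4:6)
col <- "b"                 # a variable that holds a column name

DT[, col := 0L]            # bare symbol: adds a new column literally named "col"
names(DT)                  # "a" "b" "col"

DT[, (col) := a * 10L]     # parenthesised: col is evaluated to "b", so b is overwritten
DT$b                       # 10 20 30
```

Wrapping the symbol in parentheses is enough to turn it from a name into an expression, which is why the LHS is then evaluated.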

The preferred idiom is: dt[, (colsToDelete) := NULL]
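
Applied to the original goal of passing the columns into a function, the idiom might be wrapped roughly as follows. The helper names dropCols and adjustCols are hypothetical, the sample table is made up, and the doubling function merely stands in for the asker's adjustValue, which is not shown in the question.

```r
library(data.table)

# Hypothetical helper: delete the columns named in `cols` from `dt`, by reference.
dropCols <- function(dt, cols) {
  cols <- intersect(cols, names(dt))      # skip names that are not in the table
  if (length(cols)) dt[, (cols) := NULL]  # (cols) is evaluated to the column names
  invisible(dt)
}

# Hypothetical helper: overwrite each column named in `cols` with fun(column).
adjustCols <- function(dt, cols, fun) {
  dt[, (cols) := lapply(.SD, fun), .SDcols = cols]
  invisible(dt)
}

# Illustrative data (an assumption for this sketch)
dt <- data.table(a = letters[1:4], b = 1:4, c = 5:8)

adjustCols(dt, c("b", "c"), function(x) x * 2L)  # stand-in for adjustValue
doubled <- dt$b                                  # 2 4 6 8

dropCols(dt, c("b", "c"))
names(dt)                                        # "a"
```

Because := modifies the table by reference, both helpers change the caller's dt directly; returning it invisibly just makes chaining possible.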

HTH
