Removing multiple columns from R data.table with parameter for columns to remove
Problem description
I'm trying to manipulate a number of data.tables in similar ways, and would like to write a function to accomplish this. I would like to pass in a parameter containing a list of columns on which the operations would be performed. This works fine when the vector of columns is declared directly on the left-hand side of the := operator, but not if it is declared earlier (or passed into the function). The following code shows the issue.
library(data.table)
dt = data.table(a = letters, b = 1:2, c = 1:13)
colsToDelete = c('b', 'c')
dt[, colsToDelete := NULL] # doesn't work but I don't understand why not.
dt[, c('b', 'c') := NULL] # works fine, but doesn't allow passing in of columns
The error is "Adding new column 'colsToDelete' then assigning NULL (deleting it)." So clearly, it's interpreting 'colsToDelete' as a new column name.
The same issue occurs when doing something along these lines:
dt[, colNames := lapply(.SD, adjustValue, y=factor), .SDcols = colNames]
I'm new to R, but rather more experienced with some other languages, so this may be a silly question.
Recommended answer
It's basically because we allow symbols on the LHS of := to add new columns, for convenience, e.g. DT[, col := val]. So, in order to distinguish col itself being the name from whatever is stored in col being the column names, we check if the LHS is a name or an expression.
If it's a name, it adds the column with the name as such on the LHS, and if it's an expression, then it gets evaluated.
DT[, col := val] # col is the column name.
DT[, (col) := val] # col gets evaluated and replaced with its value
DT[, c(col) := val] # same as above
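The name-versus-expression distinction can be seen in base R itself. The snippet below is only an illustration of how R parses the three forms, not data.table's actual internal check; the variable names lhs_symbol, lhs_paren and lhs_c are made up for the example.

```r
# quote() captures each left-hand side unevaluated, roughly as := would see it
lhs_symbol <- quote(col)      # a bare symbol (a "name")
lhs_paren  <- quote((col))    # a call to `(`, i.e. an expression
lhs_c      <- quote(c(col))   # a call to c(), also an expression

is.name(lhs_symbol)  # TRUE
is.call(lhs_paren)   # TRUE
is.call(lhs_c)       # TRUE

# Only the expression forms are evaluated to obtain the column names
col <- c("b", "c")
eval(lhs_paren)      # "b" "c"
```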
The preferred idiom is: dt[, (colsToDelete) := NULL]
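Applied to the original question, the parenthesised form also works when the column names arrive as a function argument. This is a minimal sketch: dropCols, adjustValue and the multiplier 2L are hypothetical stand-ins, since the question does not show the real function or the value of factor.

```r
library(data.table)

# Hypothetical helper: removes the columns named in `cols` from `dt` by reference
dropCols <- function(dt, cols) {
  dt[, (cols) := NULL]  # parentheses make data.table evaluate `cols`
  invisible(dt)
}

dt <- data.table(a = letters, b = rep(1:2, 13), c = rep(1:13, 2))
dropCols(dt, c("b", "c"))
names(dt)  # "a"

# The same idiom for the lapply(.SD, ...) case from the question
adjustValue <- function(x, y) x * y   # stand-in for the asker's function
colNames <- c("b", "c")
dt2 <- data.table(a = letters, b = rep(1:2, 13), c = rep(1:13, 2))
dt2[, (colNames) := lapply(.SD, adjustValue, y = 2L), .SDcols = colNames]
```

Because := modifies by reference, the caller's table is changed in place; take a copy() first if the original needs to be kept.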
HTH