Removing multiple columns from R data.table with parameter for columns to remove

Question

I'm trying to manipulate a number of data.tables in similar ways, and would like to write a function to accomplish this. I would like to pass in a parameter containing the list of columns the operations should be applied to. This works fine when the vector of column names is written out directly on the left-hand side of the := operator, but not when it is declared earlier (or passed into the function). The following code shows the issue.

library(data.table)

dt = data.table(a = letters, b = 1:2, c = 1:13)
colsToDelete = c('b', 'c')
dt[, colsToDelete := NULL] # doesn't work but I don't understand why not.
dt[, c('b', 'c') := NULL] # works fine, but doesn't allow passing in of columns

The error is "Adding new column 'colsToDelete' then assigning NULL (deleting it)." So clearly, it's interpreting 'colsToDelete' as a new column name.

The same issue occurs when doing something along these lines:

dt[, colNames := lapply(.SD, adjustValue, y=factor), .SDcols = colNames]

I'm new to R, but rather more experienced with some other languages, so this may be a silly question.

Recommended answer

It's basically because we allow symbols on the LHS of := to add new columns, for convenience, e.g. DT[, col := val]. So, in order to distinguish col itself being the column name from whatever is stored in col being the column names, we check whether the LHS is a name or an expression.

If it's a name, the column is added with that name as written on the LHS; if it's an expression, it gets evaluated and the result is used as the column names.

DT[, col := val] # col is the column name.

DT[, (col) := val]  # col gets evaluated and replaced with its value
DT[, c(col) := val] # same as above
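The name-versus-expression distinction can be seen in base R alone. This is only a sketch of the idea, not data.table's actual internal check; the variable names lhs_bare, lhs_paren and lhs_c are made up for the illustration.

```r
# Illustration only: how R classifies the three LHS forms (no data.table needed).
lhs_bare  <- quote(col)     # a bare symbol
lhs_paren <- quote((col))   # a call to the `(` function
lhs_c     <- quote(c(col))  # a call to c()

is.name(lhs_bare)   # TRUE: treated as a literal column name
is.call(lhs_paren)  # TRUE: an expression, so it can be evaluated
is.call(lhs_c)      # TRUE: likewise

col <- c("b", "c")
eval(lhs_paren)     # evaluating the expression yields the stored names: "b" "c"
```

Wrapping the variable in parentheses changes nothing about its value; it only turns the LHS from a name into an expression.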

The preferred idiom is: dt[, (colsToDelete) := NULL]
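Applied to the question, this might look like the sketch below. The helper name drop_cols, the toy table dt2, and the body of adjustValue are assumptions made up for illustration (the question does not show adjustValue); only the (cols) := idiom comes from the answer.

```r
library(data.table)

# Hypothetical helper: deletes the named columns by reference.
drop_cols <- function(dt, cols) {
  cols <- intersect(cols, names(dt))           # ignore names that are not present
  if (length(cols) > 0L) dt[, (cols) := NULL]  # parentheses force evaluation of cols
  invisible(dt)
}

dt <- data.table(a = letters, b = 1:2, c = 1:13)
colsToDelete <- c("b", "c")
drop_cols(dt, colsToDelete)
names(dt)  # "a"

# The same idiom for the second snippet in the question.
# adjustValue is a stand-in; the real function is not shown in the question.
adjustValue <- function(x, y) x * y
dt2 <- data.table(a = letters[1:3], b = 1:3, c = 4:6)
colNames <- c("b", "c")
dt2[, (colNames) := lapply(.SD, adjustValue, y = 2), .SDcols = colNames]
```

Because := modifies by reference, the caller's table is changed in place and there is no need to reassign the result of drop_cols.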

HTH
