How do you delete a column by name in data.table?


Question


To get rid of a column named "foo" in a data.frame, I can do:

df <- df[-grep('foo', colnames(df))]

However, once df is converted to a data.table object, the same approach no longer removes the column.

Example:

library(data.table)
df <- data.frame(id = 1:100, foo = rnorm(100))
df2 <- df[-grep('foo', colnames(df))]  # works: drops column foo
df3 <- data.table(df)
df3[-grep('foo', colnames(df3))]       # drops row 2; column foo is still there

But once it is converted to a data.table object, this no longer works.
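
To make the symptom concrete, here is a small check of what the last call in the example returns. It is only a sketch, assuming data.table is installed; `pos` and `res` are names introduced here for illustration.

```r
library(data.table)

df3 <- data.table(id = 1:100, foo = rnorm(100))

# grep() finds "foo" at position 2 of the column names
pos <- grep("foo", colnames(df3))

# A single index inside [ ] is interpreted as i (rows), so -pos drops row 2
res <- df3[-pos]

dim(res)     # 99 rows, still 2 columns
names(res)   # "id" "foo"
```

So the call does not fail with an error; it silently removes a row and leaves the foo column in place.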

Solution

Any of the following will remove column foo from the data.table df3:

# Method 1 (preferred: := drops the column by reference, without copying,
# so it takes 0.00s even on a 20GB data.table)
df3[, foo := NULL]

df3[, c("foo", "bar") := NULL]  # remove two columns

myVar <- "foo"
df3[, (myVar) := NULL]   # parentheses make data.table look up myVar's contents

# Method 2a -- A safe idiom for excluding (possibly multiple)
# columns matching a regex
df3[, grep("^foo$", colnames(df3)) := NULL]

# Method 2b -- An alternative to 2a, also "safe" in the sense described below
df3[, which(grepl("^foo$", colnames(df3))) := NULL]
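
As a rough illustration of what "by reference" means for Method 1, the sketch below binds a second name to the same data.table before deleting a column. The toy data and the name `alias` are assumptions made for this example only.

```r
library(data.table)

df3 <- data.table(id = 1:3, foo = c(0.1, 0.2, 0.3), bar = letters[1:3])

# A second name bound to the same data.table (<- does not copy it)
alias <- df3

# Method 1 with the column name held in a variable
myVar <- "foo"
df3[, (myVar) := NULL]

names(df3)    # "id" "bar"
names(alias)  # also "id" "bar": the column was removed in place
```

If you need the original to survive, take an explicit `copy(df3)` before deleting.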

data.table also supports the following syntax:

# Method 3 (returns a copy without foo, which could then be assigned back to df3)
df3[, !"foo", with = FALSE]

If you actually want to remove column "foo" from df3 (as opposed to just printing a view of df3 minus column "foo"), use Method 1 instead.
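
The difference between Method 3 and Method 1 can be seen in a short sketch like the following. The toy data and the name `view` are assumptions for illustration.

```r
library(data.table)

df3 <- data.table(id = 1:3, foo = c(0.1, 0.2, 0.3))

# Method 3 returns a new data.table without foo
view <- df3[, !"foo", with = FALSE]

names(view)  # "id"
names(df3)   # "id" "foo": the original is untouched

# Assigning the result back replaces df3 with the reduced copy
df3 <- df3[, !"foo", with = FALSE]
names(df3)   # "id"
```

The reassignment gives the same column set as Method 1, but it builds a new table instead of editing the existing one.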

(Do note that if you use a method relying on grep() or grepl(), you need to set pattern="^foo$" rather than "foo", if you don't want columns with names like "fool" and "buffoon" (i.e. those containing foo as a substring) to also be matched and removed.)
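
A quick way to convince yourself of the anchoring point is to run both patterns against a vector of names. This uses base R only; the vector `cols` and the result names are made up for the example.

```r
cols <- c("id", "foo", "fool", "buffoon")

# Unanchored pattern: matches every name containing "foo"
loose <- cols[grep("foo", cols)]
loose    # "foo" "fool" "buffoon"

# Anchored pattern: matches only the exact name
exact <- cols[grep("^foo$", cols)]
exact    # "foo"

# grepl() gives the same answer as a logical vector
flags <- grepl("^foo$", cols)
flags    # FALSE TRUE FALSE FALSE
```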

Less safe options, fine for interactive use:

The next two idioms will also work -- if df3 contains a column matching "foo" -- but will fail in a probably-unexpected way if it does not. If, for instance, you use any of them to search for the non-existent column "bar", you'll end up with a zero-row data.table.

As a consequence, they are really best suited for interactive use where one might, e.g., want to display a data.table minus any columns with names containing the substring "foo". For programming purposes (or if you are wanting to actually remove the column(s) from df3 rather than from a copy of it), Methods 1, 2a, and 2b are really the best options.

# Method 4a:
df3[, -grep("^foo$", colnames(df3)), with=FALSE]

# Method 4b: 
df3[, !grepl("^foo$", colnames(df3)), with=FALSE]
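
For programmatic use where a column may or may not be present, one defensive variant of Method 1 is sketched below. This is not part of the original answer: `drop_cols` is a hypothetical helper, and it simply restricts the deletion to names that exist.

```r
library(data.table)

# Hypothetical helper: drop the named columns that are present, ignore the rest
drop_cols <- function(dt, cols) {
  present <- intersect(cols, names(dt))
  if (length(present) > 0L) {
    dt[, (present) := NULL]   # Method 1, by reference
  }
  invisible(dt)
}

df3 <- data.table(id = 1:3, foo = c(0.1, 0.2, 0.3))

drop_cols(df3, c("foo", "bar"))  # "bar" does not exist and is skipped
names(df3)   # "id"
nrow(df3)    # 3
```

Because `:=` works by reference, the helper changes the caller's df3 directly and no reassignment is needed.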
