R data.table multi column recode/sub-assign

Views: 154 · Published: 2017/3/12 11:10:45 · Tags: r, data.table, na, multiple-columns, recode


Question

Let DT be a data.table:

DT <- data.table(V1=sample(10),
                 V2=sample(10),
                 ...
                 V9=sample(10))


Is there a better or simpler way to do a multi-column recode/sub-assign like this:

DT[V1==1 | V1==7,V1:=NA]
DT[V2==1 | V2==7,V2:=NA]
DT[V3==1 | V3==7,V3:=NA]
DT[V4==1 | V4==7,V4:=NA]
DT[V5==1 | V5==7,V5:=NA]
DT[V6==1 | V6==7,V6:=NA]
DT[V7==1 | V7==7,V7:=NA]
DT[V8==1 | V8==7,V8:=NA]
DT[V9==1 | V9==7,V9:=NA]

Variable names are completely arbitrary and do not necessarily contain numbers. There are many columns (Vx:Vx) and a single recode pattern for all of them (NAME==1 | NAME==7, NAME:=something).

And further, how can NAs be sub-assigned to something else across multiple columns? E.g. in data.frame style:

data[,columns][is.na(data[,columns])] <- a_value

Recommended answer

You could use set to replace values in multiple columns. According to ?set, it is fast because the overhead of [.data.table is avoided. We use a for loop over the columns and, in each column, replace the values at the rows given by 'i' (in the column given by 'j') with NA.

for (j in seq_along(DT)) {
  set(DT, i = which(DT[[j]] %in% c(1, 7)), j = j, value = NA)
}
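Since the question says the column names are arbitrary and only some columns may need recoding, here is a minimal sketch of the same loop driven by a character vector of names instead of positions. The column names alpha, beta, gamma and the vector cols are hypothetical and chosen only for illustration; set accepts column names as well as numbers for j.

```r
library(data.table)

set.seed(24)
# Hypothetical table with non-numbered column names
DT <- data.table(alpha = sample(10), beta = sample(10), gamma = sample(10))

# Hypothetical subset of columns to recode; beta is left untouched
cols <- c("alpha", "gamma")

for (col in cols) {
  # Rows where this column equals 1 or 7 are set to NA, by reference
  set(DT, i = which(DT[[col]] %in% c(1, 7)), j = col, value = NA)
}

DT
```

Because set modifies DT by reference, no reassignment of DT is needed after the loop.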

Included @David Arenburg's comments.
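For comparison, the same recode can also be written as a single := assignment over .SD. This is not the answer's method, only an alternative sketch; the vector cols and the anonymous helper function are assumptions made for illustration. It goes through [.data.table, so it does not have the speed advantage the answer attributes to set.

```r
library(data.table)

set.seed(24)
DT <- data.table(V1 = sample(10), V2 = sample(10), V3 = sample(10))

# Hypothetical subset of columns to recode; V3 is left untouched
cols <- c("V1", "V2")

# Apply one recode pattern to every column listed in .SDcols
DT[, (cols) := lapply(.SD, function(x) replace(x, x %in% c(1, 7), NA)),
   .SDcols = cols]

DT
```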

set.seed(24)
DT<-data.table(V1=sample(10), V2= sample(10), V3= sample(10))
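The second part of the question (replacing NAs with another value across several columns) can be handled with the same set loop. The sketch below assumes the sample data above; a_value and columns mirror the names in the question's data.frame snippet, but their contents here (0L and the first two columns) are hypothetical.

```r
library(data.table)

set.seed(24)
DT <- data.table(V1 = sample(10), V2 = sample(10), V3 = sample(10))

# Step 1: recode 1 and 7 to NA in every column, as in the answer
for (j in seq_along(DT)) {
  set(DT, i = which(DT[[j]] %in% c(1, 7)), j = j, value = NA)
}

# Step 2: replace NA with another value in selected columns only
a_value <- 0L             # hypothetical replacement value
columns <- c("V1", "V2")  # hypothetical subset of columns

for (col in columns) {
  set(DT, i = which(is.na(DT[[col]])), j = col, value = a_value)
}

DT
```

An integer replacement (0L) is used here so that it matches the integer columns produced by sample(10).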
