How to create a single dummy variable with conditions in multiple columns?

Question

I am trying to efficiently create a binary dummy variable (1/0) in my data set based on whether one or more of 7 variables (col9 to col15) take on a specific value (35). I don't want to test all columns, only those seven.

as.numeric is usually ideal for this, but I can only get it to work with one column at a time:

data$indicator <- as.numeric(data$col1 == 35)

Any idea how I can modify the above code so that my indicator variable takes the value 1 if any of data$col9 through data$col15 equals 35?

Thanks!

Recommended answer

You can use rowSums (a vectorized solution) like this:

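set.seed(123)
# 20 x 15 matrix of random values; 35 appears twice in the pool, so it is drawn slightly more often
dat <- matrix(sample(c(35, 1:100), size = 15 * 20, replace = TRUE), ncol = 15, byrow = TRUE)
# Append a 16th column: 1 if any of columns 9 to 15 equals 35, otherwise 0
cbind(dat, rowSums(dat[, 9:15] == 35) > 0)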
   [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14] [,15] [,16]
 [1,]   29   79   41   89   94    4   53   90   55    46    96    45    68    57    10     0
 [2,]   90   24    4   33   96   89   69   64  100    66    71    54    60    29    14     0
 [3,]   97   91   69   80    2   48   76   21   32    23    14    41    41    37    15     0
 [4,]   14   23   47   26   86    4   44   80   12    56    20    12    76    90    37     0
 [5,]   67    9   38   27   82   45   81   82   80    44    76    63    71    35    48     1
 [6,]   22   38   61   35   11   24   67   42   79    10    43    99    90    89    17     0
 [7,]   13   65   34   66   32   18   79    9   47    51    60    33    49    96    48     0
 [8,]   89   92   61   41   14   94   30    6   95    72    14    55    96    59    40     0
 [9,]   65   32   31   22   37   99   15    9   14    69    62    90    67    74    52     0
[10,]   66   83   79   98   44   31   41    1   18    85    23    24     7    24    73     0
[11,]   85   50   39   24   11   39   57   21   44    22    50    35    65    37    35     1
[12,]   53   74   22   41   26   63   18   87   75    67    62    37    53    88    58     0
[13,]   84   31   71   26   60   48   26   57   92    91    27    32    99    62    94     0
[14,]   47   41   66   15   57   24   97   60   52    40    88    36    29    17    17     0
[15,]   48   25   21   68    4   70   35   41   82    92    28    97    73    69     5     0
[16,]   39   48   56   70   92   62   43   54    5    26    40    19    84    15    81     0
[17,]   55   66   17   63   31   73   40   97   97    73    25    22    59    27    53     0
[18,]   79   16   40   47   87   93   89   68   95    52    58    33    35     2    50     1
[19,]   87   35    7   16   77   74   98   47    7    65    76    13    40    22     5     0
[20,]   39    6   22    5   67   30   10    7   88    76    82    99    10    10    80     0
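
To make the mechanism explicit, here is a minimal sketch on a hypothetical 3-row matrix `m` (the matrix and the variable names are illustrative, not part of the original data). The comparison `m == 35` produces a logical matrix, and rowSums counts the TRUE values in each row, so a count above zero means at least one column matched.

```r
# Hypothetical toy matrix, used only for illustration
m <- matrix(c(35,  2,  3,
               4,  5,  6,
               7, 35, 35),
            ncol = 3, byrow = TRUE)

hits   <- m == 35         # logical matrix with the same shape as m
counts <- rowSums(hits)   # TRUE counts as 1, FALSE as 0: 1, 0, 2
indicator <- as.integer(counts > 0)   # 1, 0, 1
```

Row 3 contains two matches but still yields a single 1, because the test is only whether the count exceeds zero.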

EDIT

I replaced cbind with transform. Since the new column would otherwise be logical, I coerce it to get 0/1.

 transform(dat,x=as.numeric((rowSums(dat[,9:15] == 35) > 0)))

The result is a data.frame (transform coerces the matrix to a data.frame).

EDIT2 (as suggested by @flodel)

data$indicator <- as.integer(rowSums(data[paste0("col", 9:15)] == 35) > 0)

where data is the OP's data.frame.
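
One caveat that the thread does not cover: if any of the tested columns contains NA, the comparison yields NA and rowSums returns NA for that row. The sketch below assumes a hypothetical data frame `df` with columns col9 to col15 and shows one way to treat missing values as non-matches by passing `na.rm = TRUE`. Whether that is the right choice depends on what a missing value means in your data.

```r
# Hypothetical data frame with columns col9..col15 (values 1..21, none equal to 35)
df <- as.data.frame(matrix(1:21, nrow = 3))
names(df) <- paste0("col", 9:15)
df$col12[2] <- 35L   # row 2 now contains a match
df$col9[3]  <- NA    # row 3 has a missing value and no match

cols <- paste0("col", 9:15)

# Without na.rm, the row containing NA gets an NA indicator: 0, 1, NA
strict <- as.integer(rowSums(df[cols] == 35) > 0)

# With na.rm = TRUE, missing values are ignored: 0, 1, 0
df$indicator <- as.integer(rowSums(df[cols] == 35, na.rm = TRUE) > 0)
```

Selecting the columns by name with `paste0("col", 9:15)` keeps the code correct even if the column order in the data frame changes.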
