How can I code this indicator matrix without using a loop in R


Question

I have a vector of factor levels, compare_comp. The same levels also appear in two separate data sets, test_set and train_set. The following code finds, for each level, the rows of a data set whose Compound matches it and puts a 1 in the corresponding column of a matrix. Conceptually, row j of compound_test has its 1 in the column of compare_comp that matches test_set$Compound[j].

test_set <- data.frame(Compound=letters[sample(1:3,10,replace = TRUE)])
train_set <- data.frame(Compound=letters[sample(1:3,10,replace = TRUE)])

compare_comp <- letters[1:3]
compound_test <- matrix(0, nrow(test_set), length(compare_comp))   # test indicator matrix
compound_train <- matrix(0, nrow(train_set), length(compare_comp)) # train indicator matrix

# Column i gets a 1 in every row whose Compound equals compare_comp[i]
for (i in seq_along(compare_comp)) {
  compound_test[which(compare_comp[i] == test_set$Compound), i] <- 1
  compound_train[which(compare_comp[i] == train_set$Compound), i] <- 1
}

Is there a function in R that creates the same matrices without a for loop? I have tried model.matrix(~Compound, data=test_set), but it leaves out one column because of the reference level and also produces unwanted column names.

Recommended answer

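A simpler option is model.matrix from base R. Removing the intercept with -1 gives every level its own column instead of dropping the reference level.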

model.matrix(~ Compound-1, train_set)
model.matrix(~ Compound-1, test_set)
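
The calls above still name their columns by pasting the variable name onto the level (for example Compounda), and a level that happens to be missing from a sample gets no column at all. A minimal sketch of one way to get the same bare 0/1 matrix as the loop, assuming the wanted columns are exactly compare_comp from the question; the helper name indicator_matrix is hypothetical and not part of the original answer:

```r
# Hypothetical helper: one column per entry of `levels`, in that order.
# Values of x that are not in `levels` become NA, and model.matrix then
# drops those rows (default na.action), unlike the loop in the question.
indicator_matrix <- function(x, levels) {
  f <- factor(x, levels = levels)   # fix the set and order of columns
  m <- model.matrix(~ f - 1)        # -1 removes the intercept, so no reference level
  dimnames(m) <- NULL               # strip the generated row and column names
  attr(m, "assign") <- NULL         # drop model.matrix bookkeeping attributes
  attr(m, "contrasts") <- NULL
  m
}

set.seed(1)  # only to make the example reproducible
compare_comp <- letters[1:3]
test_set <- data.frame(Compound = letters[sample(1:3, 10, replace = TRUE)])

compound_test <- indicator_matrix(test_set$Compound, compare_comp)
```

Fixing the levels up front is a design choice: it keeps the test and train matrices column-compatible even when one of the samples does not contain every compound.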


Alternatively, table can be used if we cbind the data with a sequence of row numbers

table(cbind(nr = seq_len(nrow(train_set)), train_set))
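
The result of table is an object of class "table" that carries dimnames and holds integer counts. A rough sketch of turning it into a bare matrix like the one built by the loop, again assuming the columns should follow compare_comp from the question (passing the two vectors to table directly instead of cbind is a small variation on the answer, not the answer's own code):

```r
set.seed(1)  # only to make the example reproducible
compare_comp <- letters[1:3]
train_set <- data.frame(Compound = letters[sample(1:3, 10, replace = TRUE)])

# Cross-tabulate row number against compound; every row number occurs once,
# so each row of the table contains a single 1.
tab <- table(
  nr = seq_len(nrow(train_set)),
  Compound = factor(train_set$Compound, levels = compare_comp)
)

# matrix() drops the "table" class and the dimnames.
# Note the values are integers here, while the loop version stores doubles.
compound_train <- matrix(tab, nrow = nrow(tab), ncol = ncol(tab))
```

Because the row numbers are integers, table keeps them in numeric order, so the rows of compound_train line up with the rows of train_set.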
