R - Replace a double loop by a function from the apply family
Question
I have these loops:
xall = data.frame()
for (k in 1:nrow(VectClasses))
{
for (i in 1:nrow(VectIndVar))
{
xall[i,k] = sum(VectClasses[k,] == VectIndVar[i,])
}
}
Data:
VectClasses = data frame containing the characteristics of each class
VectIndVar = data frame containing each record of the database
The two for loops work and give output I can use, but they take too long, hence my need for the apply family.
The output I am looking for looks like this:
V1 V2 V3 V4
1 3 3 2 2
2 2 2 1 1
3 3 4 3 3
4 3 4 3 3
5 4 4 3 3
6 3 2 3 3
I tried using:
xball = data.frame()
xball = sapply(xball, function (i,k){
sum(VectClasses[k,] == VectIndVar[i,])})
xcall = data.frame()
xcall = lapply(xcall, function (i, k){sum(VectClasses[k,] == VectIndVar[i,])} )
but neither seems to fill the data frame.
Reproducible data (shortened):
VectIndVar <- data.frame(a=sample(letters[1:5], 100, rep=T), b=floor(runif(100)*25),
c = sample(c(1:5), 100, rep=T),
d=sample(c(1:2), 100, rep=T))
and:
K1 = 4
VectClasses= VectIndVar [sample(1:nrow(VectIndVar ), K1, replace=FALSE), ]
Can you help me?
Recommended answer
I would use outer instead of *apply:
res <- outer(
1:nrow(VectIndVar),
1:nrow(VectClasses),
Vectorize(function(i,k) sum(VectIndVar[i,-1]==VectClasses[k,-1]))
)
(Thanks to this Q&A for clarifying the need for Vectorize.)
This gives
> head(res) # with set.seed(1) before creating the data
[,1] [,2] [,3] [,4]
[1,] 1 1 2 1
[2,] 0 0 1 0
[3,] 0 0 0 0
[4,] 0 0 1 0
[5,] 1 0 0 1
[6,] 1 1 1 1
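Why Vectorize is needed can be seen on a toy case. The sketch below is only an illustration: the function f and the small index ranges are made up for the demo and are not part of the original data. outer calls its function once with two long, expanded index vectors rather than once per (i, k) pair, so a function that collapses its input with sum returns a single number and outer cannot reshape it into a matrix.

```r
# Hypothetical toy function: like the real one, it collapses its input with sum()
f <- function(i, k) sum(c(i, k))

# outer() passes all six (i, k) combinations to f in a single call,
# so f returns one number where six are expected and outer() raises an error
res_err <- tryCatch(
  outer(1:3, 1:2, f),
  error = function(e) conditionMessage(e)
)

# Vectorize() wraps f so that it is applied to each (i, k) pair separately
vf <- Vectorize(f)
res_ok <- outer(1:3, 1:2, vf)   # 3 x 2 matrix, entry [i, k] is i + k
```

Note that Vectorize is built on mapply, so the wrapped function is still called once per pair; it makes the outer call valid but does not by itself make the computation faster.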
As for speed, I would suggest using matrices instead of data.frames:
cmat <- as.matrix(VectClasses[-1]); rownames(cmat)<-VectClasses$a
imat <- as.matrix(VectIndVar[-1]); rownames(imat)<-VectIndVar$a
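One possible way to use such matrices is sketched below. This is an assumption about how the matrix approach could be finished, not code from the original answer: it rebuilds the example data with set.seed(1), drops the first column as above, and compares every record with one class at a time using sapply and colSums. The names timat, res_mat and res_outer are introduced here for illustration only.

```r
# Rebuild the example data so the sketch is self-contained
set.seed(1)
VectIndVar <- data.frame(
  a = sample(letters[1:5], 100, replace = TRUE),
  b = floor(runif(100) * 25),
  c = sample(1:5, 100, replace = TRUE),
  d = sample(1:2, 100, replace = TRUE)
)
K1 <- 4
VectClasses <- VectIndVar[sample(seq_len(nrow(VectIndVar)), K1), ]

# Numeric matrices without the first (character) column
cmat <- as.matrix(VectClasses[-1])
imat <- as.matrix(VectIndVar[-1])

# Transpose once so each column is a record; comparing against a class
# vector then recycles that vector down every column
timat <- t(imat)
res_mat <- sapply(seq_len(nrow(cmat)), function(k) colSums(timat == cmat[k, ]))
res_mat <- unname(res_mat)   # records in rows, classes in columns

# Cross-check against the outer() version on the data frames
res_outer <- outer(
  seq_len(nrow(VectIndVar)),
  seq_len(nrow(VectClasses)),
  Vectorize(function(i, k) sum(VectIndVar[i, -1] == VectClasses[k, -1]))
)
stopifnot(all(res_mat == res_outer))
```

Here each comparison is a single vectorized operation over all records, so the inner work runs once per class instead of once per (record, class) pair. Whether this is fast enough for the full data set would need to be measured.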