Fast way of getting index of match in list


Question

Given a list a containing vectors of unequal length and a vector b containing some elements from the vectors in a, I want to get a vector of the same length as b that gives, for each element of b, the index of the vector in a that contains it (this is a bad explanation, I know)...

The following code does the job:

a <- list(1:3, 4:5, 6:9)
b <- c(2, 3, 5, 8)

sapply(b, function(x, list) which(unlist(lapply(list, function(y, z) z %in% y, z=x))), list=a)
[1] 1 1 2 3

Replacing the sapply with a for loop achieves the same result, of course.
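The loop variant is not shown in the post. A minimal sketch of what it might look like, assuming each element of b occurs in at most one vector of a (the names res and hit are illustrative only):

```r
a <- list(1:3, 4:5, 6:9)
b <- c(2, 3, 5, 8)

# For each element of b, test every vector in a for membership
# and record the index of the first vector that contains it.
res <- integer(length(b))
for (i in seq_along(b)) {
  # One TRUE/FALSE per vector in a
  hit <- vapply(a, function(y) b[i] %in% y, logical(1))
  # which(hit)[1] is NA when no vector contains b[i]
  res[i] <- which(hit)[1]
}
res
```

Like the sapply version, this scans every vector in a once per element of b, so it does the same amount of work.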

The problem is that this code will be used with lists and vectors of length above 1000. On a real-life data set the function takes around 15 seconds (both the for loop and the sapply version).

Does anyone have an idea how to speed this up, other than with a parallel approach? I have failed to find a vectorized approach (and I cannot program in C, though that would probably be the fastest).

Edit:

I will just emphasize Aaron's elegant solution using match(), which gave a speed increase on the order of 1667 times (from 15 to 0.009 seconds).

I expanded on it a bit to allow multiple matches (the return value is then a list):

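a <- list(1:3, 3:5, 3:7)
b <- c(3, 5)
# g[i] is the index in a of the vector that the i-th flattened element came from
g <- rep(seq_along(a), sapply(a, length))
# lapply always returns a list; sapply would simplify the result
# whenever every element of b had the same number of matches
lapply(b, function(x) g[which(unlist(a) %in% x)])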
[[1]]
[1] 1 2 3

[[2]]
[1] 2 3

The runtime for this was 0.169, which is considerably slower, but on the other hand it is more flexible.

Recommended answer

Here is one approach using match:

> a <- list(1:3, 4:5, 6:9)
> b <- c(2, 3, 5, 8)
> g <- rep(seq_along(a), sapply(a, length))
> g[match(b, unlist(a))]
[1] 1 1 2 3
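To make the steps explicit, here is the same idea as a small sketch with the intermediate values spelled out. The function name index_of_match and the variables flat and pos are hypothetical, introduced only for illustration; lengths(a) is assumed to be available (R 3.2.0 or later) and is equivalent to sapply(a, length).

```r
# Hypothetical helper wrapping the match-based lookup
index_of_match <- function(b, a) {
  flat <- unlist(a)                    # all elements of a in one vector
  g <- rep(seq_along(a), lengths(a))   # list index of each flattened element
  pos <- match(b, flat)                # first position of each b in flat (NA if absent)
  g[pos]                               # translate positions back to list indices
}

a <- list(1:3, 4:5, 6:9)
b <- c(2, 3, 5, 8)
# flat is 1 2 3 4 5 6 7 8 9, g is 1 1 1 2 2 3 3 3 3, pos is 2 3 5 8
index_of_match(b, a)
```

Because match returns only the first position of each value, this reports a single index per element of b. The speedup comes from flattening a once instead of scanning it for every element of b.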

findInterval is another option:

> findInterval(match(b, unlist(a)), cumsum(c(0,sapply(a, length)))+1)
[1] 1 1 2 3
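A sketch of why this works, with the one-liner broken into steps (the names pos, starts and idx are illustrative, and lengths(a) again stands in for sapply(a, length)):

```r
a <- list(1:3, 4:5, 6:9)
b <- c(2, 3, 5, 8)

# Position of each element of b in the flattened vector: 2 3 5 8
pos <- match(b, unlist(a))

# Position at which each vector of a starts in the flattened vector,
# plus one position past the end: 1 4 6 10
starts <- cumsum(c(0, lengths(a))) + 1

# findInterval reports which [start, next start) interval each position
# falls into, which is the index of the containing vector
idx <- findInterval(pos, starts)
idx
```

This avoids building the full index vector g and instead looks up each position among the breakpoints.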

To return a list (so that an element can match more than one vector), try this:

a <- list(1:3, 4:5, 5:9)
b <- c(2,3,5,8,5)
g <- rep(seq_along(a), sapply(a, length))
aa <- unlist(a)
au <- unique(aa)
af <- factor(aa, levels=au)
gg <- split(g, af)
gg[match(b, au)]
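As a sketch, the same steps can be wrapped in a function to see what comes back. The name indices_of_matches is hypothetical, and unname is applied here only to drop the factor-level names from the result:

```r
# Hypothetical helper wrapping the split-based lookup
indices_of_matches <- function(b, a) {
  g <- rep(seq_along(a), lengths(a))  # list index of each flattened element
  aa <- unlist(a)                     # flattened values
  au <- unique(aa)                    # each distinct value once
  af <- factor(aa, levels = au)       # group flattened values by distinct value
  gg <- split(g, af)                  # for each distinct value, all list indices holding it
  gg[match(b, au)]                    # pick the groups for the requested values
}

a <- list(1:3, 4:5, 5:9)
b <- c(2, 3, 5, 8, 5)
res <- unname(indices_of_matches(b, a))
res
```

In this example the value 5 occurs in both the second and the third vector of a, so its entries in res are c(2, 3), while 2, 3 and 8 each map to a single index. The grouping is computed once with split, so repeated values in b reuse the same group.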
