如何找到包含向量B至少一个元素的单元格A的向量? [英] How to find the vectors of the cell A that contain at least one element of the vector B?
问题描述
如何查找A
的向量,其中至少包含向量B
的一个元素?
How to find the vectors of A
that contain at least one element of the vector B
?
示例:
A = {[2 5],[8 9 2],[33 77 4],[102 6],[10 66 17 7 8 11],[110 99],[1 4 3],[15 41 88]}
B = [5 77 41 66 7]
Result = {[2 5],[33 77 4],[10 66 17 7 8 11],[15 41 88]}
推荐答案
方法
带有arrayfun
和ismember
-
Result = A(arrayfun(@(n) any(ismember(B,A{n})),1:numel(A)))
或与arrayfun
和bsxfun
-
Result = A(arrayfun(@(n) any(any(bsxfun(@eq,B(:),A{n}),2)),1:numel(A)))
或与arrayfun
和setdiff
-
Result = A(arrayfun(@(n) numel(setdiff(B,A{n})) < numel(B),1:numel(A)))
或与arrayfun
和intersect
-
Result = A(arrayfun(@(n) ~isempty(intersect(B,A{n})),1:numel(A)))
这里也可以使用cellfun
,这样四个基于cellfun
的对应解决方案就可以像这样-
One could also use cellfun
here, such that the four counterpart cellfun
based solutions end up like these -
Result = A(cellfun(@(x) any(ismember(B,x)), A))
Result = A(cellfun(@(x) any(any(bsxfun(@eq,B(:),x),2)),A))
Result = A(cellfun(@(x) numel(setdiff(B,x)) < numel(B),A))
Result = A(cellfun(@(x) ~isempty(intersect(B,x)),A))
采用不同的路线[使用bsxfun
的屏蔽功能]
Taking different route [Using bsxfun
's masking capability]
通过将A
转换为2D数字数组,可以使解决方案 alot 矢量化,而不是使用本质上是循环方法的基于arrayfun
或cellfun
的方法.因此,这里的想法是要有一个2D
数组,其中行数是A
中的最大元素数,列数是A
中的单元格数.此数组的每一列将保存A
的每个单元格中的元素,而NaNs
将填充空白空间.
Rather than going into those arrayfun
or cellfun
based approaches that are essentially loopy approaches, one can make the solution alot vectorized by converting A
into a 2D numeric array. So, the idea here is to have a 2D
array in which number of rows is the maximum number of elements in A
and number of columns as number of cells in A
. Each column of this array would hold elements from each cell of A
and NaNs
would fill up the empty spaces.
采用这种方法的解决方案代码如下所示-
The solution code with such an approach would look like this -
lens = cellfun('length',A); %// number of elements in each cell of A
mask = bsxfun(@ge,lens,(1:max(lens))'); %//'# mask of valid places in the 2D array
A_arr = NaN(size(mask)); %//initialize 2D array in which A elements are to be put
A_arr(mask) = [A{:}]; %// put the elements from A
%// Find if any element from B is in any element along the row or dim-3
%// locations in A_arr. Then logically index into A with it for the final
%// cell array output
Result = A(any(any(bsxfun(@eq,A_arr,permute(B,[1 3 2])),1),3));
验证
>> celldisp(Result)
Result{1} =
2 5
Result{2} =
33 77 4
Result{3} =
10 66 17 7 8 11
Result{4} =
15 41 88
基准化
对于有兴趣查看运行时性能的人来说,这是一个快速的基准测试,具有足够大的数据大小-
Benchmarking
For people interested in seeing the runtime performances, here's a quick benchmarking test with a sufficiently huge datasize -
%// Create inputs
N = 10000; %// datasize
max_num_ele = 100; %// max elements in any cell of A
num_ele = randi(max_num_ele,N,1); %// number of elements in each cell of A
A = arrayfun(@(n) randperm(N,num_ele(n)), 1:N, 'uni', 0);
B = randperm(N,num_ele(1));
%// Warm up tic/toc.
for k = 1:100000
tic(); elapsed = toc();
end
%// Start timing all approaches
disp('************************ With arrayfun **************************')
disp('------------------------ With arrayfun + ismember')
tic
Result = A(arrayfun(@(n) any(ismember(B,A{n})),1:numel(A)));
toc, clear Result
disp('------------------------ With arrayfun + bsxfun')
tic
Result = A(arrayfun(@(n) any(any(bsxfun(@eq,B(:),A{n}),2)),1:numel(A)));
toc, clear Result
disp('------------------------ With arrayfun + setdiff')
tic
Result = A(arrayfun(@(n) numel(setdiff(B,A{n})) < numel(B),1:numel(A)));
toc, clear Result
disp('------------------------ With arrayfun + intersect')
tic
Result = A(arrayfun(@(n) ~isempty(intersect(B,A{n})),1:numel(A)));
toc, clear Result
disp('************************ With cellfun **************************')
disp('------------------------ With cellfun + ismember')
tic
Result = A(cellfun(@(x)any(ismember(B,x)), A));
toc, clear Result
disp('------------------------ With cellfun + bsxfun')
tic
Result = A(cellfun(@(x) any(any(bsxfun(@eq,B(:),x),2)),A));
toc, clear Result
disp('------------------------ With cellfun + setdiff')
tic
Result = A(cellfun(@(x) numel(setdiff(B,x)) < numel(B),A));
toc, clear Result
disp('------------------------ With cellfun + setdiff')
tic
Result = A(cellfun(@(x) ~isempty(intersect(B,x)),A));
disp('************************ With masking bsxfun **************************')
tic
lens = cellfun('length',A); %// number of elements in each cell of A
mask = bsxfun(@ge,lens,(1:max(lens))'); %//'
A_numarr = NaN(size(mask));
A_numarr(mask) = [A{:}];
Result = A(any(any(bsxfun(@eq,A_numarr,permute(B,[1 3 2])),1),3));
toc
在我的系统上这样获得的结果是-
The results thus obtained on my system were -
************************ With arrayfun **************************
------------------------ With arrayfun + ismember
Elapsed time is 0.409810 seconds.
------------------------ With arrayfun + bsxfun
Elapsed time is 0.157327 seconds.
------------------------ With arrayfun + setdiff
Elapsed time is 1.154602 seconds.
------------------------ With arrayfun + intersect
Elapsed time is 1.081729 seconds.
************************ With cellfun **************************
------------------------ With cellfun + ismember
Elapsed time is 0.392375 seconds.
------------------------ With cellfun + bsxfun
Elapsed time is 0.143341 seconds.
------------------------ With cellfun + setdiff
Elapsed time is 1.101331 seconds.
------------------------ With cellfun + setdiff
************************ With masking bsxfun ********************
Elapsed time is 0.067224 seconds.
正如人们所看到的,基于cellfun
的解决方案比基于arrayfun
的解决方案要快一些!同样,基于遮罩的bsxfun
方法看起来很有趣,但是请记住,这是饥饿的自然之物.
As one can see, cellfun
based solutions are wee bit faster than their arrayfun
based counterparts! Also, mask-based bsxfun
approach looks like an interesting one, but do keep in mind it's memory-hungry nature.
这篇关于如何找到包含向量B至少一个元素的单元格A的向量?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!