Vectorized "in" function in Julia?

Problem description

I often want to loop over a long array or column of a dataframe, and for each item, see if it is a member of another array. Rather than doing

giant_list = ["a", "c", "j"]
good_letters = ["a", "b"]
isin = falses(size(giant_list,1))
for i=1:size(giant_list,1)
    isin[i] = giant_list[i] in good_letters
end

Is there any vectorized (doubly-vectorized?) way to do this in Julia? By analogy with the dotted basic operators, I want to do something like

isin = giant_list .in good_letters

I realize this may not be possible, but I just wanted to make sure I wasn't missing something. I know I could probably use DefaultDict from DataStructures to do something similar, but I don't know of anything in Base.

Solution

The indexin function does something similar to what you want:

indexin(a, b)

Returns a vector containing the highest index in b for each value in a that is a member of b. The output vector contains 0 wherever a is not a member of b.

Since you want a boolean for each element in your giant_list (instead of the index in good_letters), you can simply do:

julia> indexin(giant_list, good_letters) .> 0
3-element BitArray{1}:
  true
 false
 false
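
The `.> 0` comparison relies on indexin returning 0 for non-members, which was the behavior when this answer was written. In Julia 1.0 and later, indexin returns `nothing` at those positions instead, so comparing against 0 no longer works there. A minimal sketch of the equivalent check, assuming Julia 1.x:

```julia
giant_list = ["a", "c", "j"]
good_letters = ["a", "b"]

# Assumption: Julia 1.x, where indexin yields `nothing` for values not found in b.
idxs = indexin(giant_list, good_letters)

# An element is a member exactly when its index is not `nothing`.
isin = [idx !== nothing for idx in idxs]
```

Here `isin` is a Bool vector with one entry per element of giant_list.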

The implementation of indexin is very straightforward, and points the way to how you might optimize this if you don't care about the indices in b:

function vectorin(a, b)
    bset = Set(b)
    [i in bset for i in a]
end
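
As a usage sketch, here is the function applied to the arrays from the question. The definition is repeated only so the snippet runs on its own; the variable name `x` is a cosmetic change.

```julia
function vectorin(a, b)
    bset = Set(b)                    # build the hash-based lookup set once
    return [x in bset for x in a]    # one membership test per element of a
end

giant_list = ["a", "c", "j"]
good_letters = ["a", "b"]

isin = vectorin(giant_list, good_letters)
```

Converting b to a Set first means each membership test is a hash lookup rather than a scan of the whole array, which is where the gain comes from when b is long.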

Only a limited set of names may be used as infix operators, so a user-defined function such as vectorin cannot be called with infix syntax.
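
For readers on a recent Julia, the built-in membership operator can itself be broadcast, which comes close to the `.in` syntax asked about. The sketch below assumes Julia 1.0 or later; wrapping the collection in `Ref` makes broadcasting treat it as a single value instead of iterating over it element by element.

```julia
giant_list = ["a", "c", "j"]
good_letters = ["a", "b"]

# Function-call form: broadcast `in` over giant_list only.
isin_call = in.(giant_list, Ref(good_letters))

# Infix form using the ∈ operator (typed as \in followed by Tab in the REPL).
isin_infix = giant_list .∈ Ref(good_letters)

# For a long lookup collection, broadcasting against a Set avoids rescanning it.
isin_set = in.(giant_list, Ref(Set(good_letters)))
```

All three produce one boolean per element of giant_list.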
