python pandas: How to get the index of the values from a series that have matching values in other series?

Problem description

I have these two series:

In [48]: serie1
Out[48]: 
0    A
1    B
2    C
3    A
4    D
dtype: object

In [49]: serie2
Out[49]: 
0    X
1    Y
2    A
3    Z
4    A
5    D
dtype: object

For each value in serie1, I want to get the index (or indices) of the matching values in serie2. Is this possible without iterating over the values? One possible solution would be to build a DataFrame more or less like this:

       A      B      C      D
X    False  False  False  False                 
Y    False  False  False  False
A    True   False  False  False
Z    False  False  False  False
A    True   False  False  False
D    False  False  False  True

... and then get the index of the True values for each column.

Recommended answer

1) For the boolean table of matches: if you want a cross-tabulation (which shows only unique values, with no repetitions), build it with pd.crosstab and then convert it to boolean:

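import pandas as pd

serie1 = pd.Series(['A', 'B', 'C', 'A', 'D'])
serie2 = pd.Series(['X', 'Y', 'A', 'Z', 'A', 'D'])

# Count co-occurrences (rows: serie2 values, columns: serie1 values), then turn counts into booleans
pd.crosstab(serie2, serie1) > 0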

col_0      A      B      C      D
row_0                            
A      False  False   True   True
X       True  False  False  False
Y      False   True  False  False
Z       True  False  False  False

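(Note that the row index is automatically sorted by value, so it does not follow the order in which the values appear in serie2. You may be able to override that with .reindex(...). Also note that pd.crosstab pairs the two series by index label, so a True cell means that the two values occur at the same index, not that a value of serie1 equals a value of serie2. This is why the output above differs from the table sketched in the question.)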
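If the goal is the value-by-value table sketched in the question (one row per element of serie2, one column per unique value of serie1), one possible approach is a broadcast comparison with NumPy. This is a minimal sketch and not part of the original answer; the names cols, matches and positions are illustrative.

```python
import numpy as np
import pandas as pd

serie1 = pd.Series(['A', 'B', 'C', 'A', 'D'])
serie2 = pd.Series(['X', 'Y', 'A', 'Z', 'A', 'D'])

# Unique values of serie1 in order of first appearance (illustrative name)
cols = serie1.drop_duplicates().to_numpy(dtype=object)
rows = serie2.to_numpy(dtype=object)

# Compare every element of serie2 (rows) with every unique value of serie1 (columns)
matches = pd.DataFrame(rows[:, None] == cols[None, :], index=rows, columns=cols)
print(matches)

# Positions in serie2 of the True entries, per column
positions = {col: np.flatnonzero(matches[col].to_numpy()).tolist()
             for col in matches.columns}
print(positions)
```

The comparison builds a len(serie2) x len(cols) array in memory, so it is probably best suited to series of moderate size.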

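2) As for the indices of matches, to get them as a dict of arrays (note that groupby(serie1) groups by the serie1 value found at the same index label, so the arrays below are the positions at which each value occurs in serie1):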

serie2.groupby(serie1).indices

{'A': array([0, 3]), 'C': array([2]), 'B': array([1]), 'D': array([4])}

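# ... or as a list of arrays (list() is needed on Python 3, where .values() returns a view;
# the element order follows the dict)...
list(serie2.groupby(serie1).indices.values())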

[array([0, 3]), array([2]), array([1]), array([4])]

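# Here are alternatives with list comprehensions, which are probably less efficient than `Series.groupby()`.
# They compare by value, so they return the positions in serie2 that match each element of serie1.
>>> import numpy as np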
>>> [ np.flatnonzero(serie2.apply(lambda i2: i2==i1)) for i1 in serie1 ]
[array([2, 4]), array([], dtype=int64), array([], dtype=int64), array([2, 4]), array([5])]

>>> [ np.flatnonzero(serie2.apply(lambda i2: i2==i1)).tolist() for i1 in serie1 ]
[[2, 4], [], [], [2, 4], [5]]
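A variant that avoids re-scanning serie2 for every element of serie1 is to group serie2 by its own values once and then look up each serie1 value in the resulting dict. This is a sketch under the same example data, not the original answer's method; groups, empty and result are illustrative names.

```python
import numpy as np
import pandas as pd

serie1 = pd.Series(['A', 'B', 'C', 'A', 'D'])
serie2 = pd.Series(['X', 'Y', 'A', 'Z', 'A', 'D'])

# Map each distinct value of serie2 to the positions where it occurs
groups = serie2.groupby(serie2).indices

# Fallback for serie1 values that never appear in serie2
empty = np.array([], dtype=int)

# One dict lookup per element of serie1
result = [groups.get(value, empty).tolist() for value in serie1]
print(result)  # [[2, 4], [], [], [2, 4], [5]]
```

The lookup still loops over serie1, but each step is a dict access rather than a full pass over serie2.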
