How to select elements from subsequent numpy arrays stored in a pandas Series


Problem description

I have a Series of numpy arrays:

import pandas as pd
import numpy as np
# Bound to the name s, which the answer below relies on
s = pd.Series({10: np.array([[0.72260683, 0.27739317, 0.        ],
                             [0.7187053 , 0.2812947 , 0.        ],
                             [0.71435467, 0.28564533, 1.        ],
                             [0.3268072 , 0.6731928 , 0.        ],
                             [0.31941951, 0.68058049, 1.        ],
                             [0.31260015, 0.68739985, 0.        ]]),
               20: np.array([[0.7022099 , 0.2977901 , 0.        ],
                             [0.6983866 , 0.3016134 , 0.        ],
                             [0.69411673, 0.30588327, 1.        ],
                             [0.33857735, 0.66142265, 0.        ],
                             [0.33244109, 0.66755891, 1.        ],
                             [0.32675582, 0.67324418, 0.        ]]),
               38: np.array([[0.68811957, 0.31188043, 0.        ],
                             [0.68425783, 0.31574217, 0.        ],
                             [0.67994496, 0.32005504, 1.        ],
                             [0.34872593, 0.65127407, 0.        ],
                             [0.34276171, 0.65723829, 1.        ],
                             [0.33722803, 0.66277197, 0.        ]])}
)

and an array of indices np.array([1, 4, 1]) indicating which row should be selected from each consecutive array. The expected output would look like this:

pd.Series({10: np.array([[0.7187053 , 0.2812947 , 0.        ]]), 
           20: np.array([[0.33244109, 0.66755891, 1.        ]]), 
           38: np.array([[0.68425783, 0.31574217, 0.        ]])}
)

How can I do that? And how would it differ if I wanted to take the third element from each resulting array, obtaining the following Series?

pd.Series({10: 0, 20: 1, 38: 0})

Recommended answer

If possible (that is, if every 2D array has the same shape), convert the Series of 2D arrays into a single 3D array and index it. To take the third element of each selected row:

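a = np.array([1, 4, 1])

# Stack the 2D arrays into one 3D array, then take row a[i], column 2 from the i-th array
b = np.array(s.tolist())[np.arange(len(s)), a, 2]
print(b)
[0. 1. 0.]

# Put the result back on the original index
c = pd.Series(b, index=s.index)
print(c)
10    0.0
20    1.0
38    0.0
dtype: float64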
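
The indexing above pairs the i-th array with the i-th entry of the index array. The following is a minimal self-contained sketch of that pairing on made-up data, so the picked values are easy to trace by hand. The names s_demo, rows and stacked are illustrative only and do not come from the question, and np.stack is used here in place of np.array(s.tolist()).

```python
import numpy as np
import pandas as pd

# Hypothetical data: three 2D arrays that all share the shape (4, 3).
base = np.arange(12.0).reshape(4, 3)
s_demo = pd.Series({10: base, 20: base + 100, 38: base + 200})

# One row index per array, in the same order as the Series.
rows = np.array([1, 3, 0])

# Stack into a 3D array of shape (3, 4, 3): (array, row, column).
stacked = np.stack(s_demo.tolist())

# Integer-array indexing pairs the positions element by element:
# (array 0, row 1), (array 1, row 3), (array 2, row 0); the trailing 2 takes column 2.
picked = stacked[np.arange(len(s_demo)), rows, 2]

# Reattach the original labels.
result = pd.Series(picked, index=s_demo.index)
print(result)
```

Note that np.arange(len(s_demo)) and rows must have the same length, otherwise NumPy cannot pair them up.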

To select the whole rows given by the array of indices:

# Same stacking, but without the column index, so each selected row is kept whole
b1 = np.array(s.tolist())[np.arange(len(s)), a]
print(b1)
[[0.7187053  0.2812947  0.        ]
 [0.33244109 0.66755891 1.        ]
 [0.68425783 0.31574217 0.        ]]
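
The stacking approach only works when all 2D arrays share one shape, and it returns a plain array rather than the Series of 1-row arrays shown in the question. As a possible alternative under those circumstances, here is a sketch that loops over the Series instead. The helper name select_rows and the demo data are assumptions made for illustration and are not part of the original answer.

```python
import numpy as np
import pandas as pd

def select_rows(series, row_idx):
    """Return a Series that holds, per label, the chosen row as a (1, n_cols) array."""
    if len(series) != len(row_idx):
        raise ValueError("need exactly one row index per array")
    # arr[[i]] (list index) keeps the row two-dimensional, unlike arr[i].
    return pd.Series({label: arr[[i]]
                      for (label, arr), i in zip(series.items(), row_idx)})

# Hypothetical data with differing shapes, which np.array(s.tolist()) could not stack.
s_demo = pd.Series({10: np.arange(6.0).reshape(2, 3),
                    20: np.arange(12.0).reshape(4, 3),
                    38: np.arange(6.0).reshape(3, 2)})

out = select_rows(s_demo, np.array([1, 3, 0]))

# Last element of each selected row, as a Series on the same labels.
last = out.map(lambda row: row[0, -1])
print(out)
print(last)
```

The loop is slower than the vectorized indexing for large inputs, so it is probably best kept for cases where the arrays cannot be stacked.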
