Python bin sets of pairs of interleaving arrays

Question

I have a set of pairs of numpy arrays. Each array in a pair is the same length, but arrays in different pairs have different lengths. An example of a pair of arrays from this set is:

Time: [5,8,12,17,100,121,136,156,200]
Score: [3,4,5,-10,-90,-80,-70,-40,10]

Another pair is:

Time: [6,7,9,15,199]
Score: [5,6,7,-11,-130]

I need to average (bin) the scores from all of these pairs by time: the time axis should be divided into intervals of 10, and the score(s) that fall into each interval need to be averaged.

Thus, for the above 2 pairs, I want the following result:

Time: [1-10,11-20,21-30,31-40,41-50,...,191-200]
Score: [(3+4+5+6+7)/5, (5-10-11)/3, ...]

How can I do this? Is there a simpler way to do this than bin everything individually and then take the average? How do you bin an array based on the bins of another array? i.e. for an individual pair of arrays, how can I bin the time array into intervals of 10 and then use this result to bin the corresponding score array in a consistent manner?

Recommended answer

You can use scipy.stats.binned_statistic. This is a generalization of a histogram function. A histogram divides the space into bins, and returns the count of the number of points in each bin. This function allows the computation of the sum, mean, median, or other statistic of the values (or set of values) within each bin.

from scipy import stats
import numpy as np

T1 = [5,8,12,17,100,121,136,156,200]
S1 = [3,4,5,-10,-90,-80,-70,-40,10]

T2 = [6,7,9,15,199]
S2 = [5,6,7,-11,-130]

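# Concatenate the times and scores of all pairs (list concatenation;
# for NumPy arrays use np.concatenate instead of +)
Time = T1 + T2
Score = S1 + S2

# 20 equal-width bins over [0, 200]: [0, 10), [10, 20), ..., [190, 200]
output = stats.binned_statistic(Time, Score, statistic='mean', range=(0, 200), bins=20)

averages = output[0]

# Empty bins yield NaN; replace them with 0 for display
print(np.nan_to_num(averages, nan=0.0))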

# Output of this code: 
# [  5.          -5.33333333   0.           0.           0.
#    0.           0.           0.           0.           0.
#  -90.           0.         -80.         -70.           0.
#  -40.           0.           0.           0.         -60.        ]
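
The question asks for bins 1-10, 11-20, ..., 191-200 and starts from NumPy arrays rather than lists. The following is a minimal sketch of how that variant might look, not part of the original answer. It assumes the times are integers, so that bin edges placed at 0.5, 10.5, ..., 200.5 reproduce the intervals from the question; the names `pairs` and `edges` are illustrative only.

```python
import numpy as np
from scipy import stats

# Each pair is (time, score); the two arrays within a pair share a length.
pairs = [
    (np.array([5, 8, 12, 17, 100, 121, 136, 156, 200]),
     np.array([3, 4, 5, -10, -90, -80, -70, -40, 10])),
    (np.array([6, 7, 9, 15, 199]),
     np.array([5, 6, 7, -11, -130])),
]

# np.concatenate joins the arrays end to end; `+` on NumPy arrays would
# attempt element-wise addition instead of concatenation.
time = np.concatenate([t for t, _ in pairs])
score = np.concatenate([s for _, s in pairs])

# Assumption: integer times, so edges at 0.5, 10.5, ..., 200.5
# correspond to the bins 1-10, 11-20, ..., 191-200.
edges = np.arange(0.5, 201, 10)

result = stats.binned_statistic(time, score, statistic='mean', bins=edges)
averages = result.statistic  # NaN marks a bin that received no scores
print(averages)
```

With these edges a time of exactly 100 falls into the 91-100 bin, whereas with `range=(0, 200), bins=20` it falls into the bin starting at 100. Leaving empty bins as NaN also keeps them distinguishable from bins whose mean happens to be 0.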

For more information, see the SciPy documentation for scipy.stats.binned_statistic.
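
On the more general question of how to bin one array according to the bins of another, the per-bin mean can also be sketched with NumPy alone. This is an illustrative alternative rather than the method from the answer: `np.digitize` assigns each time a bin index, and `np.bincount` then accumulates the scores under those indices. The sketch assumes every time lies in the range 1 to 200; a time of 0 or below, or above 200, would need separate handling.

```python
import numpy as np

time = np.array([5, 8, 12, 17, 100, 121, 136, 156, 200, 6, 7, 9, 15, 199])
score = np.array([3, 4, 5, -10, -90, -80, -70, -40, 10, 5, 6, 7, -11, -130])

edges = np.arange(0, 201, 10)  # 0, 10, ..., 200
n_bins = len(edges) - 1

# right=True places t in the bin with edges[i-1] < t <= edges[i],
# i.e. 1-10, 11-20, ...; subtract 1 to get a 0-based bin index.
idx = np.digitize(time, edges, right=True) - 1

# Sum of scores and number of scores per bin, indexed by the time bins.
sums = np.bincount(idx, weights=score, minlength=n_bins)
counts = np.bincount(idx, minlength=n_bins)

# Mean where a bin has data, NaN where it is empty.
means = np.full(n_bins, np.nan)
nonempty = counts > 0
means[nonempty] = sums[nonempty] / counts[nonempty]
print(means)
```

The same `idx` array can be reused to bin any other array that is aligned with `time`, which is what keeps the binning consistent across the pair.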
