Making a binned boxplot in matplotlib with numpy and scipy in Python

Question

I have a 2-d array containing pairs of values, and I'd like to make a boxplot of the y-values grouped by bins of the x-values. That is, if the array is:

my_array = array([[1, 40.5], [4.5, 60], ...])

then I'd like to bin my_array[:, 0] and, for each bin, produce a boxplot of the corresponding my_array[:, 1] values that fall into that bin. In the end, the plot should contain one box plot per bin.

I tried the following:

min_x = min(my_array[:, 0])
max_x = max(my_array[:, 0])  # x-values are in column 0, not column 1

num_bins = 3
bins = linspace(min_x, max_x, num_bins)
elts_to_bins = digitize(my_array[:, 0], bins)

However, this gives me values in elts_to_bins that range from 1 to 3. I expected 0-based bin indices, and I only wanted 3 bins. I assume this is due to some trickiness in how bins are represented in linspace versus digitize.
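The mismatch described above can be reproduced in a few lines. This is a minimal sketch with made-up x-values (not the asker's data): `linspace(min, max, num_bins)` returns `num_bins` edge points, which only delimit `num_bins - 1` intervals, and `digitize` returns 1-based indices in which the maximum value falls just past the last edge.

```python
import numpy as np

# Hypothetical x-values, for illustration only.
x = np.array([1.0, 2.0, 4.5, 7.0, 10.0])
num_bins = 3

# As in the question: num_bins edge points, i.e. only num_bins - 1 intervals.
edges_too_few = np.linspace(x.min(), x.max(), num_bins)
idx_original = np.digitize(x, edges_too_few)  # 1-based, the maximum gets index 3

# num_bins intervals need num_bins + 1 edges.
edges = np.linspace(x.min(), x.max(), num_bins + 1)

# digitize returns i such that edges[i-1] <= x < edges[i]; subtract 1 for
# 0-based indices, then clip so the maximum lands in the last bin.
idx = np.clip(np.digitize(x, edges) - 1, 0, num_bins - 1)

print(idx_original)
print(idx)
```

Clipping is one possible way to make the last bin closed on the right; it is a choice made for this sketch, not something the original post prescribes.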

What is the easiest way to achieve this? I want num_bins equally spaced bins, with the first bin containing the lower part of the data and the last bin containing the upper part; that is, I want every data point to fall into some bin, so that I can make a boxplot.
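One way to get what the question asks for is to group the y-values by bin index and pass the groups to matplotlib's `boxplot`. The sketch below uses synthetic data in place of `my_array`; the random data, the names `groups` and `centers`, and the `Agg` backend line are assumptions made for the example.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove to open a window
import matplotlib.pyplot as plt

# Synthetic (x, y) pairs standing in for my_array; not data from the question.
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=200)
y = 5 * x + rng.normal(0, 3, size=200)
my_array = np.column_stack([x, y])

num_bins = 3
xs = my_array[:, 0]
edges = np.linspace(xs.min(), xs.max(), num_bins + 1)

# 0-based bin index for every point; the maximum is clipped into the last bin.
bin_idx = np.clip(np.digitize(xs, edges) - 1, 0, num_bins - 1)

# One array of y-values per bin.
groups = [my_array[bin_idx == i, 1] for i in range(num_bins)]

# Draw each box at the centre of its x-bin.
centers = 0.5 * (edges[:-1] + edges[1:])
fig, ax = plt.subplots()
bp = ax.boxplot(groups, positions=centers, widths=0.6 * np.diff(edges))
ax.set_xticks(centers)
ax.set_xticklabels([f"{lo:.1f} to {hi:.1f}" for lo, hi in zip(edges[:-1], edges[1:])])
ax.set_xlabel("x (binned)")
ax.set_ylabel("y")
```

Call `plt.show()` or `fig.savefig(...)` afterwards to view the result. A bin that happens to be empty would produce an empty box, so real data may need a check for that.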

Thanks.

Recommended answer

NumPy has a dedicated function for creating histograms the way you need:

histogram(a, bins=10, range=None, density=None, weights=None)

You can use it like this:

(hist_data, bin_edges) = histogram(my_array[:,0], weights=my_array[:,1])

The key point here is to use the weights argument: each value a[i] contributes weights[i] to the histogram. Example:

a = [0, 1]
weights = [10, 2]

describes 10 points at x = 0 and 2 points at x = 1.
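The effect of `weights` can be checked directly on this small example. The sketch below compares the unweighted and weighted results; the choice of two bins over the range 0 to 1 is an assumption made for the illustration.

```python
import numpy as np

a = [0, 1]
weights = [10, 2]

# Two bins covering [0, 0.5) and [0.5, 1].
counts, _ = np.histogram(a, bins=2, range=(0, 1))
weighted, bin_edges = np.histogram(a, bins=2, range=(0, 1), weights=weights)

print(counts)    # number of values per bin
print(weighted)  # sum of the weights per bin
```

Note that with `weights=my_array[:, 1]` each bin therefore holds the sum of the y-values whose x falls in it, a single number per bin rather than the spread of values that a boxplot displays.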

You can set the number of bins, or the bin limits, with the bins argument (see the official documentation for more details).
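As a brief illustration of the two forms of `bins`, the following sketch passes first an integer and then an explicit list of edges. The sample values are hypothetical and chosen only so the sums are easy to follow.

```python
import numpy as np

# Hypothetical (x, y) pairs, for illustration only.
x = np.array([1.0, 2.0, 4.5, 7.0, 10.0])
y = np.array([40.5, 55.0, 60.0, 72.0, 90.0])

# An integer asks for that many equal-width bins between min(x) and max(x).
sums_n, edges_n = np.histogram(x, bins=3, weights=y)

# A sequence gives the bin edges explicitly; the last bin is closed on the right.
sums_e, edges_e = np.histogram(x, bins=[0, 5, 10], weights=y)

print(edges_n, sums_n)
print(edges_e, sums_e)
```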

The histogram can then be plotted with something like:

bar(bin_edges[:-1], hist_data, width=diff(bin_edges), align='edge')  # bars span each bin

If you only need to plot the histogram, the similar hist() function can plot it directly:

hist(my_array[:,0], weights=my_array[:,1])
