Obtaining values used in a boxplot, using Python and matplotlib

Question

I can draw a boxplot from data:

import numpy as np
import matplotlib.pyplot as plt

data = np.random.rand(100)
plt.boxplot(data)

The box then spans the 25th to the 75th percentile, and the whiskers extend to the smallest and largest data values that lie within [25th percentile - 1.5*IQR, 75th percentile + 1.5*IQR], where IQR denotes the interquartile range. (The factor 1.5 is customizable.)
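
As a small illustration of that customization, the sketch below passes matplotlib's `whis` parameter to `boxplot`. The fixed seed, the specific `whis` values, the variable names, and the `Agg` backend line are assumptions made only so the example is reproducible and runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line in an interactive session
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, chosen only for reproducibility
data = rng.random(100)

fig, (ax_default, ax_wide, ax_pct) = plt.subplots(1, 3)

# Default: whiskers reach at most 1.5*IQR beyond the box edges.
bp_default = ax_default.boxplot(data)

# A larger factor lets the whiskers reach up to 3*IQR beyond the box edges.
bp_wide = ax_wide.boxplot(data, whis=3.0)

# A pair of numbers is read as percentiles: whiskers at the 5th and 95th.
bp_pct = ax_pct.boxplot(data, whis=(5, 95))
```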

Now I want to know the values used in the boxplot, i.e. the median, the upper and lower quartiles, and the upper and lower whisker end points. The first three are easy to obtain with np.median() and np.percentile(), but the whisker end points require somewhat more verbose code:

median = np.median(data)
upper_quartile = np.percentile(data, 75)
lower_quartile = np.percentile(data, 25)

iqr = upper_quartile - lower_quartile
upper_whisker = data[data<=upper_quartile+1.5*iqr].max()
lower_whisker = data[data>=lower_quartile-1.5*iqr].min()

This is acceptable, but is there a neater way to do it? It seems the values should be ready to pull out of the boxplot, since it has already been drawn.

Recommended answer

Why do you want to do that? What you are doing is already pretty direct.

That said, if you want to fetch the values from a plot that has already been made, simply use the get_ydata() method on the whisker lines:

B = plt.boxplot(data)
[item.get_ydata() for item in B['whiskers']]

It returns an array of shape (2,) for each whisker; the second element is the value we want:

[item.get_ydata()[1] for item in B['whiskers']]
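
The same idea should extend to the other values in the question. The sketch below is a hedged illustration, not part of the original answer: it reads the median and quartiles from the `medians` and `boxes` entries of the returned dictionary (assuming the default vertical boxplot with `patch_artist=False`, where the box is a `Line2D` outline), and cross-checks them against `matplotlib.cbook.boxplot_stats`, which computes the statistics without drawing anything. The seed and variable names are assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line in an interactive session
import matplotlib.pyplot as plt
import numpy as np
from matplotlib import cbook

rng = np.random.default_rng(0)  # fixed seed, chosen only for reproducibility
data = rng.random(100)

fig, ax = plt.subplots()
bp = ax.boxplot(data)

# Each whisker line runs from the box edge to the whisker end,
# so the second y-value is the end point (lower whisker first, then upper).
lower_whisker, upper_whisker = (line.get_ydata()[1] for line in bp["whiskers"])

# The median line is horizontal, so both of its y-values equal the median.
median = bp["medians"][0].get_ydata()[0]

# With patch_artist=False the box outline's y-values span the two quartiles.
box_y = bp["boxes"][0].get_ydata()
lower_quartile, upper_quartile = min(box_y), max(box_y)

# Alternative without any plot: one dict of statistics per dataset.
stats = cbook.boxplot_stats(data)[0]
print(stats["whislo"], stats["q1"], stats["med"], stats["q3"], stats["whishi"])
```

If only the numbers are needed and no figure, calling `cbook.boxplot_stats` directly avoids creating artists at all; its `whis` argument plays the same role as in `boxplot`.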
