Getting data of a box plot - Matplotlib

Question

I have to plot a boxplot of some data, which I can easily do with Matplotlib. However, I was asked to provide a table with the data shown in the plot, such as the whiskers, the medians, the standard deviation, and so on.

I know that I could calculate these "by hand", but I also know, from the reference, that the boxplot method:

Returns a dictionary mapping each component of the boxplot to a list of the matplotlib.lines.Line2D instances created. That dictionary has the following keys (assuming vertical boxplots):

boxes: the main body of the boxplot showing the quartiles and the median's confidence intervals if enabled.
medians: horizontal lines at the median of each box.
whiskers: the vertical lines extending to the most extreme, non-outlier data points.
caps: the horizontal lines at the ends of the whiskers.
fliers: points representing data that extend beyond the whiskers (outliers).

So I'm wondering how I could get these values, since they are matplotlib.lines.Line2D instances.

Thanks.

Recommended answer

As you've figured out, you need to access the members of the return value of boxplot.

For example, if your return value is stored in bp:

bp['medians'][0].get_ydata()

>> array([ 2.5,  2.5])

As the boxplot is vertical, the median line is a horizontal line, so both of its y-values are the same and you only need one of them; i.e. the median is 2.5 for my sample data.

For each key in the dictionary, the value is a list, so that multiple boxes can be handled. If you have just one box, the list will have only one element, hence my use of bp['medians'][0] above. If you have multiple boxes in your boxplot, you will need to iterate over them, e.g.

for medline in bp['medians']:
    linedata = medline.get_ydata()
    median = linedata[0]
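For reference, here is a self-contained version of that loop. It is only a sketch: the sample data is made up for illustration, and the Agg backend is selected just so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, assumed here so no window is needed
import matplotlib.pyplot as plt

# Hypothetical sample data: two groups, so the plot contains two boxes
data = [[1.0, 2.0, 3.0, 4.0], [10.0, 20.0, 30.0]]

fig, ax = plt.subplots()
bp = ax.boxplot(data)

# Each median is a horizontal Line2D, so both of its y-values are equal;
# take the first one for every box
medians = [float(medline.get_ydata()[0]) for medline in bp['medians']]
print(medians)
```

The list has one entry per box, in the same order as the input groups.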

Unfortunately, CT Zhu's answer doesn't work, as the different elements behave differently. For example, there is only one median per box but two whiskers, so it's safest to handle each quantity manually as outlined above.
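As a rough sketch of what handling each quantity manually could look like for a vertical boxplot: each whisker line runs from the edge of the box to the whisker end, so its two y-values give a quartile and a whisker value. This assumes the whiskers list is ordered lower, upper for each box in turn, which matches the Matplotlib versions I have seen but is worth checking against yours. The sample data and the row layout are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, assumed here so no window is needed
import matplotlib.pyplot as plt

# Hypothetical sample data with one obvious outlier
data = [[1.0, 2.0, 3.0, 4.0, 5.0, 100.0]]

fig, ax = plt.subplots()
bp = ax.boxplot(data)

rows = []
for i, medline in enumerate(bp['medians']):  # one median line per box
    # Assumption: whiskers are stored as [lower_0, upper_0, lower_1, upper_1, ...]
    lower = bp['whiskers'][2 * i].get_ydata()      # [q1, lower whisker end]
    upper = bp['whiskers'][2 * i + 1].get_ydata()  # [q3, upper whisker end]
    rows.append({
        'median': float(medline.get_ydata()[0]),
        'q1': float(lower[0]),
        'q3': float(upper[0]),
        'whisker_low': float(lower[1]),
        'whisker_high': float(upper[1]),
        # one fliers Line2D per box; its y-values are the outliers
        'fliers': [float(y) for y in bp['fliers'][i].get_ydata()],
    })

print(rows)
```

The caps carry the same values as the whisker ends, so they are not read separately here.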

NB: the closest you can come is the following:

res  = {}
for key, value in bp.items():
    res[key] = [v.get_data() for v in value]

or equivalently

res = {key : [v.get_data() for v in value] for key, value in bp.items()}
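Note that the standard deviation is not drawn in a box plot, so it cannot be read back from the Line2D objects and has to be computed from the raw data. If the goal is only a table, one option is to skip the plot objects and call matplotlib.cbook.boxplot_stats, which computes the box plot statistics from the data. The sketch below uses made-up sample data, and the choice of the sample standard deviation (ddof=1) is my assumption about what the table should contain.

```python
import numpy as np
from matplotlib import cbook

# Hypothetical sample data: two groups
data = [[1.0, 2.0, 3.0, 4.0], [10.0, 20.0, 30.0]]

# One dict of statistics per group (default whisker range of 1.5 * IQR)
stats = cbook.boxplot_stats(data)

table = []
for values, s in zip(data, stats):
    table.append({
        'median': float(s['med']),
        'q1': float(s['q1']),
        'q3': float(s['q3']),
        'whisker_low': float(s['whislo']),
        'whisker_high': float(s['whishi']),
        'fliers': [float(v) for v in s['fliers']],
        # not part of the box plot: computed separately (sample std, an assumption)
        'std': float(np.std(values, ddof=1)),
    })

for row in table:
    print(row)
```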
