Plotting a line plot with error bars and datapoints from a pandas DataFrame


Question

I've been racking my brain trying to figure out how to plot a pandas DataFrame the way I want, but to no avail.

The DataFrame has a MultiIndex and looks like this:

+-----------+--------------+------------+--------------+-----------------+---------+---------+---------+---------+---------+
|           |              |            |              |                 | run_001 | run_002 | run_003 | run_004 | run_005 |
+-----------+--------------+------------+--------------+-----------------+---------+---------+---------+---------+---------+
| file_type | server_count | file_count | thread_count | cacheclear_type |         |         |         |         |         |
+-----------+--------------+------------+--------------+-----------------+---------+---------+---------+---------+---------+
| gor       | 01servers    | 05files    | 20threads    | ccALWAYS        | 15.918  | 16.275  | 15.807  | 17.781  | 16.233  |
| gor       | 01servers    | 10files    | 20threads    | ccALWAYS        | 17.322  | 17.636  | 16.096  | 16.484  | 16.715  |
| gor       | 01servers    | 15files    | 20threads    | ccALWAYS        | 19.265  | 17.128  | 17.630  | 18.739  | 16.833  |
| gor       | 01servers    | 20files    | 20threads    | ccALWAYS        | 23.744  | 20.539  | 21.416  | 22.921  | 22.794  |
+-----------+--------------+------------+--------------+-----------------+---------+---------+---------+---------+---------+

What I want to do is plot a line graph where the x values are the 'file_count' values, and the y value for each is the average of all the run_xxx values in the corresponding row of the DataFrame.

If possible, I would like to add error bars and even the data points themselves, so that I can see the distribution of the data behind that average.

Here's a (crappy) mockup of roughly what I'm talking about:

I've been able to create a boxplot using the boxplot() function built into pandas' DataFrame by doing:

df.transpose().boxplot()

This looks almost okay, but it is a little cluttered and doesn't have the actual data points plotted.

Recommended answer

A beeswarm plot works very nicely in this situation, especially when you have a lot of dots and want to show their distribution. You do, however, need to supply the positions parameter to beeswarm, because by default it starts at 0. The boxplot method of a pandas DataFrame, on the other hand, plots its boxes at x = 1, 2, ...

It boils down to this:

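import numpy as np
from beeswarm import *
# Put the swarms at x = 1, 2, ... so they line up with the boxes drawn by boxplot()
D1 = beeswarm(df.values, positions = np.arange(len(df.values))+1)
D2 = df.transpose().boxplot(ax=D1[1])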
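
If the line-with-error-bars view described in the question is what you are after, a minimal sketch using only pandas and matplotlib might look like the following. It is not part of the answer above. It rebuilds the sample table from the question, and it assumes that the sample standard deviation of each row is an acceptable error bar (a standard error or a min/max range would be equally reasonable choices). The axis label "run value" is a placeholder, since the question does not state the unit.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove to open a plot window
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Rebuild the sample DataFrame shown in the question.
file_counts = ["05files", "10files", "15files", "20files"]
index = pd.MultiIndex.from_tuples(
    [("gor", "01servers", fc, "20threads", "ccALWAYS") for fc in file_counts],
    names=["file_type", "server_count", "file_count",
           "thread_count", "cacheclear_type"],
)
df = pd.DataFrame(
    [
        [15.918, 16.275, 15.807, 17.781, 16.233],
        [17.322, 17.636, 16.096, 16.484, 16.715],
        [19.265, 17.128, 17.630, 18.739, 16.833],
        [23.744, 20.539, 21.416, 22.921, 22.794],
    ],
    index=index,
    columns=[f"run_{i:03d}" for i in range(1, 6)],
)

labels = list(df.index.get_level_values("file_count"))
x = np.arange(len(df))              # one x position per row
means = df.mean(axis=1).to_numpy()  # average of the run_xxx columns per row
stds = df.std(axis=1).to_numpy()    # assumption: sample std as the error bar

fig, ax = plt.subplots()

# Line through the row means, with error bars of plus or minus one std.
ax.errorbar(x, means, yerr=stds, marker="o", capsize=4, label="mean, +/- 1 std")

# Overlay the individual runs so the spread behind each mean stays visible.
for col in df.columns:
    ax.scatter(x, df[col].to_numpy(), color="gray", alpha=0.6, s=15, zorder=3)

ax.set_xticks(x)
ax.set_xticklabels(labels)
ax.set_xlabel("file_count")
ax.set_ylabel("run value")  # placeholder label; the unit is not given
ax.legend()
```

Call fig.savefig(...) or plt.show() afterwards to see the result. With a real DataFrame that has more than one combination of the other index levels, you would probably want to select or group by those levels first so that each line covers a single configuration.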
