Plot error bars with matplotlib based on a multi-index pandas DataFrame

Problem description

I have the following dataframe in pandas:

    name   Hour   trt_level   stress    date         value
0   D43    9      H           control   2019-06-07   0.4561
1   D43    10     H           control   2019-06-07   0.3216
2   D42    8      M           stress    2019-06-07   0.2143
3   D42    9      M           stress    2019-06-07   0.1342
4   D21    8      L           stress    2019-06-07   0.3214
...

I want to create a line chart with error bars (mse/std), something that looks like this:

from: https://matplotlib.org/1.2.1/examples/pylab_examples/errorbar_demo.html

But in my case the x-axis should be the hour, the y-axis the values, and there should be three lines, one for each level of treatment (trt_level): one line each for H, M and L.

To do that I used the functions groupby and agg:

data = df.groupby(['trt_level','Hour']).agg([np.mean, np.std])
data.head()

                   value
                   mean      std
trt_level  Hour
H          7       0.231     0.0058
           8       0.212     0.0094
           9       0.431     0.1154
...


which gave me a dataframe with the treatment and hour as the index and the mean and std of the value. The problem is that when I try to plot it, I get only one line, without the std on top:

data = data['value']
data.plot(kind = "line", y = "mean", legend = False,
          xerr = "std", title = "test", color='green')

My desired result should have three lines with the std on top (better if it could be MES and not std, but for this question I am focusing more on the three lines and on displaying the std).

My end goal is to get a chart that is more like this (sorry for the horrible drawing):

but for all the hours.

Solution

Nearly there. You have to unstack your multi-index dataframe.
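
The reason is that the pandas plot method draws one line per column, so the treatment levels have to become columns instead of an index level. Below is a minimal sketch of what unstacking does to the shape. The numbers and the names `stats` and `wide` are made up for illustration and are not the asker's data.

```python
import pandas as pd

# Made-up data: two measurements per (trt_level, Hour) pair so std is defined.
df = pd.DataFrame({
    "trt_level": ["H", "H", "H", "H", "M", "M", "M", "M"],
    "Hour":      [8, 8, 9, 9, 8, 8, 9, 9],
    "value":     [0.20, 0.24, 0.40, 0.46, 0.21, 0.25, 0.13, 0.15],
})

# Long form: rows indexed by (trt_level, Hour), columns "mean" and "std".
stats = df.groupby(["trt_level", "Hour"])["value"].agg(["mean", "std"])

# Wide form: Hour stays as the index, trt_level moves into the columns.
wide = stats.unstack(level=0)

print(wide["mean"])  # one column per treatment level, one row per hour
```

After unstacking, `wide["mean"]` and `wide["std"]` have the same row and column labels, which is what lets the plot call pair each line with its own error values.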

import pandas as pd
from matplotlib import pyplot as plt

# My test file contained at least two values per condition to calculate an SD value
# df = pd.read_csv("test.txt", sep=r"\s{2,}", engine="python")

# Aggregate only the numeric "value" column; mean/std of the text columns is undefined
dfm = df.groupby(["trt_level", "Hour"])["value"].agg(["mean", "std"])

# unstack(level=0) moves trt_level into the columns, giving one line per level
dfm.unstack(level=0).plot(y="mean", yerr="std", title="TRT levels are really important!", color=list("rbg"))

plt.show()

Sample output

BTW: kind="line" does not have to be specified; it is the default. The pandas documentation lists all possible keywords for kind.
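
The question also asks for "mse"/"MES" instead of std. If that is meant as the standard error of the mean (an assumption, since the question does not spell it out), a possible variant is to compute it with the groupby `sem` method and pass the result as a DataFrame to `yerr`. The sketch below uses made-up values. The names `grouped`, `means` and `sems` are illustrative.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to display a window
import matplotlib.pyplot as plt
import pandas as pd

# Made-up measurements, three per (trt_level, Hour) pair.
df = pd.DataFrame({
    "trt_level": ["H"] * 6 + ["M"] * 6 + ["L"] * 6,
    "Hour": [8, 8, 8, 9, 9, 9] * 3,
    "value": [0.21, 0.22, 0.20, 0.43, 0.45, 0.41,
              0.21, 0.23, 0.19, 0.13, 0.14, 0.12,
              0.32, 0.30, 0.34, 0.28, 0.27, 0.29],
})

grouped = df.groupby(["trt_level", "Hour"])["value"]
means = grouped.mean().unstack(level=0)  # rows: Hour, columns: trt_level
sems = grouped.sem().unstack(level=0)    # same shape, standard error of the mean

# A DataFrame passed as yerr is matched to the plotted lines by column name.
ax = means.plot(yerr=sems, capsize=3, title="Mean value per hour (error bars: SEM)")
ax.set_xlabel("Hour")
ax.set_ylabel("value")
```

Keeping the means and the errors in two separate frames with identical labels avoids the `y="mean"`/`yerr="std"` column lookup and makes it easy to swap in a different error measure.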
