Seaborn: Making barplot by group with asymmetrical custom error bars


Problem description


    I have a Pandas dataframe that has a couple of group columns like below.

    grp1 grp2 variables  lb     m       ub
    A    A1   V1         1.00   1.50    2.5
    A    A2   V2         1.50   2.50    3.5
    B    A1   V1         3.50   14.50   30.5
    B    A2   V2         0.25   0.75    1.0
    

    I am trying to get a separate bar subplot for each variable in variables using FacetGrid. The final plot I need looks like the one below.

    This is what I have so far.

    g = sns.FacetGrid(df, col="variables", hue="grp1")
    g.map(sns.barplot, 'grp2', 'm', order=times)
    

    But unfortunately this is stacking all my datapoints.

    How should I go about doing this with Seaborn?

    UPDATE: The following code largely does what I'm after but currently does not display yerr.

    g = sns.factorplot(x="Grp2", y="m", hue="Grp1", col="variables", data=df, kind="bar", size=4, aspect=.7, sharey=False)
    

    How can I incorporate the lb and ub as error bars on the factorplot?

    Solution

    Before we start let me mention that matplotlib requires the errors to be relative to the data, not absolute boundaries. We would hence modify the dataframe to account for that by subtracting the respective columns.

    u = u"""grp1 grp2 variables  lb     m       ub
    A   A1      V1     1.00   1.50    2.5           
    A   A2      V2     1.50   2.50    3.5         
    B   A1      V1     7.50   14.50   20.5           
    B   A2      V2     0.25   0.75    1.0
    A   A2      V1     1.00   6.50    8.5           
    A   A1      V2     1.50   3.50    6.5         
    B   A2      V1     3.50   4.50   15.5           
    B   A1      V2     8.25   12.75  13.9"""
    
    import io
    import pandas as pd
    
    # sep=r"\s+" splits on any whitespace; delim_whitespace is deprecated in recent pandas
    df = pd.read_csv(io.StringIO(u), sep=r"\s+")
    # errors must be relative to data (not absolute bounds)
    df["lb"] = df["m"]-df["lb"]
    df["ub"] = df["ub"]-df["m"]
    

    Now there are two solutions, which are essentially the same. Let's start with a solution which does not use seaborn, but the pandas plotting wrapper (the reason will become clear later).

    Not using Seaborn

    Pandas can plot grouped bar charts from a dataframe in which each column constitutes one group. The steps to take are therefore:

    1. Create one subplot per distinct value of variables.
    2. Group the dataframe by variables.
    3. For each group, create a pivoted dataframe that has the values of grp1 as columns and m as values. Do the same for the two error columns.
    4. Apply the solution from "How add asymmetric errorbars to Pandas grouped barplot?"

    The code would then look like:

    import io
    import numpy as np
    import pandas as pd
    import matplotlib.pyplot as plt
    
    # u is the sample data string defined above
    df = pd.read_csv(io.StringIO(u), sep=r"\s+")
    # errors must be relative to data (not absolute bounds)
    df["lb"] = df["m"]-df["lb"]
    df["ub"] = df["ub"]-df["m"]
    
    def func(x,y,h,lb,ub, **kwargs):
        data = kwargs.pop("data")
        # from https://stackoverflow.com/a/37139647/4124317
        errLo = data.pivot(index=x, columns=h, values=lb)
        errHi = data.pivot(index=x, columns=h, values=ub)
        err = []
        for col in errLo:
            err.append([errLo[col].values, errHi[col].values])
        err = np.abs(err)
        p = data.pivot(index=x, columns=h, values=y)
        p.plot(kind='bar',yerr=err,ax=plt.gca(), **kwargs)
    
    fig, axes = plt.subplots(ncols=len(df.variables.unique()))
    for ax, (name, group) in zip(axes,df.groupby("variables")):
        plt.sca(ax)
        func("grp2", "m", "grp1", "lb", "ub", data=group, color=["limegreen", "indigo"])
        plt.title(name)
    
    plt.show()
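
The `err` array built in `func` follows the layout that pandas documents for asymmetric errors on a DataFrame plot: one (lower, upper) pair per plotted column, giving shape (columns, 2, rows). Here is a minimal sketch of that construction on the V1 rows of the sample data. The helper column names `lo` and `hi` are assumptions for illustration and do not appear in the original code.

```python
import io

import numpy as np
import pandas as pd

# V1 rows of the sample data above, still with absolute bounds.
raw = """grp1 grp2 lb m ub
A A1 1.00 1.50 2.5
A A2 1.00 6.50 8.5
B A1 7.50 14.50 20.5
B A2 3.50 4.50 15.5"""
group = pd.read_csv(io.StringIO(raw), sep=r"\s+")

# Convert absolute bounds to distances from the bar height.
group["lo"] = group["m"] - group["lb"]  # hypothetical helper column
group["hi"] = group["ub"] - group["m"]  # hypothetical helper column

# One pivot per error side: rows are grp2, columns are grp1.
err_lo = group.pivot(index="grp2", columns="grp1", values="lo")
err_hi = group.pivot(index="grp2", columns="grp1", values="hi")

# Stack as (columns, 2, rows): for each grp1 level, a lower row and an upper row.
err = np.array([[err_lo[col].values, err_hi[col].values] for col in err_lo])
print(err.shape)
```

Each `err[i]` then holds the lower and upper errors for the i-th column of the pivoted values frame, in the same row order as its index.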
    

    Using Seaborn

    Seaborn factorplot (renamed catplot in seaborn 0.9) does not allow for custom errorbars. One would therefore need to use the FacetGrid approach. In order not to have the bars stacked, one would put the hue argument in the map call. The following is thus the equivalent of the sns.factorplot call from the question.

    # height was named size before seaborn 0.9
    g = sns.FacetGrid(data=df, col="variables", height=4, aspect=.7)
    g.map(sns.barplot, "grp2", "m", "grp1", order=["A1","A2"] )
    

    Now the problem is that we cannot get the errorbars into the barplot from the outside or, more importantly, we cannot give the errors for a grouped bar chart to seaborn.barplot. For a non-grouped barplot one would be able to supply the error via the yerr argument, which is passed on to the matplotlib plt.bar plot. This concept is shown in this question. However, since seaborn.barplot calls plt.bar several times, once for each hue, the errors in each call would be the same (or their dimension wouldn't match).
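
To illustrate why each hue level needs its own errors, here is a minimal plain-matplotlib sketch that calls `ax.bar` once per hue level and passes each call a separate `yerr` of shape (2, N). The values are the V1 rows of the sample data, already converted to relative errors. The dictionary names, bar width and offsets are assumptions chosen for illustration, not what seaborn does internally.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

x_labels = ["A1", "A2"]  # grp2 categories
means = {"A": np.array([1.5, 6.5]), "B": np.array([14.5, 4.5])}
lower = {"A": np.array([0.5, 5.5]), "B": np.array([7.0, 1.0])}   # m - lb
upper = {"A": np.array([1.0, 2.0]), "B": np.array([6.0, 11.0])}  # ub - m

fig, ax = plt.subplots()
width = 0.4
positions = np.arange(len(x_labels))
for i, level in enumerate(means):
    # Row 0 holds the lower errors, row 1 the upper errors.
    yerr = np.vstack([lower[level], upper[level]])
    # Shift each hue level sideways so the bars sit next to each other.
    ax.bar(positions + (i - 0.5) * width, means[level], width,
           yerr=yerr, capsize=3, label=level)

ax.set_xticks(positions)
ax.set_xticklabels(x_labels)
ax.legend(title="grp1")
fig.canvas.draw()  # render; call plt.show() instead in an interactive session
```

Because every `ax.bar` call receives the errors for its own hue level, the dimension mismatch described above does not arise.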

    The only option I see is hence to use a FacetGrid and map exactly the same function as used above to it. This largely defeats the purpose of using seaborn, but for completeness, here is the FacetGrid solution.

    import io
    import numpy as np
    import pandas as pd
    import seaborn as sns
    import matplotlib.pyplot as plt
    
    # u is the sample data string defined above
    df = pd.read_csv(io.StringIO(u), sep=r"\s+")
    # errors must be relative to data (not absolute bounds)
    df["lb"] = df["m"]-df["lb"]
    df["ub"] = df["ub"]-df["m"]
    
    def func(x,y,h,lb,ub, **kwargs):
        data = kwargs.pop("data")
        # from https://stackoverflow.com/a/37139647/4124317
        errLo = data.pivot(index=x, columns=h, values=lb)
        errHi = data.pivot(index=x, columns=h, values=ub)
        err = []
        for col in errLo:
            err.append([errLo[col].values, errHi[col].values])
        err = np.abs(err)
        p = data.pivot(index=x, columns=h, values=y)
        p.plot(kind='bar',yerr=err,ax=plt.gca(), **kwargs)
    
    # height was named size before seaborn 0.9
    g = sns.FacetGrid(df, col="variables", height=4, aspect=.7)
    g.map_dataframe(func, "grp2", "m", "grp1", "lb", "ub" , color=["limegreen", "indigo"]) 
    g.add_legend()
    
    plt.show()
    
