Create a Diverging Stacked Bar Chart in matplotlib

Question

I have lists of data containing responses to Likert questions on a scale from one (very unhappy) to five (very happy). I would like to create a page of plots showing these lists as skewed stacked horizontal bar charts. The lists of responses can be of different sizes (e.g. when someone has opted out of answering a particular question). Here is a minimal example of the data:

likert1 = [1.0, 2.0, 1.0, 2.0, 1.0, 3.0, 3.0, 4.0, 4.0, 1.0, 1.0]
likert2 = [5.0, 4.0, 5.0, 4.0, 5.0, 3.0]

I would like to be able to plot this with something like:

plot_many_likerts(likert1, likert2)

At the moment I've written a function to iterate over the lists, and plot each one as its own subplot on a shared figure in matplotlib:

from collections import Counter

import matplotlib.pyplot as plt


def plot_many_likerts(*lsts):
    # get the figure and the list of axes for this plot
    fig, axlst = plt.subplots(len(lsts), sharex=True)
    for i in range(len(lsts)):
        likert_horizontal_bar_list(lsts[i], axlst[i], xaxis=[1.0, 2.0, 3.0, 4.0, 5.0])
        axlst[i].axis('off')
    plt.show()


def likert_horizontal_bar_list(lst, ax, xaxis):
    cnt = Counter(lst)
    # del (cnt[None])
    i = 0
    colour_float = 0.00001
    previous_right = 0
    for key in sorted(xaxis):
        # one segment per response value, starting where the previous one ended
        # (current matplotlib names the first barh argument y, not bottom)
        ax.barh(y=0, width=cnt[key], height=0.4, left=previous_right,
                color=plt.cm.jet(colour_float), label=str(key))
        i += 1
        previous_right = previous_right + cnt[key]
        colour_float = float(i) / float(len(xaxis))

This works reasonably well and creates stacked bar charts that all have the same representative sizes (i.e. the widths share a common axis scale). Here is a screenshot:

What is currently produced: http://s7.postimg.org/vh0j816gn/figure_1.jpg

What I would like is to have these two plots centered on midpoints of the mode of the datasets (the datasets will have the same range). For instance:

What I would like to see: http://s29.postimg.org/z0qwv4ryr/figure_2.jpg

Suggestions on how I might do this?

Answer

I needed to make a diverging bar chart for some Likert data. I was using pandas, but the approach would probably be similar without it. The key mechanism is to add an invisible buffer at the start of each bar.

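import matplotlib.pyplot as plt
import pandas as pd

# the first color is for the invisible buffer column (white on a white background)
likert_colors = ['white', 'firebrick', 'lightcoral', 'gainsboro', 'cornflowerblue', 'darkblue']
dummy = pd.DataFrame([[1, 2, 3, 4, 5], [5, 6, 7, 8, 5], [10, 4, 2, 10, 5]],
                     columns=["SD", "D", "N", "A", "SA"],
                     index=["Key 1", "Key B", "Key III"])
# distance from the left edge of each bar to the middle of its neutral segment
middles = dummy[["SD", "D"]].sum(axis=1) + dummy["N"] * .5
longest = middles.max()
# pad every row so that all the midpoints line up at x = longest
dummy.insert(0, '', (middles - longest).abs())
# widest row including its buffer, so that xlim does not clip any bar
complete_longest = dummy.sum(axis=1).max()

dummy.plot.barh(stacked=True, color=likert_colors, edgecolor='none', legend=False)
z = plt.axvline(longest, linestyle='--', color='black', alpha=.5)
z.set_zorder(-1)

plt.xlim(0, complete_longest)
# relabel the ticks so that they are measured from the center line
xvalues = range(0, int(complete_longest) + 1, 10)
xlabels = [str(x - longest) for x in xvalues]
plt.xticks(xvalues, xlabels)
plt.show()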

This approach has a few limitations. Bars no longer get a black outline, and the legend would have an extra blank element, so I just hid the legend (I figure there's probably a way to hide just the individual element). I'm not sure of a convenient way to give the bars an outline without also adding the outline to the buffer element.
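
One way around both limitations, if you are willing to skip pandas plotting, is to draw the segments with plain matplotlib and start each bar at a negative `left` offset instead of padding it with a buffer. This is only a rough sketch, not the method used above: the names `SCALE`, `COLORS`, `left_edge` and `plot_diverging_likerts` are made up for illustration, and it assumes the bars should be centered on the middle of the neutral category, as in the rest of this answer.

```python
from collections import Counter

import matplotlib.pyplot as plt

# Hypothetical names, for illustration only.
SCALE = [1.0, 2.0, 3.0, 4.0, 5.0]
COLORS = ["firebrick", "lightcoral", "gainsboro", "cornflowerblue", "darkblue"]


def left_edge(counts, scale=SCALE):
    """Return the x position where a bar must start so that the middle
    of its central category lands on x = 0."""
    mid = len(scale) // 2
    below = sum(counts[key] for key in scale[:mid])
    return -(below + counts[scale[mid]] / 2)


def plot_diverging_likerts(*lsts, scale=SCALE, colors=COLORS):
    fig, ax = plt.subplots()
    for row, lst in enumerate(lsts):
        counts = Counter(lst)  # missing categories count as 0
        left = left_edge(counts, scale)
        for key, color in zip(scale, colors):
            # Label only the first row so each category appears once in the
            # legend; labels starting with an underscore are skipped.
            label = str(key) if row == 0 else "_nolegend_"
            ax.barh(row, counts[key], left=left, height=0.4,
                    color=color, edgecolor="black", label=label)
            left += counts[key]
    # Dashed center line drawn behind the bars.
    ax.axvline(0, linestyle="--", color="black", alpha=0.5, zorder=-1)
    ax.set_yticks(range(len(lsts)))
    ax.legend(title="Response")
    return fig, ax


likert1 = [1.0, 2.0, 1.0, 2.0, 1.0, 3.0, 3.0, 4.0, 4.0, 1.0, 1.0]
likert2 = [5.0, 4.0, 5.0, 4.0, 5.0, 3.0]
fig, ax = plot_diverging_likerts(likert1, likert2)
```

Call `plt.show()` afterwards to display the figure. Because there is no buffer segment, the outline and the legend only ever apply to real categories, and the x-axis already reads as distance from the center without relabelling the ticks.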

To walk through the code: I first establish some colors and dummy data. Then, for each row, I calculate the combined width of the two left-most columns plus half of the middle column (which I know to be "SD", "D", and "N", respectively). I find the largest of these widths and use it to calculate the padding needed for the other rows. Next, I insert this new buffer column into the first column position with a blank title (which felt gross, lemme tell you). For good measure, I also added a vertical line (axvline) behind the middle of the middle bar based on the advice of [2]. Finally, I adjust the x-axis to have the proper scale by offsetting its labels.
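
To make the arithmetic concrete, here is the buffer calculation on its own, without any plotting. It is just a sketch that reuses the dummy numbers from the code above; the names `counts`, `buffer` and `aligned` are my own for this illustration.

```python
import pandas as pd

# Same illustrative counts as above: one row per question,
# one column per response category (SD, D, N, A, SA).
counts = pd.DataFrame(
    [[1, 2, 3, 4, 5], [5, 6, 7, 8, 5], [10, 4, 2, 10, 5]],
    columns=["SD", "D", "N", "A", "SA"],
    index=["Key 1", "Key B", "Key III"],
)

# Distance from the left edge of each bar to the middle of its neutral segment.
middles = counts[["SD", "D"]].sum(axis=1) + counts["N"] * 0.5

# The row whose midpoint lies furthest to the right needs no padding.
longest = middles.max()

# Invisible padding that pushes every other row right until its midpoint matches.
buffer = longest - middles

# After padding, every midpoint sits at the same x position.
aligned = buffer + middles

print(pd.DataFrame({"middle": middles, "buffer": buffer, "aligned": aligned}))
```

For these numbers the midpoints are 4.5, 14.5 and 15, so the rows get buffers of 10.5, 0.5 and 0, and every padded midpoint ends up at 15.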

You might want more horizontal space on the left - you can easily do so by adding to "longest".
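
As a rough sketch of that tweak (the variable names and the `extra_left` value of 5 are arbitrary choices for illustration): whatever you add has to go into the buffer, the center line and the tick-label offset alike, so it is easiest to compute one `center` value and use it everywhere.

```python
import pandas as pd

counts = pd.DataFrame(
    [[1, 2, 3, 4, 5], [5, 6, 7, 8, 5], [10, 4, 2, 10, 5]],
    columns=["SD", "D", "N", "A", "SA"],
    index=["Key 1", "Key B", "Key III"],
)
middles = counts[["SD", "D"]].sum(axis=1) + counts["N"] * 0.5

extra_left = 5  # hypothetical amount of extra space, in the same units as the counts
center = float(middles.max()) + extra_left  # x position for the dashed center line

# Every row now gets at least extra_left of invisible padding.
buffer = center - middles

# Widest padded row, then tick positions and labels measured from the center.
total = float((buffer + counts.sum(axis=1)).max())
xvalues = range(0, int(total) + 1, 10)
xlabels = [str(x - center) for x in xvalues]
```

In the plotting code you would then pass `center` to `plt.axvline` and use `total` for the upper `plt.xlim` bound.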

[2] Heiberger, Richard M., and Naomi B. Robbins. "Design of diverging stacked bar charts for Likert scales and other applications." Journal of Statistical Software 57.5 (2014): 1-32.
