100% Stacked Bar Chart in MatPlotLib

Problem description

I'm trying to create a 100% Stacked Bar Chart in MatPlotLib using the College Scorecard data from this site.

There are 38 columns of the form "Percentage of degrees awarded in [insert area of study here]", which is why there are 38 fields.

I also have a subset of schools for which I'd like to make this stacked chart.

I tried to follow the instructions here. Yes, it's pretty long code, but I wanted to play it by the book (plus, I've always had good luck with this blog). The PCIP columns (percentage of degrees awarded by area of study) already came in percentage form, so I didn't have to follow Chris' calculations because they were already done.

I'm getting errors when I run the code:

bar_width = 1
bar_l = [i for i in range(len(df['PCIP01']))]
tick_pos = [i+(bar_width/2) for i in bar_l]

# Create a figure with a single subplot
f, ax = plt.subplots(1, figsize=(10,5))

ax.bar(bar_l,
       degrees.PCIP01,
       label='PCIP01',
       alpha=0.9,
       color='#2D014B',
       width=bar_width
       )
ax.bar(bar_l,
       PCIP04,
       label='PCIP04',
       alpha=0.9,
       color='#28024E',
       width=bar_width
       )

[and so on and so forth for all of the remaining 36 fields]

# Set the ticks to be School names
plt.xticks(tick_pos, degrees['INSTNM'])
ax.set_ylabel("Percentage")
ax.set_xlabel("")
# Set a buffer around the edges of the plot
plt.xlim([min(tick_pos)-bar_width, max(tick_pos)+bar_width])
plt.ylim(-10, 110)

# rotate axis labels
plt.setp(plt.gca().get_xticklabels(), rotation=45, horizontalalignment='right')

# show plot

And this was the error I received:

ValueError                                Traceback (most recent call last)
<ipython-input-91-019d33be36c2> in <module>()
      7        alpha=0.9,
      8        color='#2D014B',
----> 9        width=bar_width
     10        )
     11 ax.bar(bar_l,

C:\Users\MYLOCATION\Anaconda3\lib\site-packages\matplotlib\__init__.py in inner(ax, *args, **kwargs)
   1889                     warnings.warn(msg % (label_namer, func.__name__),
   1890                                   RuntimeWarning, stacklevel=2)
-> 1891             return func(ax, *args, **kwargs)
   1892         pre_doc = inner.__doc__
   1893         if pre_doc is None:

C:\Users\MYLOCATION\Anaconda3\lib\site-packages\matplotlib\axes\_axes.py in bar(self, left, height, width, bottom, **kwargs)
   2077         if len(height) != nbars:
   2078             raise ValueError("incompatible sizes: argument 'height' "
-> 2079                               "must be length %d or scalar" % nbars)
   2080         if len(width) != nbars:
   2081             raise ValueError("incompatible sizes: argument 'width' "

ValueError: incompatible sizes: argument 'height' must be length 38678 or scalar

Can anyone assist me with simplifying this code so I can create this 100% stacked bar chart?

Solution

Firstly, there are a lot of universities in this dataset, so maybe a stacked bar plot isn't the best idea?
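
On the error itself: the traceback says 'height' must have length 38678, which looks like a length mismatch between the x positions and the bar heights. In the question, bar_l is built from df while the heights come from degrees (and a bare PCIP04), so the two lengths only agree if degrees contains every row of df. A minimal sketch of the mismatch and one way around it, using small made-up frames in place of the real data (the school names and values below are invented for illustration):

```python
import pandas as pd

# Made-up stand-ins: `df` is the full table, `degrees` is the subset to plot.
df = pd.DataFrame({
    "INSTNM": ["School A", "School B", "School C", "School D"],
    "PCIP01": [0.10, 0.00, 0.25, 0.05],
    "PCIP04": [0.05, 0.20, 0.00, 0.10],
})
degrees = df.iloc[:2]

# Positions built from the full table do not match the subset's heights.
bar_l_wrong = list(range(len(df["PCIP01"])))
print(len(bar_l_wrong), len(degrees["PCIP01"]))  # 4 2

# Building the positions from the same frame that supplies the heights
# keeps both arguments to ax.bar() the same length.
bar_l = list(range(len(degrees)))
print(len(bar_l) == len(degrees["PCIP01"]))  # True
```

The same idea applies to every later ax.bar call: each height should be a column of the frame that bar_l was built from.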

Anyway, you can loop through each type of degree and add another bar. To create a stacked bar you just change the bottom position of each bar.

import pandas as pd
import matplotlib.pyplot as plt
from cycler import cycler
import numpy as np

df = pd.read_csv('scorecard.csv')
# Keep the first 10 universities for testing
df = df.iloc[:10]
# All "percentage of degrees awarded" columns start with PCIP
degList = [i for i in df.columns if i[0:4]=='PCIP']
bar_l = np.arange(df.shape[0])

cm = plt.get_cmap('nipy_spectral')

f, ax = plt.subplots(1, figsize=(10,5))
# Give each degree column its own color from the colormap
ax.set_prop_cycle(cycler('color',[cm(1.*i/len(degList)) for i in range(len(degList))]))

# Running top of each stack; every new bar starts where the previous one ended
bottom = np.zeros(len(bar_l))
for deg in degList:
    ax.bar(bar_l, df[deg], bottom=bottom, label=deg)
    bottom += df[deg].values

ax.set_xticks(bar_l)
ax.set_xticklabels(df['INSTNM'].values, rotation=90, size='x-small')
# Put the legend outside the axes and leave room for it and the long labels
ax.legend(loc="upper left", bbox_to_anchor=(1,1), ncol=2, fontsize='x-small')
f.subplots_adjust(right=0.75, bottom=0.4)
plt.show()

You can modify this code to get exactly what you want (for example, it seems you want percentages rather than fractions, so just multiply each degree column by 100). For testing I took the first 10 universities, which results in this plot:

With 10 universities it is already quite a busy plot; with 100 universities it is practically unreadable:

I can guarantee that with almost 8000 universities this stacked bar plot will be completely unreadable. Maybe consider another way to represent the data?
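
For the percentage scaling mentioned earlier, here is a minimal sketch under a few assumptions: the PCIP columns hold fractions between 0 and 1, the small table below is made up to stand in for a few rows of scorecard.csv, and the names deg_list, shares and pct are only illustrative. The row rescaling is an optional extra that is not part of the code above; it forces every stack to reach exactly 100 even when the shares in a row do not sum to 1.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Made-up data standing in for a few rows of scorecard.csv (fractions, 0 to 1).
df = pd.DataFrame({
    "INSTNM": ["School A", "School B", "School C"],
    "PCIP01": [0.10, 0.00, 0.25],
    "PCIP04": [0.30, 0.50, 0.25],
    "PCIP09": [0.60, 0.50, 0.50],
})
deg_list = [c for c in df.columns if c.startswith("PCIP")]
shares = df[deg_list]

# Rescale each row to sum to 100. When a row already sums to 1 this is the
# same as multiplying by 100. Assumes no row sums to zero.
pct = shares.div(shares.sum(axis=1), axis=0) * 100

x = np.arange(len(df))
fig, ax = plt.subplots(figsize=(10, 5))
bottom = np.zeros(len(df))
for deg in deg_list:
    heights = pct[deg].to_numpy()
    ax.bar(x, heights, bottom=bottom, label=deg)
    bottom += heights  # the next segment starts where this one ended

ax.set_ylim(0, 100)
ax.set_ylabel("Percentage")
ax.set_xticks(x)
ax.set_xticklabels(df["INSTNM"], rotation=45, ha="right")
ax.legend(loc="upper left", bbox_to_anchor=(1, 1))
fig.tight_layout()
```

Call plt.show() or fig.savefig(...) afterwards to view the result. With the real data, rows containing missing values would need to be filled or dropped first, since a NaN in any column would carry through the running bottom.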
