Plotting multiindex DataFrame bar plot where color is determined by category


Question

I have a multiindex DataFrame that looks like the data below. When I plot it, each (City, Year) column gets its own bar color.

How can I plot a bar graph where the color of the bars is determined by a category of my choosing (e.g. 'City')? All bars belonging to the same city should have the same color, regardless of the year. For example, all ATL bars should be red, while all MIA bars should be blue.

City            ATL                                    MIA
Year           2010         2011         2012         2010         2011         2012
Taste
Bitter  3159.861983  3149.806667  2042.348937  3124.586470  3119.541240  1070.925695
Sour    1078.897032  3204.689424  3065.818991  2084.322056  2108.568495  3178.131540
Spicy   5280.847114  3134.597728  1015.311288  2036.494136  1001.532560  3164.382635
Sweet   1056.169267  1015.368646  4217.145165  3134.734027  4144.826118  3173.919338

Below is my code:

import sys
import os
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib
import random

matplotlib.style.use('ggplot')

def main():

    taste = ['Sweet','Spicy','Sour','Bitter']
    store = ['Asian','Italian','American','Greek','Mexican']

    df1 = pd.DataFrame({'Taste':[random.choice(taste) for x in range(10)],
                       'Store':[random.choice(store) for x in range(10)],
                       'Sold':1000+100*np.random.rand(10)})

    df2 = pd.DataFrame({'Taste':[random.choice(taste) for x in range(10)],
                       'Store':[random.choice(store) for x in range(10)],
                       'Sold':1000+100*np.random.rand(10)})

    df3 = pd.DataFrame({'Taste':[random.choice(taste) for x in range(10)],
                       'Store':[random.choice(store) for x in range(10)],
                       'Sold':1000+100*np.random.rand(10)})

    df4 = pd.DataFrame({'Taste':[random.choice(taste) for x in range(10)],
                       'Store':[random.choice(store) for x in range(10)],
                       'Sold':1000+100*np.random.rand(10)})

    df5 = pd.DataFrame({'Taste':[random.choice(taste) for x in range(10)],
                       'Store':[random.choice(store) for x in range(10)],
                       'Sold':1000+100*np.random.rand(10)})


    df6 = pd.DataFrame({'Taste':[random.choice(taste) for x in range(10)],
                       'Store':[random.choice(store) for x in range(10)],
                       'Sold':1000+100*np.random.rand(10)})



    df1['Year'] = '2010'
    df1['City'] = 'MIA'

    df2['Year'] = '2011'
    df2['City'] = 'MIA'

    df3['Year'] = '2012'
    df3['City'] = 'MIA'

    df4['Year'] = '2010'
    df4['City'] = 'ATL'

    df5['Year'] = '2011'
    df5['City'] = 'ATL'

    df6['Year'] = '2012'
    df6['City'] = 'ATL'


    DF = pd.concat([df1,df2,df3,df4,df5,df6])
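    DFG = DF.groupby(['Taste', 'Year', 'City'])
    # Sum only the numeric 'Sold' column, then move City and Year into the column levels.
    # (The level= argument of sum() is no longer available in recent pandas.)
    DFGSum = DFG['Sold'].sum().unstack(['City','Year'], fill_value=0).sort_index(axis=1)
    print(DFGSum)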

    '''
    In my plot, I want the color of the bars to be determined by the "City".
    For example: All "ATL" bar colors will be the same regardless of the year.
    '''
    DFGSum.plot(kind='bar')


    plt.show()

if __name__ == '__main__':
    main()

Recommended answer

I have found a solution to my own question. I give partial credit to @dermen, who originally answered my question. My answer was inspired by his approach.

Although @dermen's solution is correct, I wanted a method where I don't have to manually adjust the width of the bars or worry about their positions.

The solution below can be adapted to an arbitrary number of cities, each with its own set of yearly data. It is important to know that the DataFrame being plotted has multilevel columns (City, then Year). The solution may break if the DataFrame's columns are sorted differently, because the colors are assigned in the order in which the columns are plotted.

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib
import random

matplotlib.style.use('ggplot')


taste = ['Sweet','Spicy','Sour','Bitter']
store = ['Asian','Italian','American','Greek','Mexican']

df1 = pd.DataFrame({'Taste':[random.choice(taste) for x in range(10)],
                   'Store':[random.choice(store) for x in range(10)],
                   'Sold':1000+100*np.random.rand(10)})

df2 = pd.DataFrame({'Taste':[random.choice(taste) for x in range(10)],
                   'Store':[random.choice(store) for x in range(10)],
                   'Sold':1000+100*np.random.rand(10)})

df3 = pd.DataFrame({'Taste':[random.choice(taste) for x in range(10)],
                   'Store':[random.choice(store) for x in range(10)],
                   'Sold':1000+100*np.random.rand(10)})

df4 = pd.DataFrame({'Taste':[random.choice(taste) for x in range(10)],
                   'Store':[random.choice(store) for x in range(10)],
                   'Sold':1000+100*np.random.rand(10)})

df5 = pd.DataFrame({'Taste':[random.choice(taste) for x in range(10)],
                   'Store':[random.choice(store) for x in range(10)],
                   'Sold':1000+100*np.random.rand(10)})


df6 = pd.DataFrame({'Taste':[random.choice(taste) for x in range(10)],
                   'Store':[random.choice(store) for x in range(10)],
                   'Sold':1000+100*np.random.rand(10)})


df7 = pd.DataFrame({'Taste':[random.choice(taste) for x in range(10)],
                   'Store':[random.choice(store) for x in range(10)],
                   'Sold':1000+100*np.random.rand(10)})


df8 = pd.DataFrame({'Taste':[random.choice(taste) for x in range(10)],
                   'Store':[random.choice(store) for x in range(10)],
                   'Sold':1000+100*np.random.rand(10)})

df9 = pd.DataFrame({'Taste':[random.choice(taste) for x in range(10)],
                   'Store':[random.choice(store) for x in range(10)],
                   'Sold':1000+100*np.random.rand(10)})


df10 = pd.DataFrame({'Taste':[random.choice(taste) for x in range(10)],
                   'Store':[random.choice(store) for x in range(10)],
                   'Sold':1000+100*np.random.rand(10)})



df1['Year'] = '2010'
df1['City'] = 'MIA'

df2['Year'] = '2011'
df2['City'] = 'MIA'

df3['Year'] = '2012'
df3['City'] = 'MIA'

df4['Year'] = '2010'
df4['City'] = 'ATL'

df5['Year'] = '2011'
df5['City'] = 'ATL'

df6['Year'] = '2012'
df6['City'] = 'ATL'


df7['Year'] = '2013'
df7['City'] = 'ATL'

df8['Year'] = '2014'
df8['City'] = 'ATL'

df9['Year'] = '2013'
df9['City'] = 'CHI'

df10['Year'] = '2014'
df10['City'] = 'CHI'

DF = pd.concat([df1,df2,df3,df4,df5,df6,df7,df8,df9,df10])

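DFG = DF.groupby(['Taste', 'Year', 'City'])
# Sum only the numeric 'Sold' column, then move City and Year into the column levels.
# (The level= argument of sum() is no longer available in recent pandas.)
DFGSum = DFG['Sold'].sum().unstack(['City','Year'], fill_value=0).sort_index(axis=1)
# DFGSum is a DataFrame with multilevel (City, Year) columns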

import itertools
# 'axes.color_cycle' is gone from current matplotlib; read the colors from 'axes.prop_cycle' instead
color_cycle = itertools.cycle(plt.rcParams['axes.prop_cycle'].by_key()['color'])

plot_colors = []  # Sequence of colors, one per plotted column

for city in DFGSum.columns.get_level_values('City').unique():
    set_color = next(color_cycle)  # Set the color for the city
    for year in DFGSum[city].columns.get_level_values('Year').unique():
        # For each unique city, all the yearly data belonging to that city gets the same color
        plot_colors.append(set_color)

DFGSum.plot(kind='bar', color=plot_colors)
# The color parameter of the plot function accepts a list of colors, one per column

plt.show()
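
The ordering caveat mentioned above can be avoided by looking up each column's color from its own 'City' label instead of relying on the columns being grouped by city. The following is only a sketch of that idea, not part of the original answer: the helper `colors_by_level`, the small `frame` standing in for DFGSum, and the three-color palette are all made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for DFGSum. The columns are deliberately NOT grouped by city.
columns = pd.MultiIndex.from_tuples(
    [('MIA', '2010'), ('ATL', '2010'), ('MIA', '2011'), ('ATL', '2011'), ('CHI', '2013')],
    names=['City', 'Year'])
rng = np.random.default_rng(0)
frame = pd.DataFrame(1000 + 100 * rng.random((4, len(columns))),
                     index=pd.Index(['Bitter', 'Sour', 'Spicy', 'Sweet'], name='Taste'),
                     columns=columns)

def colors_by_level(df, level, palette):
    """Return one color per column, chosen by the column's label on `level`."""
    labels = df.columns.get_level_values(level)
    # Give each distinct label a palette entry, wrapping around if the palette is short.
    lookup = {label: palette[i % len(palette)]
              for i, label in enumerate(labels.unique())}
    return [lookup[label] for label in labels]

palette = ['red', 'blue', 'green']  # illustrative; any matplotlib color names would do
plot_colors = colors_by_level(frame, 'City', palette)
print(plot_colors)
```

The resulting list can be passed to the plot call in the same way as in the answer, e.g. `frame.plot(kind='bar', color=plot_colors)`. Because every column is looked up by its own label, the colors stay consistent per city even if the columns are later reordered.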
