pandas / matplotlib: faceting bar plots


Question

    I am making a series of bar plots from data with two categorical variables and one numeric variable. What I have is shown below, but what I would love to do is facet by one of the categorical variables, as with facet_wrap in ggplot. I have a somewhat working example, but I get the wrong plot type (lines, not bars), and I subset the data inside a loop, which can't be the best way.

    ## first try: plain vanilla
    import pandas as pd
    import numpy as np
    N = 100

    ## generate toy data
    ind = np.random.choice(['a', 'b', 'c'], N)
    cty = np.random.choice(['x', 'y', 'z'], N)
    jobs = np.random.randint(low=1, high=250, size=N)

    ## prep data frame: total jobs per city (rows) and industry (columns)
    df_city = pd.DataFrame({'industry': ind, 'city': cty, 'jobs': jobs})
    df_city_grouped = df_city.groupby(['city', 'industry']).jobs.sum().unstack()
    df_city_grouped.plot(kind='bar', stacked=True, figsize=(9, 6))
    

    This gives a single stacked bar chart. The underlying data frame looks something like this:

      city industry  jobs
    0    z        b   180
    1    z        c   121
    2    x        a    33
    3    z        a   121
    4    z        c   236
    

    However, what I would like to see is something like this:

    ## R code
    library(plyr)
    library(ggplot2)
    df_city <- read.csv('/home/aksel/Downloads/mockcity.csv', sep='\t')
    
    ## summarize
    df_city_grouped <- ddply(df_city, .(city,industry), summarise, jobstot = sum(jobs))
    
    ## plot
    ggplot(df_city_grouped, aes(x=industry, y=jobstot)) +
      geom_bar(stat='identity') +
      facet_wrap(~city)
    

    The closest I get with matplotlib is something like this:

    import matplotlib.pyplot as plt

    cols = df_city.city.value_counts().shape[0]
    fig, axes = plt.subplots(1, cols, figsize=(8, 8))

    for x, city in enumerate(df_city.city.value_counts().index.values):
        # subset the rows for this city, then total the jobs per industry
        data = df_city[(df_city['city'] == city)]
        data = data.groupby(['industry']).jobs.sum()
        axes[x].plot(data)  # draws lines, not bars
    

    So two questions:

    1. Can I draw bar plots (the code above draws lines) using the AxesSubplot objects and end up with something along the lines of the facet_wrap example from ggplot?
    2. In loops that generate charts, such as this attempt, I subset the data on each iteration. I can't imagine that is the 'proper' way to do this type of faceting?

    Answer

    See the second example here: http://pandas-docs.github.io/pandas-docs-travis/visualization.html#bar-plots

    Anyway, you can always do that by hand, as you did yourself.
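
One way to do it by hand is sketched below. It is only an illustration, not the answerer's code: the seeded toy data, the `totals` name and the output file name are assumptions made for the example. The data is summarised once with `groupby(...).unstack()`, and each column is then drawn with `kind="bar"` on its own axes, so there is no per-city subsetting inside the loop.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Toy data shaped like the question's df_city (seeded so reruns match).
rng = np.random.RandomState(0)
N = 100
df_city = pd.DataFrame({
    "industry": rng.choice(["a", "b", "c"], N),
    "city": rng.choice(["x", "y", "z"], N),
    "jobs": rng.randint(low=1, high=250, size=N),
})

# Summarise once: rows are industries, columns are cities.
totals = (
    df_city.groupby(["industry", "city"])["jobs"]
    .sum()
    .unstack("city")
    .fillna(0)
)

# One panel per city on a shared y axis, similar to facet_wrap(~city).
fig, axes = plt.subplots(
    1, totals.shape[1], figsize=(9, 3), sharey=True, squeeze=False
)
for ax, city in zip(axes.ravel(), totals.columns):
    # kind="bar" on a Series draws bars on the given axes;
    # calling ax.plot(...) directly is what produced lines in the question.
    totals[city].plot(kind="bar", ax=ax, title=str(city))
    ax.set_xlabel("industry")
axes[0, 0].set_ylabel("jobs")
fig.tight_layout()
fig.savefig("facets_by_city.png")  # hypothetical output file name
```

Passing `ax=` to the pandas plotting call is what lets each panel reuse an axes created by `plt.subplots`.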

    EDIT: BTW, you can always use rpy2 from Python, so you can do all the same things as in R.

    Also, have a look at this: http://pandas.pydata.org/pandas-docs/stable/rplot.html. I am not sure, but it should help with creating plots over many panels, though it might require further reading.
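
Independently of rplot, a multi-panel bar chart can probably also be had from pandas' own plotting call. The sketch below assumes the same kind of toy data as the question (regenerated here with a fixed seed so the block runs on its own) and uses `DataFrame.plot` with `subplots=True`, which draws one panel per column. The `totals` and `panels` names and the output file name are made up for the example.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import numpy as np
import pandas as pd

# Toy data shaped like the question's df_city (seeded so reruns match).
rng = np.random.RandomState(0)
N = 100
df_city = pd.DataFrame({
    "industry": rng.choice(["a", "b", "c"], N),
    "city": rng.choice(["x", "y", "z"], N),
    "jobs": rng.randint(low=1, high=250, size=N),
})

# Wide table: one row per industry, one column per city, summed jobs.
totals = df_city.pivot_table(
    index="industry", columns="city", values="jobs",
    aggfunc="sum", fill_value=0,
)

# subplots=True gives every column (city) its own panel;
# layout arranges the panels in a single row.
axes = totals.plot(
    kind="bar", subplots=True, layout=(1, totals.shape[1]),
    sharey=True, legend=False, figsize=(9, 3),
)
panels = np.asarray(axes).ravel()
panels[0].set_ylabel("jobs")
panels[0].get_figure().savefig("facets_subplots.png")  # hypothetical file name
```

This keeps the summarising step and the plotting step separate, at the cost of less control over each panel than the explicit loop gives.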
