'Could not interpret input' error with Seaborn when plotting groupbys

Question

Say I have this dataframe:

import pandas as pd

d = {'Path'   : ['abc', 'abc', 'ghi', 'ghi', 'jkl', 'jkl'],
     'Detail' : ['foo', 'bar', 'bar', 'foo', 'foo', 'foo'],
     'Program': ['prog1', 'prog1', 'prog1', 'prog2', 'prog3', 'prog3'],
     'Value'  : [30, 20, 10, 40, 40, 50],
     'Field'  : [50, 70, 10, 20, 30, 30]}

df = pd.DataFrame(d)
df.set_index(['Path', 'Detail'], inplace=True)
df

               Field Program  Value
Path Detail                      
abc  foo        50   prog1     30
     bar        70   prog1     20
ghi  bar        10   prog1     10
     foo        20   prog2     40
jkl  foo        30   prog3     40
     foo        30   prog3     50

I can aggregate it with no problem (by the way, if there's a better way to do this, I'd like to know!):

df_count = df.groupby('Program').count().sort_values('Value', ascending=False)[['Value']]
df_count

Program   Value
prog1    3
prog3    2
prog2    1

df_mean = df.groupby('Program').mean().sort_values('Value', ascending=False)[['Value']]
df_mean

Program  Value
prog3    45
prog2    40
prog1    20

I can also plot it from pandas without any problem:

df_mean.plot(kind='bar')

But why do I get this error when I try it in seaborn?

sns.factorplot('Program',data=df_mean)
    ---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-26-23c2921627ec> in <module>()
----> 1 sns.factorplot('Program',data=df_mean)

C:\Anaconda3\lib\site-packages\seaborn\categorical.py in factorplot(x, y, hue, data, row, col, col_wrap, estimator, ci, n_boot, units, order, hue_order, row_order, col_order, kind, size, aspect, orient, color, palette, legend, legend_out, sharex, sharey, margin_titles, facet_kws, **kwargs)
   2673     # facets to ensure representation of all data in the final plot
   2674     p = _CategoricalPlotter()
-> 2675     p.establish_variables(x_, y_, hue, data, orient, order, hue_order)
   2676     order = p.group_names
   2677     hue_order = p.hue_names

C:\Anaconda3\lib\site-packages\seaborn\categorical.py in establish_variables(self, x, y, hue, data, orient, order, hue_order, units)
    143                 if isinstance(input, string_types):
    144                     err = "Could not interperet input '{}'".format(input)
--> 145                     raise ValueError(err)
    146 
    147             # Figure out the plotting orientation

ValueError: Could not interperet input 'Program'

Recommended answer

The reason for the exception is that Program becomes the index of the dataframes df_mean and df_count after your groupby operation, so it is no longer a column that seaborn can look up by name.
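A quick way to confirm this diagnosis is to check where Program ends up after the groupby. The sketch below is only illustrative: it rebuilds a trimmed-down version of the question's data (just the Program and Value columns, which is a simplification) and inspects the aggregated frame.

```python
import pandas as pd

# Simplified copy of the question's data: only the two columns that matter here.
df = pd.DataFrame({
    'Program': ['prog1', 'prog1', 'prog1', 'prog2', 'prog3', 'prog3'],
    'Value': [30, 20, 10, 40, 40, 50],
})

# With the default as_index=True, the grouping key moves into the index.
df_mean = df.groupby('Program').mean()

print(df_mean.index.name)            # Program
print(list(df_mean.columns))         # ['Value']
print('Program' in df_mean.columns)  # False
```

Since Program is not among the columns, a plotting call that looks it up by column name has nothing to find.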

If you want to get the factorplot from df_mean, an easy solution is to add the index as a column:

In [7]:

df_mean['Program'] = df_mean.index

In [8]:

%matplotlib inline
import seaborn as sns
sns.factorplot(x='Program', y='Value', data=df_mean)

However, you could even more simply let factorplot do the calculations for you:

sns.factorplot(x='Program', y='Value', data=df)

You'll obtain the same result. Hope it helps.

Edit after comments

Indeed, you make a very good point about the as_index parameter. By default it is set to True, in which case Program becomes the index, as in your question.

In [14]:

df_mean = df.groupby('Program', as_index=True).mean().sort_values('Value', ascending=False)[['Value']]
df_mean

Out[14]:
        Value
Program 
prog3   45
prog2   40
prog1   20

Just to be clear, this way Program is no longer a column; it becomes the index. The trick df_mean['Program'] = df_mean.index keeps the index as it is and adds a new column holding the index values, so Program is now duplicated.

In [15]:

df_mean['Program'] = df_mean.index
df_mean

Out[15]:
        Value   Program
Program     
prog3   45  prog3
prog2   40  prog2
prog1   20  prog1

However, if you set as_index to False, you get Program as a column, plus a new auto-incrementing integer index:

In [16]:

df_mean = df.groupby('Program', as_index=False).mean().sort_values('Value', ascending=False)[['Program', 'Value']]
df_mean

Out[16]:
    Program Value
2   prog3   45
1   prog2   40
0   prog1   20

This way you could feed it directly to seaborn. Still, you could use df and get the same result.
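As a side note, reset_index() is another way to turn the index back into an ordinary column after the aggregation has already been done. The sketch below compares it with as_index=False on a trimmed-down copy of the question's data (only Program and Value, a simplification). The commented-out plotting call is an assumption about your installed version: in seaborn 0.9 and later, factorplot was renamed catplot.

```python
import pandas as pd

# Simplified copy of the question's data: only the two columns that matter here.
df = pd.DataFrame({
    'Program': ['prog1', 'prog1', 'prog1', 'prog2', 'prog3', 'prog3'],
    'Value': [30, 20, 10, 40, 40, 50],
})

# Option 1: keep Program as a column from the start.
by_flag = (df.groupby('Program', as_index=False)
             .mean()
             .sort_values('Value', ascending=False))

# Option 2: aggregate with Program as the index, then move it back to a column.
by_reset = (df.groupby('Program')
              .mean()
              .sort_values('Value', ascending=False)
              .reset_index())

print(list(by_flag.columns))   # ['Program', 'Value']
print(list(by_reset.columns))  # ['Program', 'Value']

# Either frame now has a Program column that a plotting call can reference, e.g.
# (assuming seaborn >= 0.9 is installed and imported as sns):
# sns.catplot(x='Program', y='Value', data=by_reset, kind='point')
```

The two results differ only in their row labels: by_flag keeps the original group positions (2, 1, 0 after sorting), while by_reset is renumbered from 0.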

Hope it helps.
