列的MultiLevel索引:将value_counts作为pandas中的列 [英] MultiLevel index to columns : getting value_counts as columns in pandas

查看:118
本文介绍了列的MultiLevel索引:将value_counts作为pandas中的列的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

从一般意义上讲,我要解决的问题是将多级索引的一个组件更改为列。也就是说,我有一个 Series ,它包含一个多级索引,我希望将最低级别的索引更改为 dataframe 。以下是我试图解决的实际示例问题,

In a very general sense, the problem I am looking to solve is changing one component of a multi-level index into columns. That is, I have a Series that contains a multilevel index and I want the lowest level of the index changed into columns in a dataframe. Here is the actual example problem I'm trying to solve,

这里我们可以生成一些示例数据:

Here we can generate some sample data:

foo_choices = ["saul", "walter", "jessee"]
bar_choices = ["alpha", "beta", "foxtrot", "gamma", "hotel", "yankee"]

df = DataFrame([{"foo":random.choice(foo_choices), 
                 "bar":random.choice(bar_choices)} for _ in range(20)])
df.head()

这给了我们,

     bar     foo
0    beta    jessee
1    gamma   jessee
2    hotel   saul
3    yankee  walter
4    yankee  jessee
...

现在,我可以分组 bar 并获取 foo 字段的value_counts,

Now, I can groupby bar and get value_counts of the foo field,

dfgb = df.groupby('foo')
dfgb['bar'].value_counts()

并输出,

foo            
jessee  hotel      4
        gamma      2
        yankee     1
saul    foxtrot    3
        hotel      2
        gamma      1
        alpha      1
walter  hotel      2
        gamma      2
        foxtrot    1
        beta       1

但我想要的是,

          hotel    beta    foxtrot    alpha    gamma    yankee
foo                        
jessee     1       1       5          4        1        1
saul       0       3       0          0        1        0
walter     1       0       0          1        1        0

我的解决方案是编写以下内容:

My solution was to write the following bit:

for v in df['bar'].unique():
    if v is np.nan: continue
    df[v] = np.nan
    df.ix[df['bar'] == v, v] = 1

dfgb = df.groupby('foo')
dfgb.count()[df['bar'].unique()]


推荐答案

我想你想要:

dfgb['bar'].value_counts().unstack().fillna(0.)

这篇关于列的MultiLevel索引:将value_counts作为pandas中的列的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆