MultiLevel index to columns: getting value_counts as columns in pandas
Problem description
In a very general sense, the problem I am looking to solve is changing one component of a multi-level index into columns. That is, I have a Series that contains a multi-level index, and I want the lowest level of the index changed into columns in a DataFrame. Here is the actual example problem I'm trying to solve.
Here we can generate some sample data:
import random
from pandas import DataFrame

foo_choices = ["saul", "walter", "jessee"]
bar_choices = ["alpha", "beta", "foxtrot", "gamma", "hotel", "yankee"]
# 20 rows, each pairing a random foo with a random bar
df = DataFrame([{"foo": random.choice(foo_choices),
                 "bar": random.choice(bar_choices)} for _ in range(20)])
df.head()
This gives us,
bar foo
0 beta jessee
1 gamma jessee
2 hotel saul
3 yankee walter
4 yankee jessee
...
Now, I can group by foo and get the value_counts of the bar field,
dfgb = df.groupby('foo')
dfgb['bar'].value_counts()
which outputs,
foo
jessee  hotel      4
        gamma      2
        yankee     1
saul    foxtrot    3
        hotel      2
        gamma      1
        alpha      1
walter  hotel      2
        gamma      2
        foxtrot    1
        beta       1
But what I want is,
        hotel  beta  foxtrot  alpha  gamma  yankee
foo
jessee      1     1        5      4      1       1
saul        0     3        0      0      1       0
walter      1     0        0      1      1       0
My solution was to write the following bit:
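import numpy as np
import pandas as pd

for v in df['bar'].unique():
    if pd.isnull(v):  # skip missing values ("is np.nan" is not a reliable test)
        continue
    df[v] = np.nan
    # .ix has been removed from pandas; .loc is the label-based equivalent
    df.loc[df['bar'] == v, v] = 1
dfgb = df.groupby('foo')
dfgb.count()[df['bar'].unique()]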
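The loop above amounts to building one indicator column per distinct bar value and then counting within each foo group. A more compact sketch of that same idea follows. It is not the asker's code: it assumes pd.get_dummies is acceptable for building the indicators, and the small frame here is made up purely for illustration.

```python
import pandas as pd

# Hypothetical toy data, not the random sample from the question
df = pd.DataFrame({
    "foo": ["saul", "saul", "walter", "jessee", "jessee", "jessee"],
    "bar": ["hotel", "gamma", "hotel", "beta", "beta", "yankee"],
})

# One 0/1 indicator column per distinct bar value
indicators = pd.get_dummies(df["bar"], dtype=int)

# Summing the indicators within each foo group gives the per-group counts
counts = indicators.groupby(df["foo"]).sum()
print(counts)
```

Unlike the loop, this leaves df itself unmodified and fills absent combinations with 0 rather than relying on count() skipping NaN.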
Recommended answer
I think you want:
dfgb['bar'].value_counts().unstack().fillna(0.)
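For reference, here is a self-contained sketch of that one-liner run end to end on data generated the same way as in the question. The fixed seed and the variable names counts, wide and check are my own additions. It passes fill_value=0 to unstack, which fills the missing combinations in the same step and keeps the counts as integers, instead of calling fillna(0.) afterwards. The pd.crosstab call is only there as a cross-check, since it builds the same table directly from the two columns.

```python
import random

import pandas as pd

random.seed(0)  # fixed seed (an assumption) so the sketch is reproducible
foo_choices = ["saul", "walter", "jessee"]
bar_choices = ["alpha", "beta", "foxtrot", "gamma", "hotel", "yankee"]
df = pd.DataFrame(
    [{"foo": random.choice(foo_choices), "bar": random.choice(bar_choices)}
     for _ in range(20)]
)

# Series with a two-level index (foo, bar) holding the counts
counts = df.groupby("foo")["bar"].value_counts()

# Move the innermost index level (bar) into columns; absent pairs become 0
wide = counts.unstack(fill_value=0)

# Cross-check: crosstab tabulates foo against bar directly
check = pd.crosstab(df["foo"], df["bar"])

print(wide)
```

unstack() moves the last index level by default; for a Series with more levels, a specific one can be selected with the level argument, for example counts.unstack(level="bar").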