Concise way of flattening MultiIndex columns
Question
Using more than one function in a groupby-aggregate produces MultiIndex columns, which I then want to flatten.
Example:
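import numpy as np
import pandas as pd

df = pd.DataFrame(
    {'A': [1,1,1,2,2,2,3,3,3],
     'B': np.random.random(9),
     'C': np.random.random(9)}
)
# Two aggregations on B and one on C yield two-level (MultiIndex) columns
out = df.groupby('A').agg({'B': ['mean', 'std'], 'C': 'median'})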
# example output
          B                   C
       mean       std    median
A
1  0.791846  0.091657  0.394167
2  0.156290  0.202142  0.453871
3  0.482282  0.382391  0.892514
Currently, I flatten the columns manually like this:
out.columns = ['B_mean', 'B_std', 'C_median']
This gives me the result I want:
     B_mean     B_std  C_median
A
1  0.791846  0.091657  0.394167
2  0.156290  0.202142  0.453871
3  0.482282  0.382391  0.892514
But I'm looking for a way to automate this, because renaming the columns by hand is monotonous, time-consuming, and makes it easy to introduce typos.
Is there a way to return a flattened index instead of a MultiIndex when doing a groupby-aggregate?
I need to flatten the columns so I can save the result to a text file, which will then be read by a different program that doesn't handle multi-indexed columns.
Recommended answer
You can map '_'.join over the columns:
out.columns = out.columns.map('_'.join)
out
Out[23]:
     B_mean     B_std  C_median
A
1  0.204825  0.169408  0.926347
2  0.362184  0.404272  0.224119
3  0.533502  0.380614  0.218105
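For anyone who wants to try this end to end, here is a minimal self-contained sketch of the same idea. The seeded generator and the string aggregation names ('mean', 'std', 'median') are assumptions made for repeatability and are not part of the original answer.

```python
import numpy as np
import pandas as pd

# Seeded generator: an assumption, used only so repeated runs give the same data.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "A": [1, 1, 1, 2, 2, 2, 3, 3, 3],
    "B": rng.random(9),
    "C": rng.random(9),
})

out = df.groupby("A").agg({"B": ["mean", "std"], "C": "median"})

# Before flattening, each column label is a tuple such as ("B", "mean").
print(list(out.columns))

# Index.map calls "_".join on every tuple, producing plain string labels.
out.columns = out.columns.map("_".join)
print(list(out.columns))
```

After the assignment the frame has single-level columns, so it should write to a text file with one plain header row.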
When the column labels contain ints, '_'.join fails because str.join only accepts strings, so in that case I prefer this way, which formats each level instead:
out.columns.map('{0[0]}_{0[1]}'.format)
Out[27]: Index(['B_mean', 'B_std', 'C_median'], dtype='object')
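The difference between the two approaches only shows up when a column level holds non-string labels. The sketch below uses a small hypothetical frame (the names `wide` and `cols` and its values are made up for illustration) whose second level is made of ints.

```python
import pandas as pd

# Hypothetical columns: the second level holds ints instead of strings.
cols = pd.MultiIndex.from_tuples([("B", 1), ("B", 2), ("C", 1)])
wide = pd.DataFrame([[0.1, 0.2, 0.3]], columns=cols)

join_error = None
try:
    # str.join rejects the int in each tuple, so this is expected to fail.
    wide.columns.map("_".join)
except TypeError as exc:
    join_error = exc

# str.format converts every level to text, so mixed label types are fine.
wide.columns = wide.columns.map("{0[0]}_{0[1]}".format)
print(list(wide.columns))
```

Note that the format string is written for exactly two levels; a frame with more levels would need a longer template.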