Pandas report top-n in group and pivot

Views: 95 Published: 2020/5/24 2:38:35 Tags: python, pandas, pivot-table, top-n


Problem description

I am trying to summarise a DataFrame by grouping along a single dimension d1 and reporting summary statistics for each element of d1. In particular, I am interested in the top n (index and values) for a number of metrics. What I would like to produce is one row for each element of d1.

Say I have two dimensions, d1 and d2, and four metrics, m1, m2, m3 and m4.

1) What is the suggested way of grouping by d1 and finding the top n d2 values, with the corresponding metric value, for each of the metrics m1 to m4?

In Wes's book Python for Data Analysis (page 35) he suggests:

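# `names` is the baby-names DataFrame used in the book's example.
def get_top1000(group):
    # Quoted as in the book. Note: sort_index(by=...) was deprecated and later
    # removed from pandas; sort_values(by=...) is the current equivalent.
    return group.sort_index(by='births', ascending=False)[:1000]

grouped = names.groupby(['year', 'sex'])
top1000 = grouped.apply(get_top1000)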

Is that still the recommended way? (I am only interested in, say, the top 5 d2 out of thousands, and for multiple metrics.)

2) The next problem is that I want to pivot the top 5, so that I have one row for each element of d1.
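For the first part of the question, one possible sketch in current pandas is to sort by the metric and then take the first n rows of each d1 group with `groupby(...).head(n)`, repeating this per metric. The toy data, the helper name `top_n` and the choice of n=2 are assumptions made for illustration only; this is not necessarily the approach the book or an accepted answer would recommend.

```python
import pandas as pd

# Hypothetical toy data: d1/d2 are the dimensions, m1/m2 stand in for the metrics.
df = pd.DataFrame({
    "d1": ["a", "a", "a", "b", "b"],
    "d2": ["x", "y", "z", "x", "y"],
    "m1": [3, 9, 5, 7, 1],
    "m2": [10, 20, 30, 40, 50],
})

def top_n(frame, metric, n=5):
    """Return the n rows with the largest `metric` inside each d1 group."""
    # Sort so that, within each d1, the largest metric values come first.
    ordered = frame.sort_values(["d1", metric], ascending=[True, False])
    # head(n) keeps at most n rows per group, so short groups are handled too.
    return ordered.groupby("d1").head(n)[["d1", "d2", metric]]

# One top-n table per metric (illustrative helper, not a pandas API).
top_by_metric = {m: top_n(df, m, n=2) for m in ["m1", "m2"]}
print(top_by_metric["m1"])
```

Because `head(n)` returns fewer rows when a group is smaller than n, no special casing is needed for d1 values with few d2 entries.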

So for dimensions d1, d2 and metric m1, the resulting DataFrame should look like this: d1 as the index, with columns for the top 5 values of d2 and the corresponding values of m1.

d1 d2-1 d2-2 d2-3 d2-4 d2-5 m1-1 m1-2 m1-3 m1-4 m1-5

....

So to pivot I have to create a ranking along d2 (i.e. 1 to 5; this is my columns field). This would be easy if I always had 5 entries, but occasionally there are fewer than 5 elements of d2 for a given value of d1.

Could someone suggest how to add a ranking to the grouping, so that I have the correct column index to perform the pivot?
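One way the ranking and pivot could be sketched is with `groupby(...).cumcount()`, which numbers rows within each group after sorting, followed by `DataFrame.pivot`. The toy data, n=2 and the final column naming (`d2-1`, `m1-1`, ...) are assumptions chosen to mirror the layout described above; groups with fewer than n entries simply end up with missing values in the higher-rank columns.

```python
import pandas as pd

# Hypothetical toy data; group "b" deliberately has fewer than n entries.
df = pd.DataFrame({
    "d1": ["a", "a", "a", "b"],
    "d2": ["x", "y", "z", "x"],
    "m1": [3, 9, 5, 7],
})
n = 2  # the question uses 5; 2 keeps the example small

# Rank rows within each d1 by descending m1: 1 = largest.
ordered = df.sort_values(["d1", "m1"], ascending=[True, False])
ordered["rank"] = ordered.groupby("d1").cumcount() + 1
top = ordered[ordered["rank"] <= n]

# One row per d1, one column per (field, rank) pair.
wide = top.pivot(index="d1", columns="rank", values=["d2", "m1"])

# Make sure every rank column exists even if no group reaches rank n.
all_columns = pd.MultiIndex.from_product([["d2", "m1"], range(1, n + 1)])
wide = wide.reindex(columns=all_columns)

# Flatten to names such as d2-1, d2-2, m1-1, m1-2.
wide.columns = [f"{name}-{rank}" for name, rank in wide.columns]
print(wide)
```

`cumcount` assigns ranks by position rather than by value, so ties still receive distinct ranks, which keeps the pivot's column index unique within each d1.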
