Pandas: get group-minima together with corresponding index value


Problem description

As a simple example, consider the following pandas dataframe:

import pandas as pd

headers = ["city", "year", "births", "deaths", "immigrations", "emigrations"]
data = [
    ["Gotham", 2016, 1616, 1020, 1541, 1893],
    ["Gotham", 2015, 1785, 1708, 1604, 1776],
    ["Gotham", 2014, 1279, 1946, 1991, 1169],
    ["Gotham", 2013, 1442, 1932, 1960, 1580],
    ["Metropolis", 2016, 6405, 6393, 5390, 6797],
    ["Metropolis", 2015, 6017, 5492, 5647, 6994],
    ["Metropolis", 2014, 6644, 6893, 6759, 5149],
    ["Metropolis", 2013, 6902, 6160, 5294, 5112],
    ["Smallville", 2016, 43, 10, 29, 48],
    ["Smallville", 2015, 16, 21, 17, 19],
    ["Smallville", 2014, 20, 31, 28, 43],
    ["Smallville", 2013, 46, 11, 25, 25],
]

df = pd.DataFrame(data, columns=headers)
df.set_index(["city", "year"], inplace=True)

which looks like this in console output:

                 births  deaths  immigrations  emigrations
city       year
Gotham     2016    1616    1020          1541         1893
           2015    1785    1708          1604         1776
           2014    1279    1946          1991         1169
           2013    1442    1932          1960         1580
Metropolis 2016    6405    6393          5390         6797
           2015    6017    5492          5647         6994
           2014    6644    6893          6759         5149
           2013    6902    6160          5294         5112
Smallville 2016      43      10            29           48
           2015      16      21            17           19
           2014      20      31            28           43
           2013      46      11            25           25

Problem

For each data column I'd like to know the per-city minimum, together with the year in which it occurred. Basically, I'm trying to obtain a result dataframe that looks like this:

            births       deaths       immigrations       emigrations
               min  year    min  year          min  year         min  year
city
Gotham        1279  2014   1020  2016         1541  2016        1169  2014
Metropolis    6017  2015   5492  2015         5294  2013        5112  2013
Smallville      16  2015     10  2016           17  2015          19  2015

Tried thus far

I was able to get the per-city minimum values as follows:

df.groupby(level="city").min()

However, after that I'm stuck. I haven't been able to find a way to also get the years corresponding to the minimum values. Does anyone here have a good idea for solving this?

Recommended answer

In [180]: df.reset_index(level=0).groupby('city').agg(['min','idxmin','max','idxmax'])
Out[180]:
           births                     deaths                     immigrations  \
              min idxmin   max idxmax    min idxmin   max idxmax          min
city
Gotham       1279   2014  1785   2015   1020   2016  1946   2014         1541
Metropolis   6017   2015  6902   2013   5492   2015  6893   2014         5294
Smallville     16   2015    46   2013     10   2016    31   2014           17

                               emigrations
           idxmin   max idxmax         min idxmin   max idxmax
city
Gotham       2016  1991   2014        1169   2014  1893   2016
Metropolis   2013  6759   2014        5112   2013  6994   2015
Smallville   2015    29   2016          19   2015    48   2016
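
The aggregation above also returns the maxima, which the question did not ask for. Below is a minimal sketch of trimming it down to the requested min/year layout. It assumes a pandas version in which `DataFrame.rename` accepts a `level` argument, and it uses a four-row, two-column subset of the question's data only to keep the example short. The variable name `result` is illustrative.

```python
import pandas as pd

# Small subset of the question's data, rebuilt here so the sketch runs on its own.
df = pd.DataFrame(
    [
        ["Gotham", 2016, 1616, 1020],
        ["Gotham", 2014, 1279, 1946],
        ["Smallville", 2016, 43, 10],
        ["Smallville", 2015, 16, 21],
    ],
    columns=["city", "year", "births", "deaths"],
).set_index(["city", "year"])

# Move "city" out of the index so that only "year" remains as the row label.
# idxmin then reports the year in which each group's minimum occurs.
result = (
    df.reset_index(level="city")
    .groupby("city")
    .agg(["min", "idxmin"])
    # Relabel the second column level so it reads "year" instead of "idxmin".
    .rename(columns={"idxmin": "year"}, level=1)
)
print(result)
```

The key step is the same as in the answer: once `city` is an ordinary column, the remaining index is just `year`, so `idxmin` yields a plain year rather than a `(city, year)` tuple. If several years tie for the minimum, `idxmin` reports only the first one it encounters.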
