Using Pandas to Find Minimum Values of Grouped Rows

Question

This might be a trivial question but I'm still trying to figure out pandas/numpy.

So, suppose I have a table with the following structure:

group_id | col1 | col2 | col3 |  "A"   |  "B"
   x     |   1  |   2  |  3   |  NaN   |   1
   x     |   3  |   2  |  3   |   1    |   1 
   x     |   4  |   2  |  3   |   2    |   1
   y     |   1  |   2  |  3   |  NaN   |   3 
   y     |   3  |   2  |  3   |   3    |   3 
   z     |   3  |   2  |  3   |   10   |   2
   z     |   2  |   2  |  3   |   6    |   2
   z     |   4  |   2  |  3   |   4    |   2
   z     |   4  |   2  |  3   |   2    |   2

Note that there is a group_id that groups elements in each row. So at the beginning, I have the values for columns group_id and col1-col3.

Then for each row, if col1, col2, or col3 equals 1, "A" is NaN; otherwise the value is based on a formula (irrelevant here, so I put some numbers in place).

That part I know how to do:

# np is numpy; value is a placeholder for the result of the formula
df["A"] = np.where((df["col1"] == 1) | (df["col2"] == 1) | (df["col3"] == 1), np.nan, value)
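
For reference, here is a minimal runnable sketch of that step. The three-row frame and the formula behind `value` are hypothetical stand-ins, since the question does not give the real formula; only the NaN condition comes from the description above.

```python
import numpy as np
import pandas as pd

# Hypothetical three-row sample using the column names from the question.
df = pd.DataFrame({"col1": [1, 3, 4], "col2": [2, 2, 2], "col3": [3, 3, 3]})

# Hypothetical stand-in for the formula, which the question leaves unspecified.
value = df["col1"] + df["col2"] + df["col3"]

# True for rows where at least one of the three columns equals 1.
has_one = (df[["col1", "col2", "col3"]] == 1).any(axis=1)

# NaN where the condition holds, the formula result everywhere else.
df["A"] = np.where(has_one, np.nan, value)
```

Building the mask with `.any(axis=1)` is equivalent to chaining the three comparisons with `|`, and it avoids the parenthesis bookkeeping.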

But for column "B", I need to fill it in with the minimum of values from column A for a specific group.

So, for example, "B" is equal to 1 for all rows in group "x" because the minimum value in column A across the group "x" rows is 1.

Similarly, for rows in group "y" the minimum value is 3, and for group "z" the minimum value is 2. How exactly do I do that using pandas? It confuses me a little more because the number of rows in a group can vary.

If they were all the same size I could just say fill it with the minimum of values in a pre-set range.

I hope that made sense; please let me know if I should provide a clearer example or clarify anything!

Recommended Answer

To get the minimum of column A for each group, use transform:

df.groupby('group_id')['A'].transform('min')
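
As a sketch of how this fills column "B", the example below rebuilds the table from the question (with the "A" values as given there) and assigns the result of `transform` back to the frame. The assignment to `df["B"]` is an assumption about how the asker wants to store the result; the answer itself only shows the expression.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question's table (column "A" as given there).
df = pd.DataFrame({
    "group_id": ["x", "x", "x", "y", "y", "z", "z", "z", "z"],
    "col1": [1, 3, 4, 1, 3, 3, 2, 4, 4],
    "col2": [2] * 9,
    "col3": [3] * 9,
    "A": [np.nan, 1, 2, np.nan, 3, 10, 6, 4, 2],
})

# transform("min") returns a Series aligned with df's index, so every row
# receives the minimum of its own group. NaN values are skipped by default.
df["B"] = df.groupby("group_id")["A"].transform("min")

print(df[["group_id", "A", "B"]])
```

Because the result has one value per original row, groups of different sizes need no special handling. A group whose "A" values are all NaN would get NaN in "B".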
