Using Pandas to Find Minimum Values of Grouped Rows
Problem description
This might be a trivial question, but I'm still trying to figure out pandas/numpy.
So, suppose I have a table with the following structure:
group_id | col1 | col2 | col3 | "A" | "B"
x        | 1    | 2    | 3    | NaN | 1
x        | 3    | 2    | 3    | 1   | 1
x        | 4    | 2    | 3    | 2   | 1
y        | 1    | 2    | 3    | NaN | 3
y        | 3    | 2    | 3    | 3   | 3
z        | 3    | 2    | 3    | 10  | 2
z        | 2    | 2    | 3    | 6   | 2
z        | 4    | 2    | 3    | 4   | 2
z        | 4    | 2    | 3    | 2   | 2
Note that group_id assigns each row to a group. At the start, I only have values for the group_id and col1-col3 columns.
Then, for each row, if col1, col2, or col3 equals 1, "A" is NaN; otherwise its value comes from a formula (irrelevant here, so I filled in some placeholder numbers).
That part I know how to do:
df["A"] = np.where((df['col1'] == 1) | (df['col2'] == 1) | (df['col3'] == 1), np.nan, value)
But for column "B", I need to fill in the minimum of the values in column "A" within each row's group.
So, for example, "B" is equal to 1 for all rows in group "x", because the minimum value in column "A" across the group "x" rows is 1.
Similarly, for rows in group "y" the minimum value is 3, and for group "z" the minimum value is 2. How exactly do I do that using pandas? It confuses me a little more because the number of rows in a group can vary.
If the groups were all the same size, I could just fill it with the minimum over a pre-set range of rows.
I hope that made sense; please let me know if I should provide a clearer example or clarify anything!
Recommended answer
To get the minimum of column A for each group, use transform:
df.groupby('group_id')['A'].transform('min')
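As a minimal sketch of how the transform call above could be used to fill column "B", the example below rebuilds the table from the question and assigns the result back to the frame. The sample values are copied from the question; assigning to df["B"] is an assumption about how the asker wants to store the result.

```python
import numpy as np
import pandas as pd

# Sample data copied from the table in the question (column "B" is what we want to compute).
df = pd.DataFrame({
    "group_id": ["x", "x", "x", "y", "y", "z", "z", "z", "z"],
    "col1": [1, 3, 4, 1, 3, 3, 2, 4, 4],
    "col2": [2] * 9,
    "col3": [3] * 9,
    "A": [np.nan, 1, 2, np.nan, 3, 10, 6, 4, 2],
})

# transform('min') computes the minimum of "A" within each group and
# broadcasts it back to every row of that group, so the result has the
# same index as df regardless of how many rows each group has.
# NaN values in "A" are skipped when taking the minimum.
df["B"] = df.groupby("group_id")["A"].transform("min")

print(df)
```

Because transform returns a Series aligned with the original index, groups of different sizes need no special handling. If a group contained only NaN values in "A", its "B" values would also be NaN.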