Selecting column after groupby without using explicit column name
Question
Given the following dataset:
import pandas as pd
df = pd.DataFrame({'Date':['26-12-2018','26-12-2018','27-12-2018','27-12-2018','28-12-2018','28-12-2018'],
'In':['A','B','D','Z','Q','E'],
'Out' : ['Z', 'D', 'F', 'H', 'Z', 'A'],
'Score_in' : ['6', '2', '1', '0', '1', '3'],
'Score_out' : ['2','3','0', '1','1','3'],
'Place' : ['One','Two','Four', 'Two','Two','One']})
I would like to write groupby rules in a generic form so that I can parameterize the creation of subsets. For instance, instead of the following:
df.groupby('In').Score_in.sum()
I suppose what I want is something like #1 or #2 below, using the df.columns[] or .iloc[:,[]] syntax:
df.groupby(df.columns[1]).df.iloc[:,[3]].sum() #1
df.groupby(df.iloc[:,[0]]).df.iloc[:,[3]].sum() #2
Of course, neither of the above works. Any help?
Recommended answer
Actually, the problem is not with the groupby itself; it is about how you select a particular column afterwards. A groupby object has no df attribute, so chaining .df after it cannot work.
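To illustrate the point about attributes, here is a minimal sketch on a small made-up frame (the three-row data below is an assumption for illustration, not the question's dataset). It shows that attribute access on a groupby object only resolves column names, which is why `.Score_in` is accepted while `.df` raises an error.

```python
import pandas as pd

# Small illustrative frame (hypothetical data, not the question's dataset)
df = pd.DataFrame({'In': ['A', 'B', 'A'], 'Score_in': [6, 2, 1]})

grouped = df.groupby(df.columns[0])

# Attribute access works for an existing column name...
totals = grouped.Score_in.sum()

# ...but 'df' is not a column, so the groupby object has no such attribute.
try:
    grouped.df
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

print(totals)
print(error_message)
```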
Here is a piece of code that works as expected:
df.groupby(df.columns[1])[df.columns[3]].sum()
In Score_in
A 6
B 2
D 1
E 3
Q 1
Z 0
Note: I cast Score_in and Score_out to integers first. They are stored as strings in the question's data, so without the cast the sum would not give numeric totals.
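The cast itself is not shown above, so here is one way it might look, together with a small positional wrapper for the parameterization the question asks about. The helper name `sum_by_position` is hypothetical, and the `.iloc` variant at the end is an alternative sketch rather than part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({
    'Date': ['26-12-2018', '26-12-2018', '27-12-2018',
             '27-12-2018', '28-12-2018', '28-12-2018'],
    'In': ['A', 'B', 'D', 'Z', 'Q', 'E'],
    'Out': ['Z', 'D', 'F', 'H', 'Z', 'A'],
    'Score_in': ['6', '2', '1', '0', '1', '3'],
    'Score_out': ['2', '3', '0', '1', '1', '3'],
    'Place': ['One', 'Two', 'Four', 'Two', 'Two', 'One'],
})

# The scores are strings in the question's data; cast them so sum() adds numbers.
score_cols = ['Score_in', 'Score_out']
df[score_cols] = df[score_cols].astype(int)


def sum_by_position(frame, key_pos, value_pos):
    """Hypothetical helper: group by the column at key_pos and sum the column at value_pos."""
    key = frame.columns[key_pos]
    value = frame.columns[value_pos]
    return frame.groupby(key)[value].sum()


result = sum_by_position(df, 1, 3)
print(result)

# Alternative using .iloc: select single columns as Series (no inner list),
# then group the value Series by the key Series.
result_iloc = df.iloc[:, 3].groupby(df.iloc[:, 1]).sum()
print(result_iloc)
```

Note that `df.iloc[:, 1]` (a Series) is used here rather than `df.iloc[:, [1]]` (a one-column DataFrame), since the grouping key needs to be one-dimensional.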