How do I add a new column that adds and sums counts from an existing column?

Question

I have this Python code:

counting_bach_new = counting_bach.groupby(['User Name', 'time_diff', 'Logon Time']).size()
print("\ncounting_bach_new")
print(counting_bach_new)

...which gives this neat result:

counting_bach_new
User Name  time_diff            Logon Time
122770     -132 days +21:38:00  1             1
           -122 days +00:41:00  1             1
123526     -30 days +12:04:00   1             1
           -29 days +16:39:00   1             1
           -27 days +18:16:00   1             1
                                             ..
201685     -131 days +21:21:00  1             1
202047     -106 days +10:14:00  1             1
202076     -132 days +10:22:00  1             1
           -132 days +14:46:00  1             1
           -131 days +21:21:00  1             1

So how do I add a new column that adds and sums counts from the existing column? The rightmost column of 1s should be disregarded. Instead, I would like to add a new column that sums the count of 'time_diff' values per 'User Name', i.e. the new column should give the number of observations listed per user, counting either time_diff or Logon Time entries. For User Name 122770 the new column should be 2, for 123526 it should be 3, and so on.

I tried several approaches, including the following, which does not work:

counting_bach_new.groupby('User Name').agg(MySum=('Logon Time', 'sum'), MyCount=('Logon Time', 'count'))

Any help would be appreciated. Thank you for your kind support. Christmas greetings from @Hubsandspokes

Recommended answer

Use DataFrame.join together with Series.reset_index:

df = (counting_bach_new.to_frame('count')
                       .join((counting_bach_new.reset_index()
                                .groupby('User Name')
                                .agg(MySum=('Logon Time', 'sum'),
                                     MyCount=('Logon Time', 'count'))), on='User Name'))
print(df)
                                          count  MySum  MyCount
User Name time_diff           Logon Time                       
122770    -132 days +21:38:00 1               1      2        2
          -122 days +00:41:00 1               1      2        2
123526    -30 days +12:04:00  1               1      3        3
          -29 days +16:39:00  1               1      3        3
          -27 days +18:16:00  1               1      3        3
201685    -131 days +21:21:00 1               1      1        1
202047    -106 days +10:14:00 1               1      1        1
202076    -132 days +10:22:00 1               1      3        3
          -132 days +14:46:00 1               1      3        3
          -131 days +21:21:00 1               1      3        3
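
As an alternative sketch (not part of the original answer), the per-user totals can also be broadcast onto each row with groupby(level=...).transform, which avoids the reset_index and join steps. The sample data below is hypothetical and only stands in for counting_bach. Note one difference: here MySum adds up the 'count' column rather than the 'Logon Time' values, so the two versions agree only when Logon Time is 1 on every row, as in the output above.

```python
import pandas as pd

# Hypothetical stand-in for counting_bach (values are illustrative only)
counting_bach = pd.DataFrame({
    "User Name": [122770, 122770, 123526, 123526, 123526, 201685],
    "time_diff": pd.to_timedelta([1, 2, 3, 4, 5, 6], unit="h"),
    "Logon Time": [1, 1, 1, 1, 1, 1],
})

# Same grouping as in the question: one row per (user, time_diff, logon) combination
counting_bach_new = counting_bach.groupby(
    ["User Name", "time_diff", "Logon Time"]
).size()

df = counting_bach_new.to_frame("count")

# Group on the 'User Name' index level; transform returns a result
# aligned with the original rows, so it can be assigned directly.
per_user = df.groupby(level="User Name")["count"]
df["MySum"] = per_user.transform("sum")      # total of 'count' per user
df["MyCount"] = per_user.transform("count")  # number of rows per user

print(df)
```

Because transform keeps the original index, the three-level MultiIndex is preserved without any extra alignment step.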
