Update pandas.DataFrame within a group after .groupby()

Question

I have the following pandas.DataFrame:

                                                          time
offset   ts                      op                           
0.000000 2015-10-27 18:31:40.318 BuildIndex            282.604
                                 Compress              253.649
                                 Decompress              2.953
                                 Deserialize             0.063
                                 InsertIndex             1.343
4.960683 2015-10-27 18:36:37.959 BuildIndex            312.249
                                 Compress              280.747
                                 Decompress              2.844
                                 Deserialize             0.110
                                 InsertIndex             0.907

Now I need to update the DataFrame (in place is fine): within each (offset, ts) group, subtract the time for op == 'Compress' from the time for op == 'BuildIndex'.

What is the most elegant way to do it in pandas?

Solution

I'd use xs (cross-section) to do this:

In [11]: df1.xs("Compress", level="op")
Out[11]:
                                     time
offset   ts
0.000000 2015-10-27 18:31:40.318  253.649
4.960683 2015-10-27 18:36:37.959  280.747

In [12]: df1.xs("BuildIndex", level="op")
Out[12]:
                                     time
offset   ts
0.000000 2015-10-27 18:31:40.318  282.604
4.960683 2015-10-27 18:36:37.959  312.249

In [13]: df1.xs("BuildIndex", level="op") - df1.xs("Compress", level="op")
Out[13]:
                                    time
offset   ts
0.000000 2015-10-27 18:31:40.318  28.955
4.960683 2015-10-27 18:36:37.959  31.502

The subtraction aligns on the index labels that remain after xs drops the op level (in this case offset and ts), so there is no need to group.
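The session above computes the differences but does not write them back into the frame. One possible way to do that, sketched here under assumptions, is to re-attach the op label to the result and let DataFrame.update align it against the original index. The frame construction is reconstructed from the printed data in the question, and the names build, compress and diff are illustrative only, not part of the original answer.

```python
import pandas as pd

# Rebuild the frame from the question's printed output (reconstruction, not the asker's code).
ops = ["BuildIndex", "Compress", "Decompress", "Deserialize", "InsertIndex"]
groups = [
    (0.000000, pd.Timestamp("2015-10-27 18:31:40.318"),
     [282.604, 253.649, 2.953, 0.063, 1.343]),
    (4.960683, pd.Timestamp("2015-10-27 18:36:37.959"),
     [312.249, 280.747, 2.844, 0.110, 0.907]),
]
rows = [
    (offset, ts, op, t)
    for offset, ts, times in groups
    for op, t in zip(ops, times)
]
df1 = (
    pd.DataFrame(rows, columns=["offset", "ts", "op", "time"])
    .set_index(["offset", "ts", "op"])
)

# Cross-sections drop the "op" level, so both frames are indexed by (offset, ts).
build = df1.xs("BuildIndex", level="op")
compress = df1.xs("Compress", level="op")
diff = build - compress  # aligned on (offset, ts)

# Put the "op" level back so the result lines up with the BuildIndex rows of df1.
diff = diff.assign(op="BuildIndex").set_index("op", append=True)

# update() aligns on index and columns and modifies df1 in place;
# rows that are absent from diff are left untouched.
df1.update(diff)

print(df1)
```

This assumes every (offset, ts) group has exactly one BuildIndex row and one Compress row; a group missing either one would produce NaN in diff, which update skips, leaving that group's BuildIndex time unchanged.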
