Count by condition applied to the same column in Pandas


Question

This is my dataframe.

acc_index    veh_count    veh_type
001             1            1
002             2            1
002             2            2
003             2            1
003             2            2
004             1            1
005             2            1
005             2            3
006             1            2
007             2            1
007             2            2
008             2            1
008             2            1
009             3            1
009             3            1
009             3            2

acc_index is a unique identifier for each accident.

veh_count shows how many vehicles were involved in the accident.

veh_type shows the type of each vehicle involved in the accident (1=bicycle, 2=car, 3=bus).
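For anyone who wants to reproduce the problem, the table above can be rebuilt as a small DataFrame. This is only a sketch: it assumes acc_index is stored as an integer (so 001 becomes 1, matching the answer's printed output), and the variable name df follows the question's code.

```python
import pandas as pd

# Sample data copied from the table in the question.
# Assumption: acc_index is an integer, so leading zeros are dropped.
df = pd.DataFrame({
    "acc_index": [1, 2, 2, 3, 3, 4, 5, 5, 6, 7, 7, 8, 8, 9, 9, 9],
    "veh_count": [1, 2, 2, 2, 2, 1, 2, 2, 1, 2, 2, 2, 2, 3, 3, 3],
    "veh_type":  [1, 1, 2, 1, 2, 1, 1, 3, 2, 1, 2, 1, 1, 1, 1, 2],
})

print(df.shape)
```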

What I want to do is count the number of accidents between cars and bicycles (that is, where both veh_type=1 and veh_type=2 occur for the same acc_index). Even if more than one car or bicycle was involved, I still want to count it as one accident. How can I do that?

I tried the code below, but it gives me the count of all accidents involving cars or bicycles, whereas I want only the accidents that involve both.

df[(df['veh_count'] >=2) & (df.veh_type.isin(['1','2']))].groupby(['acc_index', 'veh_count', 'veh_type']).count()
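One likely reason the attempt above does not do what is wanted: the boolean mask is evaluated row by row, so it keeps every bicycle or car row without checking what else is in the same accident. The sketch below uses a hypothetical two-accident subset of the sample data (bicycle and bus, bicycle and bicycle) to show this. It also assumes veh_type is stored as integers, in which case `isin(['1','2'])` with string values would match nothing at all.

```python
import pandas as pd

# Hypothetical subset: accident 5 is bicycle + bus, accident 8 is two bicycles.
df = pd.DataFrame({
    "acc_index": [5, 5, 8, 8],
    "veh_count": [2, 2, 2, 2],
    "veh_type":  [1, 3, 1, 1],
})

# Row-wise filter: keeps each row whose own type is 1 or 2.
row_mask = (df["veh_count"] >= 2) & df["veh_type"].isin([1, 2])
kept = df[row_mask]

# Both accidents survive, although neither involves a car.
print(kept["acc_index"].unique())

# With integer data, string values in isin() match no rows.
print(df["veh_type"].isin(["1", "2"]).any())
```

The condition therefore has to be evaluated per acc_index group rather than per row.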

I want to get something like the output below: the matching rows of the dataframe, not only the total sum.

acc_index    veh_count    veh_type     count
002             2            1           
002             2            2
                           count         1
003             2            1
003             2            2
                           count         1
007             2            1
007             2            2
                           count         1
009             3            1
009             3            1
009             3            2
                           count         1
                        total_count      4

If you have a better solution or idea, I would appreciate it.

Recommended answer

IIUC, you can flag the veh_type values of interest and then group by acc_index:

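(df.assign(bike=df.veh_type.eq(1),   # 1 = bicycle
           car=df.veh_type.eq(2))    # 2 = car; change to the correct type code
   [['acc_index','bike','car']]
   .groupby('acc_index')
   .any()                            # per accident: any bicycle? any car?
   .all(axis=1)                      # True only when both are present
   .sum()
)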

Output:

4
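To make the chained expression easier to follow, it can be unpacked into named steps. This is a sketch on the sample data, assuming integer acc_index values and the codes 1=bicycle, 2=car from the question; the intermediate names (flags, per_accident, both, total) are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    "acc_index": [1, 2, 2, 3, 3, 4, 5, 5, 6, 7, 7, 8, 8, 9, 9, 9],
    "veh_type":  [1, 1, 2, 1, 2, 1, 1, 3, 2, 1, 2, 1, 1, 1, 1, 2],
})

# Step 1: one boolean column per vehicle type of interest.
flags = df.assign(bike=df["veh_type"].eq(1), car=df["veh_type"].eq(2))

# Step 2: per accident, was there at least one bicycle / at least one car?
per_accident = flags.groupby("acc_index")[["bike", "car"]].any()

# Step 3: keep accidents where both flags are True, then count them.
both = per_accident.all(axis=1)
total = int(both.sum())

print(per_accident)
print(total)  # 4
```

Because `any()` collapses each accident to a single row, an accident with several bicycles or several cars is still counted once.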


Update: if you need all the rows:

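s = (df.assign(bike=df.veh_type.eq(1),   # 1 = bicycle
               car=df.veh_type.eq(2))    # 2 = car; change to the correct type code
   [['acc_index','bike','car']]
   .groupby('acc_index')
   .any()                                # per accident: any bicycle? any car?
   .all(axis=1)                          # boolean Series indexed by acc_index
)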

df[df['acc_index'].map(s)]

Output:

    acc_index  veh_count  veh_type
1           2          2         1
2           2          2         2
3           3          2         1
4           3          2         2
9           7          2         1
10          7          2         2
13          9          3         1
14          9          3         1
15          9          3         2
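As an alternative to the map-based selection, the same rows can probably be obtained with `GroupBy.filter`, which keeps or drops whole groups. This is not the answer's method, only a sketch under the same assumptions (integer acc_index, 1=bicycle, 2=car); the names wanted, matched and total_count are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "acc_index": [1, 2, 2, 3, 3, 4, 5, 5, 6, 7, 7, 8, 8, 9, 9, 9],
    "veh_count": [1, 2, 2, 2, 2, 1, 2, 2, 1, 2, 2, 2, 2, 3, 3, 3],
    "veh_type":  [1, 1, 2, 1, 2, 1, 1, 3, 2, 1, 2, 1, 1, 1, 1, 2],
})

wanted = {1, 2}  # bicycle and car

# Keep every row of an accident whose vehicle types include both codes.
matched = df.groupby("acc_index").filter(
    lambda g: wanted <= set(g["veh_type"])
)

# Each accident counts once, however many vehicles it involved.
total_count = matched["acc_index"].nunique()

print(matched)
print(total_count)  # 4
```

If accidents that also involve a third vehicle type should be excluded, the condition could be tightened to `set(g["veh_type"]) == wanted`; on this sample data both variants select the same accidents.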
