Pandas: How to get count of occurrence from another data frame?


Question

I am using Python Pandas. I have two DataFrames, df1 and df2. df1 contains header-level data such as card-id and issued-on date. df2 has granular-level data: each transaction performed by a specific card-id. The card_id column is common to both DataFrames.

df1:
 first_active_month          card_id  feature_1  feature_2  feature_3 
            2017-06  C_ID_92a2005557          5          2          1   
            2017-01  C_ID_3d0044924f          4          1          0   
            2016-08  C_ID_d639edf6cd          2          2          0   
            2017-09  C_ID_186d6a6901          4          3          0   
            2017-11  C_ID_cdbd2c0db2          1          3          0

df2:
   junk_id   authorized_flag          card_id  city_id Authorized 
    13292136               Y  C_ID_92a2005557      101          N   
    20069042               Y  C_ID_7a238b3713       69          N   
     5029656               Y  C_ID_92a2005557       17          N   
    16356907               N  C_ID_3d0044924f       -1          Y   
     8203441               Y  C_ID_fcf33361c2       17          N

I want to add a column "frequency" to df1 which will show me a count of occurrences of each card-id of df1 in df2. So, df1 should look like below:

df1 (after executing the command):
 first_active_month          card_id  feature_1  feature_2  feature_3    frequency
            2017-06  C_ID_92a2005557          5          2          1      2
            2017-01  C_ID_3d0044924f          4          1          0      5
            2016-08  C_ID_d639edf6cd          2          2          0      3
            2017-09  C_ID_186d6a6901          4          3          0      1
            2017-11  C_ID_cdbd2c0db2          1          3          0      7

Please note: I am new to Python / Pandas. I have already gone through multiple threads on this site, but all of them dealt with counting within a single DataFrame. I am looking for a way to count using join/merge functionality.

Recommended answer

I think you need Series.map with Series.value_counts, plus Series.fillna to replace the missing values for card IDs that never appear in df2:

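# Count how often each card_id appears in df2, then look those counts up for every row of df1.
# card_ids absent from df2 come back as NaN, so fill them with 0 before casting to int.
df1['frequency'] = df1['card_id'].map(df2['card_id'].value_counts()).fillna(0).astype(int)
print(df1)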
  first_active_month          card_id  feature_1  feature_2  feature_3  \
0            2017-06  C_ID_92a2005557          5          2          1   
1            2017-01  C_ID_3d0044924f          4          1          0   
2            2016-08  C_ID_d639edf6cd          2          2          0   
3            2017-09  C_ID_186d6a6901          4          3          0   
4            2017-11  C_ID_cdbd2c0db2          1          3          0   

   frequency  
0          2  
1          1  
2          0  
3          0  
4          0  
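
Since the question asks specifically about join/merge, here is a minimal, self-contained sketch that rebuilds the sample data (only the columns needed for counting) and compares the map/value_counts approach with a merge-based variant. The names `counts`, `per_card`, `merged` and the column `frequency_merge` are illustrative choices, not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question (only the columns needed here).
df1 = pd.DataFrame({
    "first_active_month": ["2017-06", "2017-01", "2016-08", "2017-09", "2017-11"],
    "card_id": ["C_ID_92a2005557", "C_ID_3d0044924f", "C_ID_d639edf6cd",
                "C_ID_186d6a6901", "C_ID_cdbd2c0db2"],
})
df2 = pd.DataFrame({
    "junk_id": [13292136, 20069042, 5029656, 16356907, 8203441],
    "card_id": ["C_ID_92a2005557", "C_ID_7a238b3713", "C_ID_92a2005557",
                "C_ID_3d0044924f", "C_ID_fcf33361c2"],
})

# Approach from the answer: look up each card_id in the value_counts Series.
counts = df2["card_id"].value_counts()
df1["frequency"] = df1["card_id"].map(counts).fillna(0).astype(int)

# Merge-based variant: build a (card_id, frequency_merge) table, then left-join it
# onto df1 so that every row of df1 is kept, matched or not.
per_card = df2.groupby("card_id").size().rename("frequency_merge").reset_index()
merged = df1.merge(per_card, on="card_id", how="left")
merged["frequency_merge"] = merged["frequency_merge"].fillna(0).astype(int)

print(merged[["card_id", "frequency", "frequency_merge"]])
```

Both columns should agree on this sample. The map version avoids creating an intermediate table, while the merge version may be easier to extend when several aggregates from df2 are needed at once.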
