Finding most common values with Pandas GroupBy and value_counts

Problem description

I am working with two columns of a table.

+-------------+--------------------------------------------------------------+
|  Area Name  |                       Code Description                       |
+-------------+--------------------------------------------------------------+
| N Hollywood | VIOLATION OF RESTRAINING ORDER                               |
| N Hollywood | CRIMINAL THREATS - NO WEAPON DISPLAYED                       |
| N Hollywood | CRIMINAL THREATS - NO WEAPON DISPLAYED                       |
| N Hollywood | ASSAULT WITH DEADLY WEAPON, AGGRAVATED ASSAULT               |
| Southeast   | ASSAULT WITH DEADLY WEAPON, AGGRAVATED ASSAULT               |
| West Valley | CRIMINAL THREATS - NO WEAPON DISPLAYED                       |
| West Valley | CRIMINAL THREATS - NO WEAPON DISPLAYED                       |
| 77th Street | RAPE, FORCIBLE                                               |
| Foothill    | CRM AGNST CHLD (13 OR UNDER) (14-15 & SUSP 10 YRS OLDER)0060 |
| N Hollywood | VANDALISM - FELONY ($400 & OVER, ALL CHURCH VANDALISMS) 0114 |
+-------------+--------------------------------------------------------------+

I am using groupby and value_counts to count the Code Description values within each Area Name.

df.groupby(['Area Name'])['Code Description'].value_counts()

Is there a way to see only the top 'n' values for each Area Name? If I append .nlargest(3) to the code above, it returns the result for only one Area Name.

+---------------------------------------------------------------------------------+
| Wilshire     SHOPLIFTING-GRAND THEFT ($950.01 & OVER)                         7 |
+---------------------------------------------------------------------------------+

Recommended answer

Call head on the value_counts result within each group:

df.groupby('Area Name')['Code Description'].apply(lambda x: x.value_counts().head(3))

Output:

Area Name                                                                
77th Street  RAPE, FORCIBLE                                                  1
Foothill     CRM AGNST CHLD (13 OR UNDER) (14-15 & SUSP 10 YRS OLDER)0060    1
N Hollywood  CRIMINAL THREATS - NO WEAPON DISPLAYED                          2
             VIOLATION OF RESTRAINING ORDER                                  1
             ASSAULT WITH DEADLY WEAPON, AGGRAVATED ASSAULT                  1
Southeast    ASSAULT WITH DEADLY WEAPON, AGGRAVATED ASSAULT                  1
West Valley  CRIMINAL THREATS - NO WEAPON DISPLAYED                          2
Name: Code Description, dtype: int64
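
As a possible alternative that avoids apply, the grouped value_counts result can be grouped again on its first index level and truncated with head. The sketch below rebuilds the sample rows from the question so it runs on its own; the helper name top_n_per_area is made up for illustration and is not part of pandas. It relies on the grouped value_counts sorting counts in descending order within each group, which is the default behavior. When several values tie on the same count, which of them are kept depends on their order in the result.

```python
import pandas as pd

# Sample rows copied from the table in the question.
df = pd.DataFrame({
    "Area Name": [
        "N Hollywood", "N Hollywood", "N Hollywood", "N Hollywood",
        "Southeast", "West Valley", "West Valley", "77th Street",
        "Foothill", "N Hollywood",
    ],
    "Code Description": [
        "VIOLATION OF RESTRAINING ORDER",
        "CRIMINAL THREATS - NO WEAPON DISPLAYED",
        "CRIMINAL THREATS - NO WEAPON DISPLAYED",
        "ASSAULT WITH DEADLY WEAPON, AGGRAVATED ASSAULT",
        "ASSAULT WITH DEADLY WEAPON, AGGRAVATED ASSAULT",
        "CRIMINAL THREATS - NO WEAPON DISPLAYED",
        "CRIMINAL THREATS - NO WEAPON DISPLAYED",
        "RAPE, FORCIBLE",
        "CRM AGNST CHLD (13 OR UNDER) (14-15 & SUSP 10 YRS OLDER)0060",
        "VANDALISM - FELONY ($400 & OVER, ALL CHURCH VANDALISMS) 0114",
    ],
})


def top_n_per_area(frame, n=3):
    """Hypothetical helper: keep the n most frequent descriptions per area."""
    # Counts per (Area Name, Code Description), sorted descending within each area.
    counts = frame.groupby("Area Name")["Code Description"].value_counts()
    # Regroup on the first index level (Area Name) and keep the first n rows of each.
    return counts.groupby(level=0).head(n)


top3 = top_n_per_area(df, n=3)
print(top3)
```

Changing n gives the top 'n' values per area without touching the rest of the expression.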
