Zero occurrences/frequency using value_counts() in pandas


Question

I have a table containing dates and the cars sold on each date, in the following format (these are only 2 of many columns):

DATE       CAR
2012/01/01 BMW
2012/01/01 Mercedes Benz
2012/01/01 BMW
2012/01/02 Volvo
2012/01/02 BMW
2012/01/03 Mercedes Benz
...
2012/09/01 BMW
2012/09/02 Volvo

I perform the following operation to find the number of BMW cars sold each day:

df[df.CAR=='BMW']['DATE'].value_counts()

The result looks like this:

2012/07/04 15
2012/07/08 8
...
2012/01/02 1

But there are some days when no BMW was sold. In the result, along with the above, I also want the days with zero occurrences of BMW. The desired result is therefore:

2012/07/04 15
2012/07/08 8
...
2012/01/02 1
2012/01/09 0
2012/08/11 0

What can I do to get such a result?

Recommended answer

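Filtering on BMW drops every row for the other cars, so dates without a BMW sale never reach value_counts(). You can reindex the result of value_counts() on all dates that appear in the data and fill the missing counts with 0.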

df.loc[df.CAR == 'BMW', 'DATE'].value_counts().reindex(
    df.DATE.unique(), fill_value=0)

Output:

2012/01/01    2
2012/01/02    1
2012/01/03    0
2012/09/01    1
2012/09/02    0
Name: DATE, dtype: int64
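
For a quick self-contained check, the reindex approach can be run on a small frame built from the sample rows shown in the question. This is only a sketch: the eight rows below stand in for the elided data, and the variable names bmw_counts and bmw_counts_all are illustrative.

```python
import pandas as pd

# Sample rows copied from the question; the elided rows ("...") are left out.
df = pd.DataFrame({
    "DATE": ["2012/01/01", "2012/01/01", "2012/01/01", "2012/01/02",
             "2012/01/02", "2012/01/03", "2012/09/01", "2012/09/02"],
    "CAR": ["BMW", "Mercedes Benz", "BMW", "Volvo",
            "BMW", "Mercedes Benz", "BMW", "Volvo"],
})

# Count BMW rows per date; dates without a BMW row are absent at this point.
bmw_counts = df.loc[df["CAR"] == "BMW", "DATE"].value_counts()

# Reindex on every date present in the full frame, filling the gaps with 0.
bmw_counts_all = bmw_counts.reindex(df["DATE"].unique(), fill_value=0)
print(bmw_counts_all)
```

Note that df.DATE.unique() only contains dates on which at least one car of any make was sold, so a day with no rows at all would still be missing from the result.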


Instead of value_counts(), you could also compare the CAR column with 'BMW' and sum the resulting matches grouped by date. Because the grouping runs over every row, all dates are included.

df['CAR'].eq('BMW').astype(int).groupby(df['DATE']).sum()

Output:

DATE
2012/01/01    2
2012/01/02    1
2012/01/03    0
2012/09/01    1
2012/09/02    0
Name: CAR, dtype: int32
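
The groupby variant can be tried the same way on the sample rows from the question. Again a minimal sketch under the assumption that these eight rows are representative; the name bmw_by_date is illustrative.

```python
import pandas as pd

# Sample rows copied from the question; the elided rows ("...") are left out.
df = pd.DataFrame({
    "DATE": ["2012/01/01", "2012/01/01", "2012/01/01", "2012/01/02",
             "2012/01/02", "2012/01/03", "2012/09/01", "2012/09/02"],
    "CAR": ["BMW", "Mercedes Benz", "BMW", "Volvo",
            "BMW", "Mercedes Benz", "BMW", "Volvo"],
})

# 1 where the row is a BMW, 0 otherwise; no rows are dropped.
is_bmw = df["CAR"].eq("BMW").astype(int)

# Sum the flags per date; every date in the frame gets a group, even if its sum is 0.
bmw_by_date = is_bmw.groupby(df["DATE"]).sum()
print(bmw_by_date)
```

groupby sorts the group keys by default, so the dates come back in sorted order rather than in order of first appearance.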
