Zero occurrences/frequency using value_counts() in PANDAS
Question
I have a table containing dates and the cars sold on each date, in the following format (these are only 2 of many columns):
DATE CAR
2012/01/01 BMW
2012/01/01 Mercedes Benz
2012/01/01 BMW
2012/01/02 Volvo
2012/01/02 BMW
2012/01/03 Mercedes Benz
...
2012/09/01 BMW
2012/09/02 Volvo
I perform the following operation to find the number of BMW cars sold each day:
df[df.CAR=='BMW']['DATE'].value_counts()
The result looks like this:
2012/07/04 15
2012/07/08 8
...
2012/01/02 1
But on some days no BMW was sold. Along with the counts above, I also want the result to list the days with zero BMW sales. The desired result is therefore:
2012/07/04 15
2012/07/08 8
...
2012/01/02 1
2012/01/09 0
2012/08/11 0
How can I get this result?
Recommended answer
You can reindex the result of value_counts over all dates in the table and fill the missing values with 0.
df.loc[df.CAR == 'BMW', 'DATE'].value_counts().reindex(
df.DATE.unique(), fill_value=0)
Output:
2012/01/01 2
2012/01/02 1
2012/01/03 0
2012/09/01 1
2012/09/02 0
Name: DATE, dtype: int64
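For anyone who wants to try this, here is a minimal self-contained sketch of the reindex approach. The DataFrame holds only the rows visible in the question (the elided rows are left out). The second half is an assumption that goes beyond the answer: if a day has no rows at all, for any car, it is not in df.DATE.unique(), so one option is to parse the dates and reindex over a full calendar range built with pd.date_range. The names bmw_counts, dates, and full_counts are illustrative only.

```python
import pandas as pd

# Rows copied from the question's table; the elided rows ("...") are omitted.
df = pd.DataFrame({
    "DATE": ["2012/01/01", "2012/01/01", "2012/01/01", "2012/01/02",
             "2012/01/02", "2012/01/03", "2012/09/01", "2012/09/02"],
    "CAR": ["BMW", "Mercedes Benz", "BMW", "Volvo",
            "BMW", "Mercedes Benz", "BMW", "Volvo"],
})

# Count BMW rows per date, then reindex over every date present in the
# table so that dates without a BMW sale appear with a count of 0.
bmw_counts = (
    df.loc[df["CAR"] == "BMW", "DATE"]
    .value_counts()
    .reindex(df["DATE"].unique(), fill_value=0)
)
print(bmw_counts)

# Assumption beyond the original answer: to also cover days on which no
# car of any make was sold, parse the dates and reindex over a full
# daily calendar between the first and last date in the table.
dates = pd.to_datetime(df["DATE"], format="%Y/%m/%d")
full_counts = (
    dates[df["CAR"] == "BMW"]
    .value_counts()
    .reindex(pd.date_range(dates.min(), dates.max(), freq="D"), fill_value=0)
)
print(full_counts.head())
```

Reindexing over df.DATE.unique() keeps the dates in order of first appearance in the table, while the calendar variant returns one row per day in chronological order.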
Instead of value_counts, you could also test each row for equality with 'BMW' and sum the result grouped by date. Because every date in the table forms a group, dates with no BMW sales are included with a sum of 0.
df['CAR'].eq('BMW').astype(int).groupby(df['DATE']).sum()
Output:
DATE
2012/01/01 2
2012/01/02 1
2012/01/03 0
2012/09/01 1
2012/09/02 0
Name: CAR, dtype: int32
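The groupby variant can be run end to end with the sketch below, again using only the rows shown in the question. The name bmw_per_day is illustrative. The exact integer dtype printed (int32 or int64) may depend on the platform and pandas version.

```python
import pandas as pd

# Rows copied from the question's table; the elided rows ("...") are omitted.
df = pd.DataFrame({
    "DATE": ["2012/01/01", "2012/01/01", "2012/01/01", "2012/01/02",
             "2012/01/02", "2012/01/03", "2012/09/01", "2012/09/02"],
    "CAR": ["BMW", "Mercedes Benz", "BMW", "Volvo",
            "BMW", "Mercedes Benz", "BMW", "Volvo"],
})

# Mark each row 1 if it is a BMW and 0 otherwise, then sum the marks
# within each date. Every date in the table forms its own group, so
# dates with no BMW rows sum to 0 instead of disappearing.
bmw_per_day = df["CAR"].eq("BMW").astype(int).groupby(df["DATE"]).sum()
print(bmw_per_day)
```

By default groupby sorts the group keys, so the dates come back in sorted order regardless of their order in the table.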