Count unique dates in pandas dataframe

Question

I have a dataframe of surface weather observations (fzraHrObs) organized by station identifier code and date. fzraHrObs has several columns of weather data. The station code (usaf) and date (dat, datetime objects) look like:

usaf      dat
716270    2014-11-23 12:00:00
          2015-12-20 08:00:00
          2015-12-20 09:00:00
          2015-12-21 04:00:00
          2015-12-28 03:00:00
716280    2015-12-19 08:00:00
          2015-12-19 08:00:00

I would like to count the number of unique dates (days) per year for each station, i.e. the number of days with observations per year at each station. For the example above this would give me:

    usaf      Year     Count
    716270    2014     1
              2015     3
    716280    2014     0
              2015     1

I've tried using groupby and grouping by station, year, and date:

grouped = fzraHrObs['dat'].groupby([fzraHrObs['usaf'], fzraHrObs.dat.dt.year, fzraHrObs.dat.dt.date])

Calling count, size, nunique, etc. on this just gives me the number of observations on each date, not the number of distinct dates per year. Any suggestions on how to get what I want here?

Recommended answer

Could be something like this: group the dates by usaf and year, then count the number of unique values in each group:

import pandas as pd

# df stands for the question's fzraHrObs, assumed here to have usaf and dat as regular columns
df.dat.apply(lambda dt: dt.date()).groupby([df.usaf, df.dat.apply(lambda dt: dt.year)]).nunique()

#   usaf   dat 
# 716270  2014    1
#         2015    3
# 716280  2015    1
# Name: dat, dtype: int64
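
The output above omits the 716280 / 2014 row that the question expects to see with a count of 0. Here is a minimal, self-contained sketch of one way to get it, using the vectorized .dt accessor instead of apply. The sample frame is rebuilt from the question's listing, and the names days, counts, full_index, and full are illustrative, not part of the original answer:

```python
import pandas as pd

# Sample data rebuilt from the question; assumes usaf and dat are regular columns.
df = pd.DataFrame({
    "usaf": [716270] * 5 + [716280] * 2,
    "dat": pd.to_datetime([
        "2014-11-23 12:00:00",
        "2015-12-20 08:00:00",
        "2015-12-20 09:00:00",
        "2015-12-21 04:00:00",
        "2015-12-28 03:00:00",
        "2015-12-19 08:00:00",
        "2015-12-19 08:00:00",
    ]),
})

# Truncate each timestamp to midnight so that hours on the same day collapse together.
days = df["dat"].dt.normalize()

# Count distinct days per (station, year); station/year pairs with no data are absent.
counts = days.groupby([df["usaf"], df["dat"].dt.year.rename("year")]).nunique()

# Build every station/year combination and fill the missing ones with 0.
full_index = pd.MultiIndex.from_product(counts.index.levels, names=counts.index.names)
full = counts.reindex(full_index, fill_value=0)

print(full)
```

Here counts holds the same three rows as the answer's output, while full also contains the 716280 / 2014 row with a count of 0. If only stations and years that actually have observations matter, the reindex step can be dropped.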
