Check whether all dates are present in a year in pandas (Python)

Problem description

I have a data column like the one below, in which some dates are missing.

obstime
2012-01-01
2012-01-01
2012-01-02
2012-01-02
2012-01-03
2012-01-03
2012-01-04
2012-01-04
....
2016-12-28
2016-12-28
2016-12-29
2016-12-29
2016-12-30
2016-12-30
2016-12-31
2016-12-31

I want to check whether all dates are present for each month of the available years, as in the following image.

Recommended answer

My solution is based on pandas and does not use any database.

The idea is to reindex the source DataFrame using a "full" index (one that contains all dates from the year range). For this test, I used dates from 2016 and 2017.

Then we keep only the "just added" rows, that is, the dates for which a measurement is absent.
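To make the reindexing step concrete, here is a minimal sketch on a five-day range. The tiny input and the names `idx_full`, `expanded`, and `missing` are made up for illustration and are not part of the original answer.

```python
import pandas as pd

# Hypothetical miniature input: 2016-01-03 and 2016-01-04 are absent
dates = pd.to_datetime(['2016-01-01', '2016-01-02', '2016-01-05'])
df = pd.DataFrame({'result': [11, 12, 15]}, index=dates)

# Full index covering every day in the range of interest
idx_full = pd.date_range('2016-01-01', '2016-01-05')

# Reindexing adds a NaN row for every date missing from the source
expanded = df.reindex(idx_full)

# Keep only the rows that were just added (no measurement)
missing = expanded[expanded['result'].isna()].index
print(missing.strftime('%Y-%m-%d').tolist())
```

Note that `reindex` requires a unique index, so a column with repeated dates (as in the question) would need to be de-duplicated first.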

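The remaining steps are:

  • Group by month and apply a function that generates the date ranges.
  • Convert to a DataFrame with the year and month "extracted".
  • Pivot the DataFrame (month as index, year as columns).
  • Add month names and set them as the index.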

So the whole script can be as follows:

import calendar

import numpy as np
import pandas as pd

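# Function to be applied to the group of missing dates for each month.
# Returns the missing day ranges, e.g. '14-19&30-31', or 'OK' for an empty group.
def fun(x):
    dt = x.result
    day = pd.Timedelta('1d')
    # A date starts a run if it is not exactly one day after the previous one
    startDates = dt[dt.diff() != day]
    if startDates.size > 0:
        # A date ends a run if it is not exactly one day before the next one
        endDates = dt[(dt - dt.shift(-1)).abs() != day]
        return '&'.join([(f'{s.day}-{e.day}') for s, e in zip(startDates, endDates)])
    else:
        return 'OK'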

# Source dates
dates = pd.date_range('2016-01-01', '2016-01-13')\
    .append(pd.date_range('2016-01-20', '2016-01-29'))\
    .append(pd.date_range('2016-02-10', '2016-02-20'))\
    .append(pd.date_range('2016-03-11', '2017-11-20'))\
    .append(pd.date_range('2017-11-25', '2017-12-31'))
# Source DataFrame with random results for dates given
df = pd.DataFrame(data={ 'result': np.random.randint(10, 30, len(dates))},
    index=dates)
# Index for full range of dates
idxFull = pd.date_range('2016-01-01', '2017-12-31')
# "Expand" to all dates
df2 = df.reindex(idxFull)
# Leave only "empty" rows
df2.drop(df2[df2.result.notna()].index, inplace=True)
# Copy index to result
df2.result = df2.index
# Group by months ('M' = month end; pandas 2.2 and later prefer the alias 'ME')
gr = df2.groupby(pd.Grouper(freq='M'))
# Result - Series
res = gr.apply(fun)
# Result - DataFrame with year/month "extracted" from date
res2 = pd.DataFrame(data={'res': res, 'year': res.index.year,
    'month': res.index.month })
# Result - pivot'ed res2
res3 = res2.pivot(index='month', columns='year').fillna('OK')
# Add month names
res3['MonthName'] = list(calendar.month_name)[1:]
# Set month names as index
res3.set_index('MonthName', inplace=True)

When you print(res3), the result is:

                   res       
year              2016   2017
MonthName                    
January    14-19&30-31     OK
February     1-9&21-29     OK
March             1-10     OK
April               OK     OK
May                 OK     OK
June                OK     OK
July                OK     OK
August              OK     OK
September           OK     OK
October             OK     OK
November            OK  21-24
December            OK     OK
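
If only a yes/no answer is needed, or if the data repeats each date as the `obstime` column in the question does, a shorter check may be enough. The sketch below uses `DatetimeIndex.difference` instead of the reindexing approach of the answer. The sample values and the names `obs`, `present`, `expected`, and `all_present` are assumptions for illustration.

```python
import pandas as pd

# Hypothetical stand-in for the question's column: each date appears twice,
# and 2012-01-03 is absent
obs = pd.DataFrame({'obstime': ['2012-01-01', '2012-01-01', '2012-01-02',
                                '2012-01-02', '2012-01-04', '2012-01-04']})

# Parse the strings and collapse repeated dates into a unique, sorted index
present = pd.DatetimeIndex(pd.to_datetime(obs['obstime']).unique()).sort_values()

# Every calendar day the data should cover (only four days here, for brevity)
expected = pd.date_range('2012-01-01', '2012-01-04')

# Dates that are expected but not present in the data
missing = expected.difference(present)
all_present = len(missing) == 0
print(all_present, missing.strftime('%Y-%m-%d').tolist())
```

For a whole year, `expected` could be built as `pd.date_range('2012-01-01', '2012-12-31')`. The unique `present` index can also serve as the index for the reindexing approach above, which does not accept duplicate labels.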
