Elegant and efficient way to retain date values as is without an OOB error


Question

I have a dataframe like the one shown below:

import pandas as pd

df1_new = pd.DataFrame({'person_id': [1, 1, 3, 3, 5, 5],'obs_date': ['7/23/2377  12:00:00 AM', 'NA-NA-NA NA:NA:NA', 'NA-NA-NA NA:NA:NA', '7/27/2277  12:00:00 AM', '7/13/2077  12:00:00 AM', 'NA-NA-NA NA:NA:NA']})

As you can see, a few of my date values are out of bounds (OOB) for pandas timestamps. I would still like to retain them as they are, but I cannot because of the OOB issue.

I tried the following:

pd.to_datetime(df1_new['obs_date'], format='%m/%d/%Y %I:%M:%S %p', errors='coerce')

Is there another efficient way to retain the date values as they are and change only the format? I am fine with the result being a string column/datatype.

I expect my output to be as shown below.

[Screenshot: updated try/except attempt]

Recommended answer

You can parse the values into datetimes and then convert them to daily Periods, the pandas type that can represent dates outside the Timestamp bounds.

If you omit the Period conversion, you are working with Python datetime objects rather than pandas datetimes (Timestamps).

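from datetime import datetime
import pandas as pd

def str2time(x):
    try:
        # Parse with the standard library, then wrap the result in a daily Period
        return pd.Period(datetime.strptime(x, '%m/%d/%Y %I:%M:%S %p'), 'D')
    except (ValueError, TypeError):
        # Unparseable values such as 'NA-NA-NA NA:NA:NA' become missing
        return pd.NaT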

df1_new['obs_date'] = df1_new['obs_date'].apply(str2time)
print(df1_new)
   person_id    obs_date
0          1  2377-07-23
1          1         NaT
2          3         NaT
3          3  2277-07-27
4          5  2077-07-13
5          5         NaT

print(df1_new['obs_date'].dtype)
period[D]
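
Once the column has dtype period[D], it can be filtered and converted much like a regular date column. The following is a minimal sketch, not part of the original answer, showing that a daily Period holds dates past the nanosecond Timestamp range (which ends in April 2262), and that the column can be compared, checked for missing values, or turned back into text. The sample series `s` and the names `cutoff`, `beyond_bound` and `as_text` are assumptions made for illustration.

```python
from datetime import datetime
import pandas as pd

# Hypothetical sample mirroring the converted obs_date column
s = pd.Series(
    [pd.Period(datetime(2377, 7, 23), 'D'), pd.NaT, pd.Period(datetime(2077, 7, 13), 'D')],
    dtype='period[D]',
)

# Upper bound of the nanosecond-resolution Timestamp, for comparison
print(pd.Timestamp.max)

# Comparisons work element-wise; missing values compare as False
cutoff = pd.Period(datetime(2262, 4, 12), 'D')
beyond_bound = s[s > cutoff]
print(beyond_bound)

# Missing entries are still detected with isna()
print(s.isna().sum())

# A plain text version, if a string column is preferred
as_text = s.astype(str)
print(as_text.iloc[0])
```

Keeping the Period dtype preserves ordering and comparisons; converting to text gives that up, so it is probably best left as a final formatting step.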

If multiple formats are possible:

def str2time(x):
    formats = (
        '%m/%d/%Y %I:%M:%S %p',  # MM/DD/YYYY II:MM:SS pp like 7/23/2377  12:00:00 AM
        '%Y-%m-%d %H:%M:%S',     # YYYY-MM-DD HH:MM:SS like 2377-07-23 00:00:00
    )
    # Try each format in turn and return the first successful parse
    for fmt in formats:
        try:
            return pd.Period(datetime.strptime(x, fmt), 'D')
        except (ValueError, TypeError):
            continue
    # No format matched, so treat the value as missing
    return pd.NaT

df1_new['obs_date'] = df1_new['obs_date'].apply(str2time)
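
Since the question says a string column is acceptable, pandas date types can also be skipped entirely. The sketch below is an assumption-laden alternative rather than the answer's method: the helper `to_iso_date` and the `FORMATS` tuple are hypothetical names, and it relies only on the standard library, whose datetime supports years up to 9999.

```python
from datetime import datetime

# Hypothetical list of accepted input formats, tried in order
FORMATS = ('%m/%d/%Y %I:%M:%S %p', '%Y-%m-%d %H:%M:%S')

def to_iso_date(text, formats=FORMATS):
    """Return the date part of text as 'YYYY-MM-DD', or None if no format matches."""
    for fmt in formats:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except (ValueError, TypeError):
            # Wrong format or non-string input: try the next format
            continue
    return None

print(to_iso_date('7/23/2377  12:00:00 AM'))
print(to_iso_date('2277-07-27 00:00:00'))
print(to_iso_date('NA-NA-NA NA:NA:NA'))
```

Applied to the dataframe, this would look like df1_new['obs_date'].map(to_iso_date), which yields an object column of strings and None. Because ISO dates with four-digit years sort lexicographically, such a column still orders correctly.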
