pandas.to_datetime gives OutOfBoundsDatetime Error

Problem description

I have data in some format which I want to read into a pandas.DataFrame. Some rows give me an error. Below is a minimal example for one of those strings, but I have several where it does not work (and, strangely enough, some where it does).

The exact error is:

OutOfBoundsDatetime, Out of bounds nanosecond timestamp: 2276-02-18 05:15:13

import pandas as pd 
pd.to_datetime('02/18/2276 5:15:13 AM', format='%m/%d/%Y %I:%M:%S %p')

I used this page to build my format string: https://pandas.pydata.org/pandas-docs/stable/generated/pandas.Period.strftime.html

Thanks for your help!

Recommended answer

This is out of bounds because the datetime dtype is datetime64[ns], which has an upper bound in the year 2262 (pd.Timestamp.max is 2262-04-11 23:47:16.854775807); see the docs. If you change the resolution to a coarser one, it can handle this datetime, but in pandas versions before 2.0 you can't do this within pandas, unfortunately. There, datetimes are stored natively as datetime64[ns], so you'd have to do this within numpy or using a normal datetime.
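As a rough sketch of the "numpy or a normal datetime" route, the snippet below parses the string from the question with the standard library and then converts it to a second-resolution NumPy value. The variable names are illustrative only, and keeping the values in an object-dtype Series is an assumption about how you might want to hold them in pandas.

```python
from datetime import datetime

import numpy as np
import pandas as pd

raw = "02/18/2276 5:15:13 AM"
fmt = "%m/%d/%Y %I:%M:%S %p"

# Largest value a nanosecond-resolution Timestamp can hold (in April 2262).
print(pd.Timestamp.max)

# A plain Python datetime covers years 1 through 9999, so this parse succeeds.
parsed = datetime.strptime(raw, fmt)
print(parsed)

# NumPy can hold the same instant at second resolution.
as_seconds = np.datetime64(parsed, "s")
print(as_seconds)

# Assumption: an object-dtype Series keeps the Python datetime untouched,
# at the cost of losing the vectorised .dt accessor.
series = pd.Series([parsed], dtype=object)
print(series.iloc[0].year)
```

Values stored this way behave like ordinary Python objects, so comparisons and arithmetic still work element by element, just more slowly than on a native datetime column.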

Another method is to store the year in a separate column if it's outside of the bounds, and set the year in the datetime itself to 1900 or some other placeholder that indicates the real year is out of bounds.

However, this has performance issues, as you lose some vectorised operations.
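The separate-year-column idea could look something like the sketch below. The column names (timestamp, true_year, out_of_bounds), the second sample row, and the cutoff of 2262 are all assumptions made for illustration, and only the upper bound is handled.

```python
import pandas as pd

raw = pd.Series(["02/18/2276 5:15:13 AM", "03/01/2020 11:00:00 PM"])
fmt = "%m/%d/%Y %I:%M:%S %p"

# Pull the four-digit year out of each string.
year = raw.str.extract(r"/(\d{4}) ", expand=False).astype(int)

# Assumed cutoff: 2262 is only partly representable, so treat it as out of range.
out_of_bounds = year >= 2262

# Swap the out-of-range year for the placeholder so pandas can parse the string.
placeholder = "1900"
patched = raw.where(
    ~out_of_bounds,
    raw.str.replace(r"/\d{4} ", f"/{placeholder} ", regex=True),
)

df = pd.DataFrame({
    "timestamp": pd.to_datetime(patched, format=fmt),
    "true_year": year,
    "out_of_bounds": out_of_bounds,
})
print(df)
```

One caveat with 1900 as the placeholder: it is not a leap year, so a February 29 date from an out-of-range leap year would fail to parse. A leap-year placeholder such as 2000 would avoid that.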
