Reading CSV file in Pandas with historical dates

Views: 127 Published: 2017/4/8 19:08:15 Tags: python date csv pandas

Problem description

I'm trying to read in a file with dates in the (UK) format 13/01/1800. However, some of the dates are before 1677 and so cannot be represented by the nanosecond timestamp (see http://pandas.pydata.org/pandas-docs/stable/gotchas.html#gotchas-timestamp-limits). I understand from that page that I need to create my own PeriodIndex to cover the range I need (see http://pandas.pydata.org/pandas-docs/stable/timeseries.html#timeseries-oob), but I can't work out how to convert the strings from the CSV reader into dates in this PeriodIndex.

So far I have:

import pandas as pd

span = pd.period_range('1000-01-01', '2100-01-01', freq='D')
df_earliest = pd.read_csv("objects.csv", index_col=0, names=['Object Id', 'Earliest Date'], parse_dates=[1], infer_datetime_format=True, dayfirst=True)

How do I apply the span to the date reader/converter so that I can create a PeriodIndex / DatetimeIndex column in the DataFrame?
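For context, here is a minimal sketch of the limitation the question describes. It assumes a pandas version in which `pd.Timestamp.min` and `pd.Timestamp.max` report the nanosecond-resolution bounds; the printed values are not shown because they may differ between versions. The sample date is taken from the answer's input file.

```python
import pandas as pd

# Bounds of the nanosecond-resolution Timestamp type
print(pd.Timestamp.min)
print(pd.Timestamp.max)

# A Period is stored as an integer count of its frequency unit (days here),
# so it can hold dates far earlier than the nanosecond Timestamp range.
p = pd.Period(year=1001, month=12, day=25, freq='D')
print(p.year, p.month, p.day)

# The span from the question covers this date
span = pd.period_range('1000-01-01', '2100-01-01', freq='D')
print(span[0] <= p <= span[-1])
```

The span itself is not needed for parsing; it only illustrates that daily periods can cover the whole range of interest.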

Recommended answer

You can try it this way:

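import pandas as pd

fn = r'D:\temp\.data\36987699.csv'

def dt_parse(s):
    # Split a UK-style DD/MM/YYYY string into its day, month and year parts
    d, m, y = s.split('/')
    # A daily Period is not limited to the nanosecond Timestamp range
    return pd.Period(year=int(y), month=int(m), day=int(d), freq='D')


# Note: date_parser is deprecated as of pandas 2.0
df = pd.read_csv(fn, parse_dates=[0], date_parser=dt_parse)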

Input file:

Date,col1
13/01/1800,aaa
25/12/1001,bbb
01/03/1267,ccc

Test:

In [16]: df
Out[16]:
        Date col1
0 1800-01-13  aaa
1 1001-12-25  bbb
2 1267-03-01  ccc

In [17]: df.dtypes
Out[17]:
Date    object
col1    object
dtype: object

In [18]: df['Date'].dt.year
Out[18]:
0    1800
1    1001
2    1267
Name: Date, dtype: int64
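The question also asks for a PeriodIndex on the DataFrame. One possible way to get there, sketched under some assumptions: the sample data is inlined with `io.StringIO` instead of being read from a file, and the `converters` argument of `read_csv` is used in place of `date_parser` (a substitution, not what the answer above does), since `converters` hands each raw string to the function directly.

```python
import io
import pandas as pd

def dt_parse(s):
    # Same parser as in the answer: DD/MM/YYYY -> daily Period
    d, m, y = s.split('/')
    return pd.Period(year=int(y), month=int(m), day=int(d), freq='D')

# Inline copy of the input file shown above (assumption: same layout)
csv_text = "Date,col1\n13/01/1800,aaa\n25/12/1001,bbb\n01/03/1267,ccc\n"

# converters applies dt_parse to every raw string in the Date column
df = pd.read_csv(io.StringIO(csv_text), converters={'Date': dt_parse})

# Promote the parsed Period objects to a PeriodIndex and sort chronologically
df.index = pd.PeriodIndex(df['Date'].tolist(), freq='D', name='Date')
df = df.drop(columns='Date').sort_index()

print(df)
print(df.index.year)
```

With a PeriodIndex in place, attributes such as `df.index.year` are available on the index itself.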

PS: you may want to add a try ... except block to the dt_parse() function to catch the ValueError exceptions raised by int()...
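A minimal sketch of what that error handling could look like. The name `dt_parse_safe`, the choice to return `pd.NaT` for unparseable values, and the inline sample data are all assumptions made for illustration; the function is wired in through `converters` here so that it receives each raw string.

```python
import io
import pandas as pd

def dt_parse_safe(s):
    """Parse a DD/MM/YYYY string into a daily Period; return NaT if malformed."""
    try:
        # Unpacking fails with ValueError if there are not exactly three parts,
        # and int() fails with ValueError on non-numeric text.
        d, m, y = s.split('/')
        return pd.Period(year=int(y), month=int(m), day=int(d), freq='D')
    except ValueError:
        return pd.NaT

# Hypothetical input containing one malformed date
sample = io.StringIO("Date,col1\n13/01/1800,aaa\n25/12/1001,bbb\nnot a date,ccc\n")
df = pd.read_csv(sample, converters={'Date': dt_parse_safe})

print(df)
print(df['Date'].isna())
```

Returning `pd.NaT` keeps the bad rows in the frame so they can be inspected or dropped later; re-raising with a clearer message would be a reasonable alternative if malformed dates should stop the load.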
