Pandas read_csv not recognizing ISO8601 as datetime dtype
Question
I am currently using pandas to read a CSV file into a DataFrame, using the first column as the index. The first column is in ISO 8601 format, so according to the documentation for read_csv, it should be recognized as a datetime:
In [1]: import pandas as pd
In [2]: df = pd.read_csv('data.csv', index_col=0)
In [3]: print(df.head())
                        U      V      Z    Ubar    Udir
2014-11-01 00:00:00  0.73  -0.81   0.46  1.0904  317.97
2014-11-01 01:00:00  1.26  -1.50   0.32  1.9590  319.97
2014-11-01 02:00:00  1.50  -1.80   0.13  2.3431  320.19
2014-11-01 03:00:00  1.39  -1.65   0.03  2.1575  319.89
2014-11-01 04:00:00  0.94  -1.08  -0.03  1.4318  318.96
However, when I query the index dtype, it returns 'object':
In [4]: print(df.index.dtype)
object
I then have to manually convert it to a datetime dtype:
In [5]: df.index = pd.to_datetime(df.index)
In [6]: print(df.index.dtype)
datetime64[ns]
Is there any way to have the index automatically set to a datetime dtype when calling read_csv()?
Recommended answer
The read_csv documentation describes the parse_dates parameter:
parse_dates : boolean or list of ints or names or list of lists or dict, default False
- boolean. If True -> try parsing the index.
- list of ints or names. e.g. If [1, 2, 3] -> try parsing columns 1, 2, 3 each as a separate date column.
- list of lists. e.g. If [[1, 3]] -> combine columns 1 and 3 and parse as a single date column.
- dict, e.g. {'foo' : [1, 3]} -> parse columns 1, 3 as date and call result 'foo'
Note: A fast-path exists for iso8601-formatted dates.
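As a small illustration of the "list of ints or names" form quoted above (this is not part of the original answer), the sketch below parses a date column that is not the index. The CSV text and the column names station and observed_at are made up for the example, and io.StringIO stands in for a file on disk.

```python
import io

import pandas as pd

# Hypothetical sample data; the column names are assumptions for illustration.
csv_text = """station,observed_at,U
A,2014-11-01 00:00:00,0.73
B,2014-11-01 01:00:00,1.26
"""

# Select the date column by name ...
by_name = pd.read_csv(io.StringIO(csv_text), parse_dates=["observed_at"])

# ... or by zero-based position (observed_at is column 1).
by_position = pd.read_csv(io.StringIO(csv_text), parse_dates=[1])

# Check whether each column ended up with a datetime dtype.
print(pd.api.types.is_datetime64_any_dtype(by_name["observed_at"]))
print(pd.api.types.is_datetime64_any_dtype(by_position["observed_at"]))
```

Columns not named in parse_dates, such as station here, are left as they are.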
Since you want to parse the index, you can use:
import pandas as pd
df = pd.read_csv('data.csv', index_col=0, parse_dates=True)
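To check the effect without needing data.csv, here is a minimal self-contained sketch that reads the same text twice, once without and once with parse_dates=True. The in-memory CSV and the column name time are assumptions standing in for the file from the question.

```python
import io

import pandas as pd

# Hypothetical stand-in for data.csv; "time" is an assumed column name.
csv_text = """time,U,V
2014-11-01 00:00:00,0.73,-0.81
2014-11-01 01:00:00,1.26,-1.50
"""

# Without parse_dates the index values are kept as plain strings.
plain = pd.read_csv(io.StringIO(csv_text), index_col=0)

# With parse_dates=True pandas tries to parse the index as dates.
parsed = pd.read_csv(io.StringIO(csv_text), index_col=0, parse_dates=True)

print(isinstance(plain.index, pd.DatetimeIndex))
print(isinstance(parsed.index, pd.DatetimeIndex))
print(parsed.index.dtype)
```

If the index cannot be parsed as dates, read_csv typically leaves it unparsed rather than raising, so checking the resulting dtype as above is a reasonable sanity check.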