How to index a datetime column from imported csv file - pandas

Views: 751 Published: 2017/2/26 15:34:58 Tags: csv datetime pandas indexing concatenation


Question

I am trying to merge and append different time series, importing them from csv files. I have tried the following basic code:

import pandas as pd
import numpy as np
import glob
import csv
import os

path = r'./A08_csv'     # use your path
#all_files = glob.glob(os.path.join(path, "A08_B1_T5.csv"))

df5 = pd.read_csv('./A08_csv/A08_B1_T5.csv', parse_dates={'Date Time'})
df6 = pd.read_csv('./A08_csv/A08_B1_T6.csv', parse_dates={'Date Time'})

print len(df5)
print len(df6)

df = pd.concat([df5],[df6], join='outer')
print len(df)

The result is:

12755 (df5)
24770 (df6)
12755 (df)

Shouldn't df be at least as long as the longest of the two files (which have lots of rows in common, in terms of values in the ['Date Time'] column)?

I have tried to index the data based on datetime, adding this line:

#df5.set_index(pd.DatetimeIndex(df5['Date Time']))

But I get the error:

KeyError: 'Date Time'

Any clue as to why this happens?

Recommended answer

I think you need:

df5.set_index(['Date Time'], inplace=True)
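
A minimal sketch of this approach, using made-up inline data in place of the asker's csv files. It assumes the header really is Date Time; printing the column names first is a quick way to confirm that before indexing. Note that set_index returns a new DataFrame unless inplace=True is passed, so the result is assigned back here.

```python
import io
import pandas as pd

# Hypothetical stand-in for the contents of A08_B1_T5.csv
csv_text = u"""Date Time,a
2010-01-27 16:00:00,2.0
2010-01-27 16:10:00,2.2
2010-01-27 16:30:00,1.7"""

# Parse the 'Date Time' column as datetimes while reading
df5 = pd.read_csv(io.StringIO(csv_text), parse_dates=['Date Time'])

# Check the exact column names before using one as the index
print(df5.columns.tolist())

# set_index returns a new frame, so assign it back
df5 = df5.set_index('Date Time')
print(type(df5.index).__name__)
```

If the printed column list does not contain 'Date Time' exactly as typed, set_index will raise a KeyError for that name.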

Or better, add the index_col parameter in read_csv (http://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_csv.html):

import pandas as pd
import io

temp=u"""Date Time,a
2010-01-27 16:00:00,2.0
2010-01-27 16:10:00,2.2
2010-01-27 16:30:00,1.7"""

df = pd.read_csv(io.StringIO(temp), index_col=['Date Time'], parse_dates=['Date Time'])
print (df)
                       a
Date Time               
2010-01-27 16:00:00  2.0
2010-01-27 16:10:00  2.2
2010-01-27 16:30:00  1.7

print (df.index)
DatetimeIndex(['2010-01-27 16:00:00', '2010-01-27 16:10:00',
               '2010-01-27 16:30:00'],
              dtype='datetime64[ns]', name='Date Time', freq=None)

Another solution is to pass the column to these parameters by position: if the Date Time column is first, pass 0 to index_col and parse_dates (Python counts from 0):

import pandas as pd
import io


temp=u"""Date Time,a
2010-01-27 16:00:00,2.0
2010-01-27 16:10:00,2.2
2010-01-27 16:30:00,1.7"""

df = pd.read_csv(io.StringIO(temp), index_col=0, parse_dates=[0])
print (df)
                       a
Date Time               
2010-01-27 16:00:00  2.0
2010-01-27 16:10:00  2.2
2010-01-27 16:30:00  1.7

print (df.index)
DatetimeIndex(['2010-01-27 16:00:00', '2010-01-27 16:10:00',
               '2010-01-27 16:30:00'],
              dtype='datetime64[ns]', name='Date Time', freq=None)
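
On the length question: pd.concat expects all frames in a single list as its first argument, so in pd.concat([df5],[df6], join='outer') the second list is not treated as data to concatenate. The sketch below uses made-up data standing in for the two files; it stacks two datetime-indexed frames and then keeps one row per timestamp. Keeping the first occurrence is an assumption, since the question does not say how overlapping rows should be resolved.

```python
import io
import pandas as pd

# Hypothetical stand-ins for the two csv files; one timestamp overlaps
t5 = u"""Date Time,a
2010-01-27 16:00:00,2.0
2010-01-27 16:10:00,2.2"""

t6 = u"""Date Time,a
2010-01-27 16:10:00,2.2
2010-01-27 16:30:00,1.7
2010-01-27 16:40:00,1.9"""

df5 = pd.read_csv(io.StringIO(t5), index_col='Date Time', parse_dates=['Date Time'])
df6 = pd.read_csv(io.StringIO(t6), index_col='Date Time', parse_dates=['Date Time'])

# Both frames go in ONE list; this stacks the rows of df5 and df6
df = pd.concat([df5, df6])

# Keep the first row for each timestamp, then order by time
df = df[~df.index.duplicated(keep='first')].sort_index()

print(len(df5), len(df6), len(df))  # 2 3 4
```

With this approach the combined frame has one row per distinct timestamp, so it is at least as long as the longer input.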
