Convert datetime to milliseconds since midnight (UTC or localized) in a CSV file using Pandas


Problem description

import pandas as pd
import numpy as np
from datetime import datetime, time


# history file and batch size for processing.

historyFilePath = 'EURUSD.SAMPLE.csv'
batch_size = 5000


# function for date parsing
# uses datetime.datetime directly; the pd.datetime alias was removed in later pandas releases
dateparse = lambda x: datetime.strptime(x, '%Y-%m-%d %H:%M:%S.%f')


# load data into a pandas iterator with all the chunks
ratesFromCSVChunks = pd.read_csv(historyFilePath, index_col=0, engine='python', parse_dates=True,
                                 date_parser=dateparse, header=None,
                                 names=["datetime", "1_Current", "2_BidPx", "3_BidSz", "4_AskPx", "5_AskSz"],
                                 iterator=True,
                                 chunksize=batch_size)



# concatenate chunks to get the final array
ratesFromCSV = pd.concat([chunk for chunk in ratesFromCSVChunks])

# save final csv file
ratesFromCSV.to_csv('EURUSD_processed.csv', date_format='%Y-%m-%d %H:%M:%S.%f',
             columns=['1_Current', '2_BidPx', '3_BidSz', '4_AskPx', '5_AskSz'], header=False, float_format='%.5f')

I am reading a CSV file that contains:

    2014-08-17 17:00:01.000000,1.33910,1.33910,1.00000,1.33930,1.00000
    2014-08-17 17:00:01.000000,1.33910,1.33910,1.00000,1.33950,1.00000
    2014-08-17 17:00:02.000000,1.33910,1.33910,1.00000,1.33930,1.00000
    2014-08-17 17:00:02.000000,1.33900,1.33900,1.00000,1.33940,1.00000
    2014-08-17 17:00:04.000000,1.33910,1.33910,1.00000,1.33950,1.00000
    2014-08-17 17:00:05.000000,1.33930,1.33930,1.00000,1.33950,1.00000
    2014-08-17 17:00:06.000000,1.33920,1.33920,1.00000,1.33960,1.00000
    2014-08-17 17:00:06.000000,1.33910,1.33910,1.00000,1.33950,1.00000
    2014-08-17 17:00:08.000000,1.33900,1.33900,1.00000,1.33942,1.00000
    2014-08-17 17:00:16.000000,1.33900,1.33900,1.00000,1.33940,1.00000

How do I convert the datetime values in the CSV file (or in the pandas DataFrame read from it) to milliseconds since midnight, with midnight (UTC or localized) as the epoch, by the time the file is saved? Each file starts at midnight every day. The only thing that should change is the datetime column, which becomes milliseconds since that day's midnight (UTC or localized). The format I am looking for is:

    43264234, 1.33910,1.33910,1.00000,1.33930,1.00000
    43264739, 1.33910,1.33910,1.00000,1.33950,1.00000
    43265282, 1.33910,1.33910,1.00000,1.33930,1.00000
    43265789, 1.33900,1.33900,1.00000,1.33940,1.00000
    43266318, 1.33910,1.33910,1.00000,1.33950,1.00000
    43266846, 1.33930,1.33930,1.00000,1.33950,1.00000
    43267353, 1.33920,1.33920,1.00000,1.33960,1.00000
    43267872, 1.33910,1.33910,1.00000,1.33950,1.00000
    43268387, 1.33900,1.33900,1.00000,1.33942,1.00000
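
To make the target value concrete: milliseconds since midnight is just the time-of-day part of the timestamp collapsed into a single integer. A minimal sketch using only the standard library follows; the helper name `ms_since_midnight` is made up for illustration and is not part of the original code.

```python
from datetime import datetime


def ms_since_midnight(ts):
    """Return whole milliseconds elapsed since 00:00:00 on ts's own date."""
    midnight = ts.replace(hour=0, minute=0, second=0, microsecond=0)
    delta = ts - midnight
    # delta.days is 0 here, so seconds + microseconds cover the whole interval
    return delta.seconds * 1000 + delta.microseconds // 1000


# first row of the sample input shown earlier
ts = datetime.strptime('2014-08-17 17:00:01.000000', '%Y-%m-%d %H:%M:%S.%f')
print(ms_since_midnight(ts))  # 61201000
```

Note that the sample values listed in the desired format do not line up with the 17:00 input rows (43264234 ms is 12:01:04.234), so they appear to illustrate the layout only.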

Any help is much appreciated (ideally short and precise, for Python 3.4/3.5 or later with Pandas 0.18.1 and numpy 1.11).

Recommended answer

This snippet of code should be what you want:

# Create some fake data, similar to yours

import pandas as pd
s = pd.Series(pd.date_range('2014-08-17 17:00:01.1230000', periods=4))
print(s)
print(type(s[0]))

# Create a new series using just the date portion of the original data.
# This effectively truncates the time portion. 
# Can't use d = s.dt.date or you'll get date objects back, not datetime64.

d = pd.to_datetime(s.dt.date)
print(d)
print(type(d[0]))

# Calculate the time delta between the original datetime and 
# just the date portion. This is the elapsed time since your epoch.

delta_t = s-d
print(delta_t)

# Display the elapsed time as seconds.

print(delta_t.dt.total_seconds())

This produces the following output:

0   2014-08-17 17:00:01.123
1   2014-08-18 17:00:01.123
2   2014-08-19 17:00:01.123
3   2014-08-20 17:00:01.123
dtype: datetime64[ns]
<class 'pandas.tslib.Timestamp'>
0   2014-08-17
1   2014-08-18
2   2014-08-19
3   2014-08-20
dtype: datetime64[ns]
<class 'pandas.tslib.Timestamp'>
0   17:00:01.123000
1   17:00:01.123000
2   17:00:01.123000
3   17:00:01.123000
dtype: timedelta64[ns]
0    61201.123
1    61201.123
2    61201.123
3    61201.123
dtype: float64
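
The output above is in seconds as floats, while the question asks for integer milliseconds written back to CSV. The following is a rough sketch of how that last step might look, not a definitive implementation. Assumptions: a reasonably recent pandas in which a timedelta index can be floor-divided by `pd.Timedelta` (this may not hold on 0.18.1); an in-memory sample standing in for `EURUSD.SAMPLE.csv`; and a hypothetical helper `ms_since_midnight` whose optional `tz` argument treats the naive timestamps as UTC before converting to a local zone.

```python
import io

import pandas as pd

# Two rows copied from the question's sample, standing in for the real file.
SAMPLE = (
    "2014-08-17 17:00:01.000000,1.33910,1.33910,1.00000,1.33930,1.00000\n"
    "2014-08-17 17:00:02.000000,1.33900,1.33900,1.00000,1.33940,1.00000\n"
)
COLUMNS = ["datetime", "1_Current", "2_BidPx", "3_BidSz", "4_AskPx", "5_AskSz"]


def ms_since_midnight(index, tz=None):
    """Integer milliseconds since midnight for each entry of a DatetimeIndex.

    Hypothetical helper. If tz is given, naive timestamps are assumed to be
    UTC and are converted to tz first, so "midnight" means local midnight.
    """
    if tz is not None:
        index = index.tz_localize("UTC").tz_convert(tz)
    # normalize() sets the time of day to 00:00:00, keeping the date
    delta = index - index.normalize()
    # floor-divide the elapsed time by one millisecond to get integers
    return delta // pd.Timedelta(milliseconds=1)


rates = pd.read_csv(io.StringIO(SAMPLE), header=None, names=COLUMNS,
                    index_col=0, parse_dates=True)
rates.index = ms_since_midnight(rates.index)

out = io.StringIO()
rates.to_csv(out, header=False, float_format="%.5f")
print(out.getvalue())
```

With a real file, the `io.StringIO` objects would be replaced by the input and output paths. If the file is read in chunks as in the question, the same helper can be applied to each chunk's index before concatenating. For local-time results, something like `ms_since_midnight(rates.index, tz="America/New_York")` could be used instead; on days with a daylight-saving transition the elapsed time since local midnight differs from the wall-clock time, so that case deserves a separate check.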
