Finding missing data in a timestamp


Question

I am trying to look through a CSV file, and I want to make sure all the data are there. The CSV holds 15-minute data, and the time format is yyyy-mm-dd-hh:mm. I have collected the data and built a timestamp for each row.

lst = list()

with open("CHFJPY15.csv", "r") as f:
    f_r = f.read()

    sline = f_r.split()

    for line in sline:
        parts = line.split(',')
        date = parts[0]
        time = parts[1]
        closeingtime = parts[5]

        timestamp = date + time + closeingtime

        lst.append(timestamp)
print(lst, "liste")

As seen below, the result is just a long list of data. Again, I really want to check that data is present for every 15-minute step, but I don't know exactly how to code it.

'2015.12.09.19:45 123.251', '2015.12.09.20:00 123.188', '2015.12.09.20: 15123.192', '2015.12.09.20:30 123.242', '2015.12.09.20: 45123.166', ... etc.

Recommended answer

You might not have noticed that the items in that data list are inconsistently formatted. For instance, there is white space between the date and the other data in 2015.12.09.19:45 123.251, but the gap is placed differently in 2015.12.09.20: 45123.166. I'm going to assume that you will deal with that.
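One possible way to sidestep the spacing problem is to parse the date, time and closing price into separate values instead of concatenating strings. The following is only a sketch: the function name read_rows, the sample rows, and the assumed column layout (date in column 0 as YYYY.MM.DD, time in column 1 as HH:MM, close in column 5, mirroring the indices used in the question) are illustrative assumptions, not details confirmed by the actual file.

```python
import csv
from datetime import datetime

def read_rows(lines, date_col=0, time_col=1, close_col=5):
    """Parse CSV lines into (datetime, close) pairs.

    Assumes the date looks like 2015.12.09 and the time like 19:45;
    adjust the format string if the real file differs.
    """
    rows = []
    for parts in csv.reader(lines):
        if not parts:  # skip blank lines
            continue
        stamp = parts[date_col].strip() + " " + parts[time_col].strip()
        when = datetime.strptime(stamp, "%Y.%m.%d %H:%M")
        rows.append((when, float(parts[close_col])))
    return rows

# Made-up sample rows standing in for the contents of CHFJPY15.csv.
sample = [
    "2015.12.09,19:45,123.2,123.3,123.1,123.251,100",
    "2015.12.09,20:00,123.2,123.3,123.1,123.188,120",
]
for when, close in read_rows(sample):
    # Formatting happens in one place, so every item has the same layout.
    print(when.strftime("%Y.%m.%d.%H:%M"), close)
```

With a real file, the open file object can be passed directly in place of the sample list, since csv.reader accepts any iterable of lines.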

I begin by creating a consistently formatted list of data items similar to yours. Although most of the dates are separated by fifteen-minute intervals, I deliberately left some gaps. Note that this sample uses a two-digit year (%y), whereas your data has a four-digit year.

>>> from datetime import timedelta
>>> interval = timedelta(minutes=15)
>>> from datetime import datetime
>>> current_time = datetime(2015,12,9,19,30)
>>> data = []
>>> omits = [3,5,9,11,17]
>>> for i in range(20):
...     current_time += interval
...     if i in omits:
...         continue
...     data.append(current_time.strftime('%y.%m.%d.%H:%M')+' 123.456')
...     
>>> data
['15.12.09.19:45 123.456', '15.12.09.20:00 123.456', '15.12.09.20:15 123.456', '15.12.09.20:45 123.456', '15.12.09.21:15 123.456', '15.12.09.21:30 123.456', '15.12.09.21:45 123.456', '15.12.09.22:15 123.456', '15.12.09.22:45 123.456', '15.12.09.23:00 123.456', '15.12.09.23:15 123.456', '15.12.09.23:30 123.456', '15.12.09.23:45 123.456', '15.12.10.00:15 123.456', '15.12.10.00:30 123.456']

Now I read through the dates, subtracting each from its predecessor. I initialise the first 'predecessor', which I call previous, to the current time (now), because that is bound to differ from the dates in the data.

I split each datum from the list into two pieces and ignore the second. Using strptime I turn the strings into datetime objects. These can be subtracted, and each difference is then compared with the expected interval.

>>> previous = datetime.now().strftime('%y.%m.%d.%H:%M')
>>> first = True
>>> for d in data:
...     date_part, other = d.split(' ')
...     if datetime.strptime(date_part, '%y.%m.%d.%H:%M') - datetime.strptime(previous, '%y.%m.%d.%H:%M') != interval:
...         if not first:
...             'unacceptable gap prior to ', date_part
...         else:
...             first = False
...     previous = date_part
...     
('unacceptable gap prior to ', '15.12.09.20:45')
('unacceptable gap prior to ', '15.12.09.21:15')
('unacceptable gap prior to ', '15.12.09.22:15')
('unacceptable gap prior to ', '15.12.09.22:45')
('unacceptable gap prior to ', '15.12.10.00:15')
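The session above relies on the interactive prompt echoing the tuple. As a script, the same idea could be sketched as below, this time listing the missing timestamps themselves rather than only flagging where a gap occurs. The function name find_missing and the four-digit-year format string are assumptions made for illustration, and the input is assumed to be sorted in ascending order.

```python
from datetime import datetime, timedelta

def find_missing(stamps, fmt="%Y.%m.%d.%H:%M", interval=timedelta(minutes=15)):
    """Return the timestamps absent between consecutive entries.

    Assumes stamps are sorted ascending and all match fmt.
    """
    times = [datetime.strptime(s, fmt) for s in stamps]
    missing = []
    # Compare each timestamp with the one before it.
    for previous, current in zip(times, times[1:]):
        expected = previous + interval
        # Step forward one interval at a time until we reach the next entry.
        while expected < current:
            missing.append(expected.strftime(fmt))
            expected += interval
    return missing

stamps = [
    "2015.12.09.19:45",
    "2015.12.09.20:00",
    "2015.12.09.20:45",
    "2015.12.09.21:00",
]
print(find_missing(stamps))
# ['2015.12.09.20:15', '2015.12.09.20:30']
```

Pairing the list with itself via zip removes the need for the first flag and the dummy previous value. If the file covers periods in which no data is recorded at all, those periods will also be reported as missing, so they may need to be filtered out separately.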
