Finding missing data in a timestamp
Question
I am trying to look through a CSV file, and I want to make sure all the data is there. The CSV has one row every 15 minutes, and the time format is yyyy-mm-dd-hh:mm. I have collected the data and built a timestamp for each row.
lst = list()
with open("CHFJPY15.csv", "r") as f:
    f_r = f.read()
sline = f_r.split()
for line in sline:
    parts = line.split(',')
    date = parts[0]
    time = parts[1]
    closeingtime = parts[5]
    timestamp = date + time + closeingtime
    lst.append(timestamp)
print(lst, "liste")
As seen below, the CSV is just a long list of data. Again, I really want to check that the data is there for every 15-minute interval, but I don't know exactly how to code it.
'2015.12.09.19:45 123.251', '2015.12.09.20:00 123.188', '2015.12.09.20: 15123.192', '2015.12.09.20:30 123.242', '2015.12.09.20: 45123.166', .. etc..
Recommended answer
You might not have noticed that the items in that data list are inconsistently formatted. For instance, there is white space between the date and the other data in '2015.12.09.19:45 123.251', but the gap is placed differently in '2015.12.09.20: 45123.166'. I'm going to assume that you will deal with that.
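One possible way to deal with it, as a rough sketch only: this assumes every item is a 16-character timestamp of the form YYYY.MM.DD.HH:MM followed by the price, with stray spaces that can appear anywhere. The name normalize_item is made up for illustration.

```python
def normalize_item(item):
    """Return (timestamp, price) from an item with inconsistent spacing.

    Assumes the timestamp is always 'YYYY.MM.DD.HH:MM' (16 characters)
    and that everything after it is the price.
    """
    compact = "".join(item.split())  # drop every whitespace character
    return compact[:16], compact[16:]


items = ['2015.12.09.19:45 123.251', '2015.12.09.20: 15123.192']
cleaned = [normalize_item(item) for item in items]
print(cleaned)
```

Splitting at a fixed offset only works while the timestamp width never changes; if it can vary, building the timestamp with explicit separators when reading the CSV would be the safer route.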
I begin by creating a consistently formatted list of data items similar to yours. Although most of the dates are separated by fifteen-minute intervals, I deliberately put in some gaps.
>>> from datetime import timedelta
>>> interval = timedelta(minutes=15)
>>> from datetime import datetime
>>> current_time = datetime(2015,12,9,19,30)
>>> data = []
>>> omits = [3,5,9,11,17]
>>> for i in range(20):
...     current_time += interval
...     if i in omits:
...         continue
...     data.append(current_time.strftime('%y.%m.%d.%H:%M')+' 123.456')
...
>>> data
['15.12.09.19:45 123.456', '15.12.09.20:00 123.456', '15.12.09.20:15 123.456', '15.12.09.20:45 123.456', '15.12.09.21:15 123.456', '15.12.09.21:30 123.456', '15.12.09.21:45 123.456', '15.12.09.22:15 123.456', '15.12.09.22:45 123.456', '15.12.09.23:00 123.456', '15.12.09.23:15 123.456', '15.12.09.23:30 123.456', '15.12.09.23:45 123.456', '15.12.10.00:15 123.456', '15.12.10.00:30 123.456']
Now I read through the dates, subtracting from each one its predecessor. I set the first 'predecessor', which I call previous, to now, because that is bound to differ from the other dates.
I split each datum from the list into two, ignoring the second piece. Using strptime I turn the strings into datetime objects, which can be subtracted and the differences compared.
>>> previous = datetime.now().strftime('%y.%m.%d.%H:%M')
>>> first = True
>>> for d in data:
...     date_part, other = d.split(' ')
...     if datetime.strptime(date_part, '%y.%m.%d.%H:%M') - datetime.strptime(previous, '%y.%m.%d.%H:%M') != interval:
...         if not first:
...             'unacceptable gap prior to ', date_part
...         else:
...             first = False
...     previous = date_part
...
('unacceptable gap prior to ', '15.12.09.20:45')
('unacceptable gap prior to ', '15.12.09.21:15')
('unacceptable gap prior to ', '15.12.09.22:15')
('unacceptable gap prior to ', '15.12.09.22:45')
('unacceptable gap prior to ', '15.12.10.00:15')
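The same idea can be written as a small function that lists the missing slots themselves instead of only flagging where a gap ends. This is a sketch rather than the code above: find_missing is a hypothetical name, and it assumes the items are sorted and consistently formatted like the data list, with the same two-digit-year format.

```python
from datetime import datetime, timedelta

FMT = '%y.%m.%d.%H:%M'  # same format as the sample data above


def find_missing(data, interval=timedelta(minutes=15), fmt=FMT):
    """Return the timestamps missing between consecutive items.

    Assumes data is sorted and each item is '<timestamp> <value>'.
    """
    missing = []
    previous = None
    for item in data:
        current = datetime.strptime(item.split()[0], fmt)
        if previous is not None:
            # Walk forward one interval at a time until we reach the
            # current item; every step we take on the way is a missing slot.
            expected = previous + interval
            while expected < current:
                missing.append(expected.strftime(fmt))
                expected += interval
        previous = current
    return missing


sample = ['15.12.09.20:15 123.456', '15.12.09.20:45 123.456', '15.12.09.21:30 123.456']
print(find_missing(sample))
```

Starting with previous set to None avoids the need for a first flag. Note that any period in which the source legitimately produces no rows would also be reported as missing, so real data may need such periods filtered out.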