Find datetime instances in multiple datetime groups python

Question

I have two CSV files with timestamp data in str format. The first, CSV_1, holds data resampled from a pandas time series into 15-minute blocks and looks like:

time            ave_speed   
1/13/15 4:30    34.12318398 
1/13/15 4:45    0.83396195  
1/13/15 5:00    1.466816057

CSV_2 has regular times from GPS points, e.g.:

id      time            lat         lng
513620  1/13/15 4:31    -8.15949    118.26005
513667  1/13/15 4:36    -8.15215    118.25847
513668  1/13/15 5:01    -8.15211    118.25847

I'm trying to iterate through both files to find instances where a time in CSV_2 falls within a 15-minute time group in CSV_1, and then do something with it. In this case, append ave_speed to every entry for which this condition is true.

Desired result using the above examples:

id      time            lat         lng           ave_speed
513620  1/13/15 4:31    -8.15949    118.26005     0.83396195
513667  1/13/15 4:36    -8.15215    118.25847     0.83396195
513668  1/13/15 5:01    -8.15211    118.25847     something else
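
Reading the desired result above, each 15-minute group appears to be labelled by its end time (4:31 and 4:36 both take the value of the 4:45 row). Under that assumption, the group a timestamp belongs to can be found by rounding it up to the next 15-minute mark. This is only a sketch: the format string FMT is inferred from the sample rows, bin_label is a hypothetical helper name, and treating a timestamp that sits exactly on a boundary as part of that boundary's group is a choice the question does not settle.

```python
from datetime import datetime, timedelta

FMT = "%m/%d/%y %H:%M"  # assumed from the sample rows, e.g. "1/13/15 4:31"

def bin_label(ts, minutes=15):
    """Return the end time of the `minutes`-wide group that contains ts.

    A timestamp exactly on a boundary keeps that boundary as its label.
    """
    day_start = ts.replace(hour=0, minute=0, second=0, microsecond=0)
    step = timedelta(minutes=minutes)
    elapsed = ts - day_start
    # Ceiling division on timedeltas: -(-a // b)
    n_steps = -(-elapsed // step)
    return day_start + n_steps * step

label = bin_label(datetime.strptime("1/13/15 4:31", FMT))
print(label)  # 2015-01-13 04:45:00
```

With the sample data, 4:31 and 4:36 both map to the 4:45 row, while 5:01 maps to 5:15, a row the excerpt of CSV_1 does not show.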

I tried doing it solely in pandas dataframes but ran into some trouble, so I thought this might be a workaround to achieve what I'm after.

This is the code I've written so far. I feel like it's close, but I can't seem to nail the logic to get my for loop to return the entries within each 15-minute time group.

import csv
import datetime

with open('path/CSV_2.csv', newline='') as infile, open('path/CSV_1.csv', newline='') as newinfile:
    reader = csv.reader(infile)
    nreader = csv.reader(newinfile)
    next(nreader, None)  # skip the headers
    next(reader, None)  # skip the headers

    for row in nreader:
        for dfrow in reader:
            if (datetime.datetime.strptime(dfrow[2],'%Y-%m-%d %H:%M:%S') < datetime.datetime.strptime(row[0],'%Y-%m-%d %H:%M:%S') and
            datetime.datetime.strptime(dfrow[2],'%Y-%m-%d %H:%M:%S') > datetime.datetime.strptime(row[0],'%Y-%m-%d %H:%M:%S') - datetime.timedelta(minutes=15)):
                print(dfrow[2])

Link to the pandas question I posted about the same problem: Pandas, check if timestamp value exists in resampled 30 min time bin of datetimeindex

Edit:
If I create two lists of times, i.e. listOne with all the times from CSV_1 and listTwo with all the times from CSV_2, I'm able to find the instances in the time groups. So something is weird about using the CSV values directly. Any help would be appreciated.
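
One likely explanation for the difference between lists and CSV values: a csv.reader is a one-shot iterator, so an inner reader is used up during the first pass of the outer loop and yields nothing on later passes. The sketch below illustrates this with in-memory stand-ins for the two files (the io.StringIO contents are made up for the demonstration and are not the real data).

```python
import csv
import io

# Hypothetical in-memory stand-ins for the two files.
bins = io.StringIO("time,ave_speed\n1/13/15 4:45,0.83\n1/13/15 5:00,1.47\n")
points = io.StringIO("id,time\n513620,1/13/15 4:31\n513667,1/13/15 4:36\n")

bin_reader = csv.reader(bins)
point_reader = csv.reader(points)
next(bin_reader, None)    # skip the headers
next(point_reader, None)  # skip the headers

# Count how many inner rows each outer iteration actually sees.
seen_per_bin = [sum(1 for _ in point_reader) for _ in bin_reader]
print(seen_per_bin)  # [2, 0]: the second pass finds the reader exhausted

# Reading the rows into lists first lets every pass see every row.
points.seek(0)
bins.seek(0)
point_rows = list(csv.reader(points))[1:]
bin_rows = list(csv.reader(bins))[1:]
seen_per_bin_list = [sum(1 for _ in point_rows) for _ in bin_rows]
print(seen_per_bin_list)  # [2, 2]
```

Materialising the inner file with list(reader) before looping, or calling seek(0) on the file and recreating the reader on each outer pass, avoids the problem.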

Recommended answer

I feel like this is pretty close to what I want, in case anyone is curious about how to do the same thing. It's not very efficient: because of the double loop, the current script takes roughly a day to iterate over all the rows multiple times.

If anyone has any thoughts on how to make this easier or quicker, I'd be very interested.

import csv

# Open the CSV files
with open('/GPS_Timepoints.csv', newline='') as infile, open('/Resampled.csv', newline='') as newinfile:
    reader = csv.reader(infile)
    nreader = csv.reader(newinfile)
    next(nreader, None)  # skip the headers
    next(reader, None)  # skip the headers

    # Dict comprehension to keep only the desired data (group time -> speed),
    # sorted so that x[i-1] and x[i] are consecutive time groups.
    # The string comparisons below only order correctly for sortable
    # timestamps such as '%Y-%m-%d %H:%M:%S'.
    checkDates = {row[0]: row[7] for row in nreader}
    x = sorted(checkDates.items())

    # Read the GPS rows into a list (seemed easier than reading directly from the CSV file; I don't know if it's faster)
    csvDates = list(reader)

    # Loop 1: iterate over consecutive time groups in the resampled data, with a print statement to give me hope the program is running
    for i in range(1, len(x)):
        print('checking', i)
        # Test whether the time is in the time range; if true, insert the desired attribute (in this case speed) into the row
        for row in csvDates:
            if row[2] > x[i-1][0] and row[2] < x[i][0]:
                row.insert(9, x[i][1])

    # Write the result to CSV to upload into GIS
    with open('/result.csv', mode="w", newline='') as outfile:

        wr = csv.writer(outfile)
        wr.writerow(['id','boat_id','time','state','lat','lng','activity','speed', 'state_reason'])

        for row in csvDates:
            wr.writerow(row)
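
On making this quicker: one possible approach, not part of the original answer, is to parse the timestamps once, sort the group end times, and use a binary search per GPS row instead of scanning every row for every group. The sketch below uses the standard bisect module. It assumes that each resampled row is labelled by the end of its 15-minute group, that timestamps look like the sample rows (FMT is inferred from them), and that the time sits in column 1 of each GPS row as in the question's sample; attach_speed is a hypothetical helper name.

```python
import bisect
from datetime import datetime, timedelta

FMT = "%m/%d/%y %H:%M"  # assumed timestamp format, taken from the sample rows

def attach_speed(bin_rows, point_rows, width=timedelta(minutes=15)):
    """Append the matching group's speed to each GPS row.

    bin_rows:   (time_str, ave_speed) pairs, each labelled by its group's END time.
    point_rows: lists whose element 1 is the time string (assumed layout).
    """
    bins = sorted((datetime.strptime(t, FMT), speed) for t, speed in bin_rows)
    edges = [edge for edge, _ in bins]
    out = []
    for row in point_rows:
        ts = datetime.strptime(row[1], FMT)
        i = bisect.bisect_left(edges, ts)  # index of the first group end >= ts
        hit = i < len(edges) and edges[i] - ts < width
        out.append(list(row) + [bins[i][1] if hit else ""])
    return out

# Sample rows copied from the question.
bin_rows = [("1/13/15 4:30", "34.12318398"),
            ("1/13/15 4:45", "0.83396195"),
            ("1/13/15 5:00", "1.466816057")]
point_rows = [["513620", "1/13/15 4:31", "-8.15949", "118.26005"],
              ["513667", "1/13/15 4:36", "-8.15215", "118.25847"],
              ["513668", "1/13/15 5:01", "-8.15211", "118.25847"]]
result = attach_speed(bin_rows, point_rows)
```

Each lookup costs a logarithmic number of comparisons rather than a full pass over the other file, so the total work grows roughly with the number of rows times the log of the number of groups. Rows with no matching group (here 5:01, whose group is not in the excerpt) get an empty string.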
