How to deal with this complex logic in python pandas?

Views: 294  Published: 2017/3/26 0:21:45  Tags: python-2.7 pandas dataframe


Problem description

I have some data with the following structure. It is stored in a pandas DataFrame that I named df.

Data1,Data2,Flag
2016-04-29,00:40:15,1
2016-04-29,00:40:24,2
2016-04-29,00:40:35,2
2015-04-29,00:40:36,2
2015-04-29,00:40:43,2
2015-04-29,00:40:45,2
2015-04-29,00:40:55,1
2015-04-29,00:41:05,1
2015-04-29,00:41:16,1
2015-04-29,00:41:17,2
.....................
.....................
2016-11-29,11:52:36,2
2016-11-29,11:52:43,2
2016-11-29,11:52:45,2
2016-11-29,11:52:55,1
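
For anyone who wants to reproduce the problem, the first ten sample rows above could be loaded roughly like this. This is only a sketch: the question does not say how df was created, so reading from an in-memory string with io.StringIO and converting Data2 with pd.to_timedelta are assumptions, and the elided rows are left out.

```python
import io

import pandas as pd

# The ten leading rows shown in the question; the elided rows are omitted.
SAMPLE = """Data1,Data2,Flag
2016-04-29,00:40:15,1
2016-04-29,00:40:24,2
2016-04-29,00:40:35,2
2015-04-29,00:40:36,2
2015-04-29,00:40:43,2
2015-04-29,00:40:45,2
2015-04-29,00:40:55,1
2015-04-29,00:41:05,1
2015-04-29,00:41:16,1
2015-04-29,00:41:17,2
"""

df = pd.read_csv(io.StringIO(SAMPLE))

# Assumption: Data2 is converted to Timedelta so that two times can be
# subtracted and compared in seconds. Data1 stays as plain strings.
df["Data2"] = pd.to_timedelta(df["Data2"])

print(df.dtypes)
```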

I want to select the rows that meet the following requirements.

The first row's timestamp is 2016-04-29,00:40:15. Starting from it, I want the next row whose time is more than 18 seconds later than the previously selected row. That gives the second row, 2016-04-29,00:40:35,2, and then the third, 2015-04-29,00:40:55,1.
If the next row's Flag differs from the Flag of the previously selected row, I take that row regardless of whether 18 seconds have passed.

Applying both requirements together, I expect to get the following data:

Data1,Data2,Flag
2016-04-29,00:40:15,1
2016-04-29,00:40:24,2
2015-04-29,00:40:43,2
2015-04-29,00:40:55,1
2015-04-29,00:41:16,1
2015-04-29,00:41:17,2
.....................

Solution

Refer to the Stack Overflow documentation.

I built a generator to produce the rows, then used pd.concat to assemble them into a DataFrame:

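import pandas as pd

def get_row(df):
    """Yield each row to keep, comparing it with the last row kept."""
    ref = None  # last yielded row; None until the first row is seen
    for i, row in df.iterrows():
        if ref is not None:
            # Data2 must hold Timedelta values for total_seconds() to work
            cond1 = (row.Data2.total_seconds() -
                     ref.Data2.total_seconds() > 18)
            cond2 = row.Flag != ref.Flag
        # The first row is always kept; `or` short-circuits before cond1/cond2
        if ref is None or cond1 or cond2:
            yield row
            ref = row

# Each yielded row is a Series: join them as columns, then transpose to rows
pd.concat(list(get_row(df)), axis=1).T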
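
To check the rule against the expected output in the question, here is a self-contained sketch on the ten sample rows. It applies the same two conditions but collects index labels and selects them with df.loc instead of concatenating Series. The helper name select_rows and its threshold parameter are hypothetical, and Data2 is assumed to be convertible with pd.to_timedelta. Like the generator above, it compares only Data2 and ignores the date in Data1.

```python
import pandas as pd


def select_rows(df, threshold=18):
    """Keep a row if its Flag differs from the last kept row, or if it is
    more than `threshold` seconds after the last kept row (hypothetical helper)."""
    keep = []   # index labels of the rows to keep
    ref = None  # last kept row; None until the first row is seen
    for row in df.itertuples():
        if (ref is None
                or (row.Data2 - ref.Data2).total_seconds() > threshold
                or row.Flag != ref.Flag):
            keep.append(row.Index)
            ref = row
    return df.loc[keep]


# The ten leading sample rows from the question.
df = pd.DataFrame({
    "Data1": ["2016-04-29"] * 3 + ["2015-04-29"] * 7,
    "Data2": pd.to_timedelta([
        "00:40:15", "00:40:24", "00:40:35", "00:40:36", "00:40:43",
        "00:40:45", "00:40:55", "00:41:05", "00:41:16", "00:41:17",
    ]),
    "Flag": [1, 2, 2, 2, 2, 2, 1, 1, 1, 2],
})

result = select_rows(df)
print(result)
```

On this sample the kept rows are those at 00:40:15, 00:40:24, 00:40:43, 00:40:55, 00:41:16 and 00:41:17, which matches the expected output listed in the question.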

Timing

Because @Kartik insisted :-)
