Looking at Previous Time Series

Problem description

I have a dataset as shown below. For every row I want to look back over the previous 15 minutes, rather than bucket the data into fixed 15-minute bins the way the frequency argument of the Grouper function does. Specifically, I want the number of positive changes in the 15 minutes before each row.

row      Timestamp     Direction     Positive      Neg        Nut

 1    1/20/19 12:15    
 2    1/20/19 12:17    Nut
 3    1/20/19 12:17    Neg
 4    1/20/19 12:18    Neg
 5    1/20/19 12:19    Pos
 6    1/20/19 12:20    Neg
 7    1/20/19 12:21    Neg
 8    1/20/19 12:22    Pos
 9    1/20/19 12:23    Neg
 10   1/20/19 12:24    Pos
 11   1/20/19 12:25    Neg
 12   1/20/19 12:26    Neg
 13   1/20/19 12:27    Neg
 14   1/20/19 12:29    Neg
 15   1/20/19 12:29    Nut
 16   1/20/19 12:30    Pos           4(o2:o16)      9         2
 17   1/20/19 12:31    Nut           4(o3:o17)      9         3
 18   1/20/19 12:32    Pos           5(o4:o18)      9         2

In Excel I use =COUNTIF(Direction2:Direction16,"Pos") to calculate the Positive column. I am not sure how to do this in a Pythonic way. When I tried to apply the same formula, I ended up grouping into 15-minute bins, which is not what I want: in Excel, for every minute I check the previous 15 minutes. Could someone please tell me which approach to follow? The goal is to produce the Positive, Negative and Neutral columns; the Timestamp and Direction columns are given.

Error:

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)

    /usr/local/lib/python3.6/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
       3062             try:
    -> 3063                 return self._engine.get_loc(key)
       3064             except KeyError:

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 'timestamp'

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
<ipython-input-87-d00f59bea382> in <module>()
      2 #df['timestamp'] = pd.to_datetime(df.timestamp)
      3 #df = df.set_index('timestamp')
----> 4 df['timestamp'] = pd.to_datetime(df['timestamp'])
      5 df = df.set_index('timestamp')
      6 

/usr/local/lib/python3.6/site-packages/pandas/core/frame.py in __getitem__(self, key)
   2683             return self._getitem_multilevel(key)
   2684         else:
-> 2685             return self._getitem_column(key)
   2686 
   2687     def _getitem_column(self, key):

/usr/local/lib/python3.6/site-packages/pandas/core/frame.py in _getitem_column(self, key)
   2690         # get column
   2691         if self.columns.is_unique:
-> 2692             return self._get_item_cache(key)
   2693 
   2694         # duplicate columns & possible reduce dimensionality

/usr/local/lib/python3.6/site-packages/pandas/core/generic.py in _get_item_cache(self, item)
   2484         res = cache.get(item)
   2485         if res is None:
-> 2486             values = self._data.get(item)
   2487             res = self._box_item_values(item, values)
   2488             cache[item] = res

/usr/local/lib/python3.6/site-packages/pandas/core/internals.py in get(self, item, fastpath)
   4113 
   4114             if not isna(item):
-> 4115                 loc = self.items.get_loc(item)
   4116             else:
   4117                 indexer = np.arange(len(self.items))[isna(self.items)]

/usr/local/lib/python3.6/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   3063                 return self._engine.get_loc(key)
   3064             except KeyError:
-> 3065                 return self._engine.get_loc(self._maybe_cast_indexer(key))
   3066 
   3067         indexer = self.get_indexer([key], method=method, tolerance=tolerance)

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 'timestamp'

df.info()

RangeIndex: 31106 entries, 0 to 31105
Data columns (total 12 columns):
ID                31106 non-null int64
High              31106 non-null float64
Last              31106 non-null float64
Timestampvalue    31106 non-null int64
Bid               31106 non-null float64
VWap              31106 non-null float64
Volume            31106 non-null float64
Low               31106 non-null float64
Ask               31106 non-null float64
Openamt           31106 non-null float64
Type              31106 non-null object
timestamp         31106 non-null datetime64[ns]
dtypes: datetime64[ns](1), float64(8), int64(2), object(1)
memory usage: 2.8+ MB
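
The traceback is consistent with `timestamp` not being a column at the moment line 4 ran, for example because an earlier run of the cell had already moved it into the index with `set_index`. The post does not confirm the cause, so treat this as an assumption. Under that assumption, a defensive sketch that works whether `timestamp` is still a column or already the index could look like this (`ensure_datetime_index` is a hypothetical helper name, not part of pandas):

```python
import pandas as pd


def ensure_datetime_index(df, col="timestamp"):
    """Return a copy of df indexed by `col` as datetimes.

    Hypothetical helper: handles both the case where `col` is still a
    column and the case where it has already been moved into the index.
    """
    out = df.copy()
    if col in out.columns:
        out[col] = pd.to_datetime(out[col])
        out = out.set_index(col)
    elif out.index.name == col:
        out.index = pd.to_datetime(out.index)
    else:
        raise KeyError(f"{col!r} is neither a column nor the index name")
    # Time-based rolling windows need a monotonic index; mergesort is stable,
    # so rows sharing a timestamp keep their original order.
    return out.sort_index(kind="mergesort")


raw = pd.DataFrame(
    {"timestamp": ["1/20/19 12:15", "1/20/19 12:17"], "Direction": [None, "Nut"]}
)
once = ensure_datetime_index(raw)
twice = ensure_datetime_index(once)  # a second call does not raise KeyError
```

Calling the helper twice mimics re-running a notebook cell: the second call finds `timestamp` in the index instead of the columns and leaves the frame usable.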

Recommended answer

You can use:

import numpy as np
import pandas as pd

# create a DatetimeIndex if necessary
# df = df.set_index('timestamp')

# for each distinct direction, flag the matching rows (True/False)
# and sum the flags over a trailing 15-minute window
cols = df['Direction'].dropna().unique()
for c in cols:
    df[c] = df['Direction'].eq(c).rolling('15min').sum()

# if necessary, set the first 14 minutes to NaN (the window is not yet full there)
df.loc[:df.index[0] + pd.Timedelta(minutes=14), cols] = np.nan

print(df)
                     row Direction   Positive  Neg  Nut  Pos
timestamp                                                   
2019-01-20 12:15:00    1       NaN        NaN  NaN  NaN  NaN
2019-01-20 12:17:00    2       Nut        NaN  NaN  NaN  NaN
2019-01-20 12:17:00    3       Neg        NaN  NaN  NaN  NaN
2019-01-20 12:18:00    4       Neg        NaN  NaN  NaN  NaN
2019-01-20 12:19:00    5       Pos        NaN  NaN  NaN  NaN
2019-01-20 12:20:00    6       Neg        NaN  NaN  NaN  NaN
2019-01-20 12:21:00    7       Neg        NaN  NaN  NaN  NaN
2019-01-20 12:22:00    8       Pos        NaN  NaN  NaN  NaN
2019-01-20 12:23:00    9       Neg        NaN  NaN  NaN  NaN
2019-01-20 12:24:00   10       Pos        NaN  NaN  NaN  NaN
2019-01-20 12:25:00   11       Neg        NaN  NaN  NaN  NaN
2019-01-20 12:26:00   12       Neg        NaN  NaN  NaN  NaN
2019-01-20 12:27:00   13       Neg        NaN  NaN  NaN  NaN
2019-01-20 12:29:00   14       Neg        NaN  NaN  NaN  NaN
2019-01-20 12:29:00   15       Nut        NaN  NaN  NaN  NaN
2019-01-20 12:30:00   16       Pos  4(o2:o16)  9.0  2.0  4.0
2019-01-20 12:31:00   17       Nut  4(o3:o17)  9.0  3.0  4.0
2019-01-20 12:32:00   18       Pos  5(o4:o18)  8.0  2.0  5.0
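
To try the approach without the original file, the sample rows from the question can be rebuilt by hand. The following is a minimal, self-contained sketch rather than the answerer's exact script: the frame construction, the `astype(int)` cast and the variable names `labels` and `cutoff` are assumptions added for illustration.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question; row 1 has no direction.
times = ["12:15", "12:17", "12:17", "12:18", "12:19", "12:20",
         "12:21", "12:22", "12:23", "12:24", "12:25", "12:26",
         "12:27", "12:29", "12:29", "12:30", "12:31", "12:32"]
dirs = [None, "Nut", "Neg", "Neg", "Pos", "Neg",
        "Neg", "Pos", "Neg", "Pos", "Neg", "Neg",
        "Neg", "Neg", "Nut", "Pos", "Nut", "Pos"]

index = pd.to_datetime(["2019-01-20 " + t for t in times]).rename("timestamp")
df = pd.DataFrame({"row": range(1, 19), "Direction": dirs}, index=index)

# One count column per distinct label: flag matching rows as 0/1,
# then sum the flags over a trailing 15-minute window.
labels = df["Direction"].dropna().unique()
for label in labels:
    df[label] = df["Direction"].eq(label).astype(int).rolling("15min").sum()

# Blank out rows whose look-back window is not yet a full 15 minutes.
cutoff = df.index[0] + pd.Timedelta(minutes=14)
df.loc[:cutoff, labels] = np.nan

print(df.loc["2019-01-20 12:30":, ["row", "Direction", "Pos", "Neg", "Nut"]])
```

A time-based window such as `rolling('15min')` covers the interval from 15 minutes before each row (exclusive) up to the row itself (inclusive). That is why the 12:32 row counts 8 `Neg` values, as in the output above, while the hand-built table in the question shows 9: the 12:17 rows fall just outside the window.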
