Conversion of daily pandas dataframe to minute frequency incorrectly fills dates

Problem description

I am trying to convert a daily-frequency dataframe to minute data: for each row, I want that combination of ticker and date repeated for every minute. In a previous post (Conversion of Daily pandas dataframe to minute frequency), it was suggested that I use the ffill method shown below, but this approach incorrectly forward-fills the rows of certain tickers into the next day. This is illustrated below:

The dataframe below converts as expected because its dates are consecutive:

import pandas as pd

# Daily data: one row per (date, ticker), consecutive dates.
dict1 = [
    {'ticker': 'jpm', 'date': '2016-11-27', 'returns': 0.2},
    {'ticker': 'ge', 'date': '2016-11-28', 'returns': 0.2},
    {'ticker': 'amzn', 'date': '2016-11-29', 'returns': 0.2},
]
df1 = pd.DataFrame(dict1)
df1['date'] = pd.to_datetime(df1['date'])
df1 = df1.set_index(['date', 'ticker'], drop=True)

# Upsample to minutes with forward fill, keep a time window, return to long form.
df_min1 = df1.unstack().asfreq('min', method='ffill').between_time('13:30', '13:32').stack()

The dataframe df2 below skips one day (2016-11-28), and in the resulting dataframe df_min2 the first ticker is repeated on the skipped date:

# Daily data with a gap: 2016-11-28 is missing.
dict2 = [
    {'ticker': 'jpm', 'date': '2016-11-27', 'returns': 0.2},
    {'ticker': 'ge', 'date': '2016-11-29', 'returns': 0.2},
    {'ticker': 'amzn', 'date': '2016-11-30', 'returns': 0.2},
]
df2 = pd.DataFrame(dict2)
df2['date'] = pd.to_datetime(df2['date'])
df2 = df2.set_index(['date', 'ticker'], drop=True)

df_min2 = df2.unstack().asfreq('min', method='ffill').between_time('13:30', '13:32').stack()
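
One way to see where the extra rows come from is to inspect the wide frame right after the forward fill. The snippet below is only a diagnostic sketch: the names demo, wide, filled and probe are made up for illustration, and the data is the same as df2 above.

```python
import pandas as pd

# Same data as df2 in the question: 2016-11-28 is missing.
rows = [
    {"ticker": "jpm", "date": "2016-11-27", "returns": 0.2},
    {"ticker": "ge", "date": "2016-11-29", "returns": 0.2},
    {"ticker": "amzn", "date": "2016-11-30", "returns": 0.2},
]
demo = pd.DataFrame(rows)
demo["date"] = pd.to_datetime(demo["date"])
demo = demo.set_index(["date", "ticker"])

# One column per ticker, one row per daily date that exists in the data.
wide = demo.unstack()

# asfreq builds every minute between the first and last daily timestamp and,
# with method="ffill", copies the most recent earlier daily row into each one.
filled = wide.asfreq("min", method="ffill")

# A minute on the skipped day inherits the whole 2016-11-27 row.
probe = filled.loc[pd.Timestamp("2016-11-28 13:30")]
print(probe)
```

The forward fill works on index labels, so any minute that falls in a gap takes its values from the last daily row before it, regardless of which ticker that row belongs to.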

Can anyone suggest a solution?

Recommended answer

The solution below works for me: before the conversion I create a new column holding each row's daily date, and after the conversion I create a second daily-date column from the minute index and keep only the rows where the two match:

# Before the conversion: store each row's own calendar date.
df['date_column'] = pd.to_datetime(df.index.get_level_values(0)).date

...converting dataframe...

# After the conversion: take the calendar date of each minute timestamp
# and keep only the rows where it matches the stored daily date.
df['date_column2'] = pd.to_datetime(df.index.get_level_values(0)).date
df = df[df['date_column'] == df['date_column2']]
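
Put together with the df2 data from the question, the approach might look like the sketch below. This is an illustration under assumptions, not the answerer's exact code: the variable names (wide, minute, stacked, result) are invented here, and the explicit dropna is added because stack() does not handle all-NaN rows the same way in every pandas version.

```python
import pandas as pd

rows = [
    {"ticker": "jpm", "date": "2016-11-27", "returns": 0.2},
    {"ticker": "ge", "date": "2016-11-29", "returns": 0.2},
    {"ticker": "amzn", "date": "2016-11-30", "returns": 0.2},
]
df = pd.DataFrame(rows)
df["date"] = pd.to_datetime(df["date"])
df = df.set_index(["date", "ticker"])

# Step 1: record each row's own calendar date before converting.
df["date_column"] = df.index.get_level_values(0).date

# Step 2: the conversion from the question (wide, minute frequency, long again).
wide = df.unstack()
minute = wide.asfreq("min", method="ffill").between_time("13:30", "13:32")
stacked = minute.stack()

# Assumption: drop empty (date, ticker) combinations explicitly instead of
# relying on stack() to do it.
stacked = stacked.dropna(subset=["returns"])

# Step 3: compare the stored daily date with the date of the minute timestamp
# and keep only rows that were not carried over from a previous day.
stacked = stacked.assign(date_column2=stacked.index.get_level_values(0).date)
result = stacked[stacked["date_column"] == stacked["date_column2"]]
result = result.drop(columns=["date_column", "date_column2"])
print(result)
```

Because asfreq only generates timestamps up to the last daily timestamp (midnight of 2016-11-30 here), the 13:30 to 13:32 window of the final date is not produced by this conversion, so the last daily row may need separate handling.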
