KDB+-like asof join for time series data in pandas?


Question

kdb+ has an aj (asof join) function that is usually used to join tables along time columns.

Here is an example where I have trade and quote tables and get the prevailing quote for every trade.

q)5# t
time         sym  price size 
-----------------------------
09:30:00.439 NVDA 13.42 60511
09:30:00.439 NVDA 13.42 60511
09:30:02.332 NVDA 13.42 100  
09:30:02.332 NVDA 13.42 100  
09:30:02.333 NVDA 13.41 100  

q)5# q
time         sym  bid   ask   bsize asize
-----------------------------------------
09:30:00.026 NVDA 13.34 13.44 3     16   
09:30:00.043 NVDA 13.34 13.44 3     17   
09:30:00.121 NVDA 13.36 13.65 1     10   
09:30:00.386 NVDA 13.36 13.52 21    1    
09:30:00.440 NVDA 13.4  13.44 15    17

q)5# aj[`time; t; q]
time         sym  price size  bid   ask   bsize asize
-----------------------------------------------------
09:30:00.439 NVDA 13.42 60511 13.36 13.52 21    1    
09:30:00.439 NVDA 13.42 60511 13.36 13.52 21    1    
09:30:02.332 NVDA 13.42 100   13.34 13.61 1     1    
09:30:02.332 NVDA 13.42 100   13.34 13.61 1     1    
09:30:02.333 NVDA 13.41 100   13.34 13.51 1     1  

How can I do the same operation using pandas? I am working with trade and quote DataFrames whose index is datetime64.

In [55]: quotes.head()
Out[55]: 
                              bid    ask  bsize  asize
2012-09-06 09:30:00.026000  13.34  13.44      3     16
2012-09-06 09:30:00.043000  13.34  13.44      3     17
2012-09-06 09:30:00.121000  13.36  13.65      1     10
2012-09-06 09:30:00.386000  13.36  13.52     21      1
2012-09-06 09:30:00.440000  13.40  13.44     15     17

In [56]: trades.head()
Out[56]: 
                            price   size
2012-09-06 09:30:00.439000  13.42  60511
2012-09-06 09:30:00.439000  13.42  60511
2012-09-06 09:30:02.332000  13.42    100
2012-09-06 09:30:02.332000  13.42    100
2012-09-06 09:30:02.333000  13.41    100

I see that pandas has an asof function, but it is defined only on the Series object, not on the DataFrame. I guess one could loop through each Series and align them one by one, but I am wondering whether there is a better way.

Recommended answer

As you mentioned in the question, looping through each column should work for you (taking df1 to be the quotes and df2 the trades):

df1.apply(lambda x: x.asof(df2.index))
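The one-liner above can be tried end to end with a small sketch. It assumes quotes and trades frames modeled on the question's sample rows (duplicate trade timestamps are left out for brevity), and the names prevailing and result are illustrative only. Series.asof expects the quote index to be sorted.

```python
import pandas as pd

# Quote rows modeled on the question's sample; the index must be sorted.
quotes = pd.DataFrame(
    {
        "bid": [13.34, 13.34, 13.36, 13.36, 13.40],
        "ask": [13.44, 13.44, 13.65, 13.52, 13.44],
        "bsize": [3, 3, 1, 21, 15],
        "asize": [16, 17, 10, 1, 17],
    },
    index=pd.to_datetime(
        [
            "2012-09-06 09:30:00.026",
            "2012-09-06 09:30:00.043",
            "2012-09-06 09:30:00.121",
            "2012-09-06 09:30:00.386",
            "2012-09-06 09:30:00.440",
        ]
    ),
)

# Trade rows modeled on the question's sample (duplicates omitted).
trades = pd.DataFrame(
    {"price": [13.42, 13.42, 13.41], "size": [60511, 100, 100]},
    index=pd.to_datetime(
        [
            "2012-09-06 09:30:00.439",
            "2012-09-06 09:30:02.332",
            "2012-09-06 09:30:02.333",
        ]
    ),
)

# For each quote column, take the last value at or before each trade time.
prevailing = quotes.apply(lambda col: col.asof(trades.index))

# Put the prevailing quote next to each trade.
result = trades.join(prevailing)
print(result)
```

With these rows, the first trade picks up the 09:30:00.386 quote, and the later trades pick up the 09:30:00.440 quote, since that is the last quote in the sample.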

We could potentially create a faster, NaN-naive version of DataFrame.asof that handles all the columns in one shot. But for now, I think this is the most straightforward way.
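For readers on newer pandas: releases from 0.19 onward include pd.merge_asof, which performs this kind of join across all columns in one call and is not part of the original answer. The sketch below assumes the time stamps live in a regular column named time, both frames are sorted by it, and the sym column is used for grouping; the sample rows are modeled on the question and the name joined is illustrative.

```python
import pandas as pd

# Quotes with the time stamp as an ordinary column, sorted ascending.
quotes = pd.DataFrame(
    {
        "time": pd.to_datetime(
            [
                "2012-09-06 09:30:00.026",
                "2012-09-06 09:30:00.043",
                "2012-09-06 09:30:00.121",
                "2012-09-06 09:30:00.386",
                "2012-09-06 09:30:00.440",
            ]
        ),
        "sym": "NVDA",
        "bid": [13.34, 13.34, 13.36, 13.36, 13.40],
        "ask": [13.44, 13.44, 13.65, 13.52, 13.44],
        "bsize": [3, 3, 1, 21, 15],
        "asize": [16, 17, 10, 1, 17],
    }
)

# Trades, also sorted by time; repeated time stamps are allowed here.
trades = pd.DataFrame(
    {
        "time": pd.to_datetime(
            [
                "2012-09-06 09:30:00.439",
                "2012-09-06 09:30:00.439",
                "2012-09-06 09:30:02.332",
            ]
        ),
        "sym": "NVDA",
        "price": [13.42, 13.42, 13.42],
        "size": [60511, 60511, 100],
    }
)

# Backward asof join (the default direction): for each trade, take the
# most recent quote at or before the trade time, matched within each symbol.
joined = pd.merge_asof(trades, quotes, on="time", by="sym")
print(joined)
```

The by="sym" argument keeps quotes for one symbol from being matched to trades in another, which matters once the tables hold more than a single ticker.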
