KDB query performance improvement

Question

I have a simple table containing prices that I'm using for stock algo back testing.

price_hist:([pxkey:`$()]price:`float$())
update `g#pxkey from `price_hist

pxkey is a concatenated string in the format 'MSFT_5M_201710060945', so stock=MSFT, price bar interval=5 mins and datetime=201710060945. I used the concatenated string instead of individual columns because it's simple; I'm a KDB novice and I wanted to get something running quickly.

I have about 5 million rows in there and the performance is only marginally faster than MySQL using the exact same data. Any ideas on how to improve this (through table structure, attributes, query, anything)? FYI, I'm using C# with the qSharp library, and to query I'm using this format, which returns a dictionary:

price_hist`MSFT_5M_201710060945

Recommended answer

Creating millions of generated symbols is never a good idea in kdb+. I would recommend using a keyed table instead of a dictionary:

bar5m:([sym:`$();time:`timestamp$()]price:`float$())
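To move existing data into this layout, each concatenated key has to be split back into its parts. The following is only a rough sketch of one way to do that for a single key; the variable names (`k`, `parts`, `s`, `d`, `ts`) are hypothetical, and it assumes every key follows the SYMBOL_INTERVAL_yyyymmddHHMM format from the question.

```q
/ hypothetical example: split one concatenated key into sym and timestamp
k:"MSFT_5M_201710060945"
parts:"_" vs k                      / three strings: symbol, bar interval, datetime
sym:`$parts 0                       / symbol part as a symbol atom
s:parts 2                           / "201710060945"
d:"D"$8#s                           / first 8 chars parsed as a date
/ hours and minutes converted to timespans and added to the date
ts:d+(0D01:00*"J"$2#8_s)+0D00:01*"J"$ -2#s
/ insert one row (the price value is made up for illustration)
bar5m:([sym:`$();time:`timestamp$()]price:`float$())
`bar5m upsert (sym;ts;75.5)
```

The bar interval (parts 1) is dropped here because the table name bar5m already implies it; if several intervals share one table it would need its own column.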

Once you populate it, you should be able to query it as follows:

bar5m[(`MSFT;2017.10.06D09:45);`price]

To improve the performance, make sure the table is sorted by sym and then time, and put the p (parted) attribute on sym.
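The sorting and attribute step might look roughly like the sketch below. The sample rows and the table name `t` are made up for illustration, and the work is done on an unkeyed copy before re-keying, which is an assumption about workflow rather than something the answer prescribes.

```q
/ hypothetical sample data, deliberately out of order
base:2017.10.06D09:45
t:([]sym:`MSFT`AAPL`MSFT`AAPL;time:base+0D00:05*1 0 0 1;price:75.6 155.1 75.5 155.2)
t:`sym`time xasc t                  / sort by sym, then by time within each sym
t:update `p#sym from t              / parted attribute: each sym is now one contiguous block
attr t`sym                          / shows which attribute is set on the column
/ a where clause on sym can use the attribute to narrow the search
select price from t where sym=`MSFT,time=2017.10.06D09:45
bar5m:`sym`time xkey t              / key the table for the lookup form shown earlier
```

Note that the p attribute can only be applied when equal sym values are already adjacent, which is why the sort comes first. Appending rows later may drop the attribute, so it may need to be reapplied after bulk loads.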
