How to generalize this algorithm (sign pattern match counter)?

Question

I have this code in R:

corr = function(x, y) {
    sx = sign(x)
    sy = sign(y)

    cond_a = sx == sy && sx > 0 && sy >0
    cond_b = sx < sy && sx < 0 && sy >0
    cond_c = sx > sy && sx > 0 && sy <0
    cond_d = sx == sy && sx < 0 && sy < 0
    cond_e = sx == 0 || sy == 0

    if(cond_a) return('a')
    else if(cond_b) return('b')
    else if(cond_c) return('c')
    else if(cond_d) return('d')
    else if(cond_e) return('e')
}

It is meant to be used with R's mapply function to count the sign patterns that occur in a time series. Here the pattern has length 2, and the possible tuples are (+,+), (+,-), (-,+) and (-,-).

I use the corr function this way:

> with(dt['AAPL'], table(mapply(corr, Return[-1], Return[-length(Return)])) /length(Return)*100)

         a          b          c          d          e 
24.6129416 25.4466058 25.4863041 24.0174672  0.3969829 

> dt["AAPL",list(date, Return)]
      symbol       date     Return
   1:   AAPL 2014-08-29 -0.3499903
   2:   AAPL 2014-08-28  0.6496702
   3:   AAPL 2014-08-27  1.0987923
   4:   AAPL 2014-08-26 -0.5235654
   5:   AAPL 2014-08-25 -0.2456037

I would like to generalize the corr function to n arguments. This means that for every n I would have to write out the conditions for all possible n-tuples. The best approach I can think of so far is a Python script that generates the code string with loops, but there must be a proper way to do this. Do you have an idea for generalizing the tedious condition writing? Maybe I could use expand.grid, but how would I do the matching then?

Recommended answer

I think you're better off using rollapply(...) in the zoo package for this. Since you seem to be using quantmod anyway (which loads xts and zoo), here is a solution that does not use all those nested if(...) statements.

library(quantmod)
AAPL    <- getSymbols("AAPL",auto.assign=FALSE)
AAPL    <- AAPL["2007-08::2009-03"]    # AAPL during the crash...
Returns <- dailyReturn(AAPL)

get.patterns <- function(ret,n) {
  f <- function(x) {  # identifies which row of `patterns` matches sign(x)
    which(apply(patterns,1,function(row)all(row==sign(x))))
  }
  returns  <- na.omit(ret)
  patterns <- expand.grid(rep(list(c(-1,1)),n))
  labels   <- apply(patterns,1,function(row) paste0("(",paste(row,collapse=","),")"))
  result   <- rollapply(returns,width=n,f,align="left")
  data.frame(100*table(labels[result])/(length(returns)-(n-1)))
}
get.patterns(Returns,n=2)
#      Var1     Freq
# 1 (-1,-1) 22.67303
# 2  (-1,1) 26.49165
# 3  (1,-1) 26.73031
# 4   (1,1) 23.15036

get.patterns(Returns,n=3)
#         Var1      Freq
# 1 (-1,-1,-1)  9.090909
# 2  (-1,-1,1) 13.397129
# 3  (-1,1,-1) 14.593301
# 4   (-1,1,1) 11.722488
# 5  (1,-1,-1) 13.636364
# 6   (1,-1,1) 13.157895
# 7   (1,1,-1) 12.200957
# 8    (1,1,1) 10.765550

The basic idea is to create a patterns matrix with 2^n rows and n columns, where each row represents one of the possible patterns (e.g., (1,1), (-1,1), etc.). Then pass the daily returns to this function n-wise using rollapply(...) and identify which row in patterns matches sign(x) exactly. Then use this vector of row numbers as an index into labels, which contains a character representation of the patterns, and finally use table(...) as you did.
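
For readers without zoo installed, the same matching idea can be sketched in base R. This is only an illustration, not the answer's code: count_sign_patterns is a hypothetical helper name, embed() stands in for rollapply(...), and the row lookup is done with match() on pasted keys instead of the apply/which combination.

```r
# Hypothetical helper (not from the original answer): base-R sketch of the
# "pattern grid + lookup" idea, assuming x is a numeric vector of returns.
count_sign_patterns <- function(x, n) {
  s <- sign(as.numeric(na.omit(x)))

  # embed() yields one row per window of length n, newest value first,
  # so the columns are reversed to restore chronological order.
  windows <- embed(s, n)[, n:1, drop = FALSE]

  # All 2^n patterns of -1/+1, one per row.
  patterns <- as.matrix(expand.grid(rep(list(c(-1, 1)), n)))

  # Collapse each row into a string key such as "-1,1".
  key <- function(m) apply(m, 1, paste, collapse = ",")

  # match() returns the pattern row for each window,
  # or NA when the window contains a zero return.
  idx    <- match(key(windows), key(patterns))
  labels <- paste0("(", key(patterns), ")")

  # Percentages over all windows; unseen patterns are reported as 0.
  100 * table(factor(labels[idx], levels = labels)) / nrow(windows)
}

count_sign_patterns(c(1, -2, 3, 4, -5), n = 2)
```

Using factor(..., levels = labels) keeps patterns that never occur in the table, which makes results for different series easier to compare.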

This is general for an n-day pattern, but it ignores windows in which any return is exactly zero, so the Freq column does not add up to 100. As you can see, this doesn't happen very often.
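
To check how much of the total is lost to zero returns, one could compute the share of windows that contain at least one zero. The sketch below is an assumption-laden illustration: zero_window_share is a hypothetical name, and it assumes x is a plain numeric vector of returns.

```r
# Hypothetical helper (not from the original answer): percentage of
# length-n windows that contain at least one exactly-zero return.
zero_window_share <- function(x, n) {
  s <- sign(as.numeric(na.omit(x)))
  windows <- embed(s, n)                 # one row per window of length n
  100 * mean(rowSums(windows == 0) > 0)  # share of rows with any zero
}

zero_window_share(c(1, 0, -1, 2, 3), n = 2)
```

If zero days should be counted as patterns of their own, one option is to build the grid from c(-1, 0, 1) instead of c(-1, 1), which gives 3^n rows and frequencies that sum to 100.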

It's interesting that even during the crash it was (very slightly) more likely to have two up days in succession than two down days. If you look at plot(Cl(AAPL)) during this period, you can see that it was a pretty wild ride.
