自动计算变量之间的滞后互相关 [英] Automating Calculation of Lagged Cross Correlations between Variables

查看:1074
本文介绍了自动计算变量之间的滞后互相关的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

您好,


首先为长篇文章道歉。希望有人可以提供一些建议。


我有大约200列时间序列数据,我需要根据所有变量之间的滞后交叉相关性来执行相关性分析。


目前,数据存储在Excel中。变量1在列A中,变量2在列B等中,数据在列GR中结束,其中包含最后一个变量。我需要计算A列中变量1与其他剩余列中所有其他变量之间的互相关。这里的关键是我需要计算每个变量之间的滞后和相似的相关性。


然后我想对B列中的下一个变量做同样的事情,即计算变量2和所有其他变量之间的现象和滞后相关性。


无论如何都要自动执行此操作,以便计算每个变量相对于所有其他变量的10个滞后的相关性,并将结果粘贴到某个可读的网格中每个变量的excel表格:

例如,对于A列 - 变量1,结果可以像这样显示(对于前五个变量):



BCDEF

Lag0 XXXXX

Lag1 XXXXX

Lag2 XXXXX

Lag3 XXXXX

Lag4 XXXXX

Lag5 XXXXX

Lag6 XXXXX

Lag7 XXXXX

Lag8 XXXXX

Lag9 XXXXX

Lag10 XXXXX


其中X代表变量1和所有其他变量之间的相关系数( col B,C,D,E,F到Col GR)处于不同的滞后。但输出并不一定非常像。


我可以使用Correl函数在excel中手动执行此操作,每次我想计算滞后相关时,调整公式中的系列范围。


背景:该函数的相关性有两个参数:


Array1,Array2


所以计算A列与我输入的所有其他列的相关性:


correl(A2:A100,B2:B100)给出Col A和Col B之间的相关关系,correl( A2:A100,c2:c100)给出col C与Col A等的相关性。


然后计算1 Lag的相关性我将公式改为(A3:A100 ,B2:B99),在滞后2处,公式变为(A4:A100,B2:B98)等。然后我对所有其他列/ Lags执行此操作,但为200个变量执行此操作,将花费大量时间(可能好几天了!)有没有办法使用VBA呢?即使花了几个小时就可以了,因为我可以让它在后台运行。


我想这样做的方法是使用以下表格设置一个工作簿:


1)源表 - 这个将包含所有原始数据(第1-200列),如帖子顶部所述。


2)计算表 - 这里我将复制上面的表格,但是一个额外的列(比如列A),因此数据集将在第201列结束)。在这个矩阵下面,我将使用Excel的相关函数设置公式,以计算出第1列中变量与所有其他变量之间的滞后相关性,即十个滞后。第一列有效地是变量列,而其他列中的所有数据都是固定的。然后我可以从源表复制每个变量并粘贴到第1列,处理相关性然后移动到下一个变量


3)输出表 - 这将存储结果每个变量的相关性(有效地复制我前面描述的输出网格)。


这是做我需要的最好方法吗?如果是这样,编写代码有多容易?基本上我需要从源表中获取第一个变量的代码,将其粘贴到计算表中的A列中。然后从相关网格中复制值并将其粘贴到输出表格中。


然后,它需要重复源表中下一列的过程,并将值粘贴到输出表中前一列的结果下面。我猜我需要某种循环函数来遍历源表中的每个变量并应用代码。


任何帮助非常感谢!


谢谢


卢卡斯

Hi there,

First of all apologies for the long post. Hope someone can offer some advice.

I have about 200 columns of time series data that I need to perform a correlation analysis on in terms calculating lagged cross correlations between all the variables.

Currently the data is stored in Excel. Variable 1 is in Column A, variable 2 in Column B etc, the data ends in column GR which contains the last variable. I need to calculate the cross correlations between variable 1 in Column A and all other variables in the other remaining columns. The key thing here is that I need to calculate lagged as well as contemperaous correlations between each of the variables.

Then I want to do the same for the next variable in Column B, i.e. calculate contemperaous and lagged correlations between variable 2 and all other variables.

Is there anyway to automate this so that it work out the correlations up to say 10 lags of for each variable against all other variables and paste the results in some kind of readable grid in an excel sheet for each variable:

For example, for Column A - Variable 1 the results could be shown like this (for the first five variables):


B C D E F
Lag0 X X X X X
Lag1 X X X X X
Lag2 X X X X X
Lag3 X X X X X
Lag4 X X X X X
Lag5 X X X X X
Lag6 X X X X X
Lag7 X X X X X
Lag8 X X X X X
Lag9 X X X X X
Lag10 X X X X X


Where X represents the correlation coefficient between variable 1 and all the other variables (col B, C, D, E, F to Col GR) at different lags. The output doesn''t have to be exactly like this though.

I can do this manually in excel using the Correl function, adjusting the series range in the formula every time I want to work out the lagged correlation.

Background: The correlation the function takes two arguments:

Array1, Array2

So to calculate correlations for Column A against all other columns I would input:

correl (A2:A100, B2:B100) to give the contemperous correlation between Col A and Col B, correl (A2:A100, c2:c100) to give correlation for col C vs. Col A etc.

Then to work out the correlation at 1 Lag I change the formula to (A3:A100, B2:B99), at lag 2 the formula becomes (A4:A100, B2:B98) etc. Then I do this for all the other columns/Lags but to do this for 200 variables, would take a huge amount of time (probably several days)! Is there a way of doing it using VBA perhaps? Even if it takes a few hours that''s OK, as I can leave it to run in the background.

The way I''m thinking of doing this is setting up a workbook with the following sheets:

1) Source Sheet - this would contain all the raw data (Columns 1-200)as described in the top of the post.

2) Calculation sheet - Here I would have a duplicate of the above sheet but with one extra column (say column A) and so the dataset would end in column 201). Beneath this matrix I would set up formulas using the Excel?s correl function to work out the lagged correlation between the variable in column 1 and all other variables to say ten lags. Effectively column one would be variable column whereas all the data in the other columns would be fixed. I could then copy each variable from the source sheet and paste into column 1, work the correlations and then move to the next variable

3) Output Sheet - this would store the results of the correlations for each variable (effectively replicating the output grid I described earlier).

Would this be the best way of doing what I need? If so how easy is it to write the code? Basically I need code that will take the first variable from the source sheet, paste it into column A in the calculation sheet. Then copy the values from the correlation grid and paste it into the output sheet.

Then it needs to repeat the process for next column in the source sheet, and paste the values beneath the results for the previous column in the output sheet. I''m guess I need some kind of loop function that will loop through each variable in the source sheet and apply the code.

Any help much appreciated!

Thanks

Lucas

推荐答案

好的,我认为这是一个机会,但只有一个小的VB可以使它更快,它取决于你正在使用的算法。


首先我建议你在互联网上找到一个很好的快速算法(只是谷歌吧)

然后,不要在excel上工作,VBA写入数据excel非常慢,只需将数据放入数组中,使用数组进行处理,然后将结果放入另一张表格中。


将excel工作表添加到数组中的简单方法是喜欢:

dim a

带工作表(1)

a =范围(.cells(1,1),。cells(100,200))


现在,你必须为你想要的每个相关制作一些DO,对于滞后的相关,我认为这可能需要一些时间,但如果你没找到一个更快的方法,我很乐意帮助你(看起来有点好玩)
Ok, i think there''s a chance, but only a small one that VB can make it a lot faster, it depends on the algorithm you''re using.

First i recomend you to find a nice quick algorithm in the internet (just google it)
then, dont work it on excel, VBA writes data on excel very slow, just put the data into an array, work it with arrays and then throw the results in another sheet.

the fastes way to add your excel worksheet to an array is something like:
dim a
with worksheets(1)
a = range(.cells(1,1), .cells(100,200))

Now, you''ll have to make some DO for each correlation you want, and for the lagged ones, I think this could take some time as well, but if you dont find a faster method, i will be glad to help you (it seems somekind of fun)


不要介意我。只是注册了对这个主题的兴趣,因为我很想知道它在哪里。
Don''t mind me. Just registering an interest in this thread, as I''m curious to see where it goes.





感谢您的建议。


当你说从谷歌下载一个快速算法时 - 你的意思是只计算一个相关性或更多迭代的东西,它也可以用于所有滞后的相关性吗?


是不是只能使用Excel的correl函数,或者你是说使用该函数的原始VBA代码会更快?


我确定我可以获得一些VBA代码,可以让你计算两个变量之间的相关性,如果你认为这会加快这个过程。


很抱歉看似天真的问题,但我不是程序员,只是一个使用工具进行分析的分析师!
Hi,

Thanks for your advice.

When you say download a quick algorithm from Google - do you mean to work out just a correlation or something more iterative that will work all the lagged correlations as well?

Is it not possible just to use Excel''s correl function or are you saying that it would be faster to have the raw VBA code for that function?

I''m sure I can get hold of some VBA code that lets you calculate the correlation between two variables, if you think that will speed up the process.

Sorry for what may seem like naive questions but I''m not a programmer, just an analyst that uses tools to do analysis!


这篇关于自动计算变量之间的滞后互相关的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆