Nested double sort in Matlab

Problem description

Suppose I have three vectors A, B and C, each of size (n x 1).

I want to sort the elements of A into 5 groups, and then, within each of those groups, sort the corresponding elements of B into 5 groups as well. For each resulting group I then want the average of the corresponding elements of C, so I will end up with 25 averages.

In other words:

  1. Sort the elements of A into 5 quintiles.
  2. Pick the first group of elements in A and get the corresponding values in B.
  3. Sort the picked elements of B into 5 groups.
  4. Take the average of the corresponding elements of C for each of these groups.
  5. Pick the second group of elements in A and get the corresponding values in B.
  6. Sort the picked elements of B into 5 groups.
  7. Take the average of the corresponding elements of C for each of these groups.
  8. And so on.

Here is my dummy code:

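minimum = 50;
maximum = 100;

% Three random 1000-by-1 vectors with values between minimum and maximum
A = (maximum-minimum).*rand(1000,1) + minimum;
B = (maximum-minimum).*rand(1000,1) + minimum;
C = (maximum-minimum).*rand(1000,1) + minimum;

nbins1 = 5;   % number of groups for A
nbins2 = 5;   % number of groups for B within each group of A

% Group index (1..nbins1) of every element of A, based on its rank
% (tiedrank requires the Statistics and Machine Learning Toolbox)
bins1 = ceil(nbins1 * tiedrank(A) / length(A));

output = zeros(nbins1, nbins2);   % preallocate the matrix of averages

for i = 1:nbins1
    % Elements of B and C whose A value falls in group i
    B1 = B(bins1==i);
    C1 = C(bins1==i);
    % Group index (1..nbins2) of the selected B values
    bins2 = ceil(nbins2 * tiedrank(B1) / length(B1));

    for j = 1:nbins2
        % Average of C over A group i and B group j
        output(i,j) = mean(C1(bins2==j));
    end
end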

The issue is that this does not seem very elegant or efficient. Is there another way of doing this? For people in finance, this problem is analogous to the Fama-French (1993) double sorting of portfolios.

Recommended answer

First of all, sort everything by column A:

sortedByA = sortrows([A,B,C], 1);

Create a dummy vector holding the group index (from 1 to nbins1) of each row of the A-sorted data. This assumes the number of rows is evenly divisible by nbins1:

groupsA = repmat(1:nbins1, numel(A)/nbins1, 1);
groupsA = groupsA(:);
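
To see why this yields contiguous group labels, here is a minimal sketch with made-up sizes (6 rows and 3 groups are assumptions chosen for illustration, not values from the original post):

```matlab
% Illustrative sizes only (assumed, not from the original post).
n = 6;
nbins = 3;

% repmat stacks the row 1:nbins n/nbins times, giving an (n/nbins)-by-nbins matrix.
labels = repmat(1:nbins, n/nbins, 1);

% (:) reads the matrix column by column, so each label repeats n/nbins times in a row.
labels = labels(:);

disp(labels');   % 1 1 2 2 3 3
```

Because the data were already sorted by A, the first n/nbins rows get label 1, the next n/nbins rows get label 2, and so on.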

Then sort again by the first two columns, with the actual column A replaced by the group indices. In effect, this sorts B within each group of A values:

sorted = sortrows([groupsA, sortedByA(:,[2,3])], [1,2]);
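
The two-key sort can be illustrated on a toy matrix. The values below are invented for the example; column 1 plays the role of the group label, column 2 of B, and column 3 of C:

```matlab
% Toy data (assumed for illustration): [group label, B, C].
M = [1 9 10;
     1 3 20;
     2 7 30;
     2 5 40];

% Sort by column 1 first, then by column 2 among rows with equal column 1.
S = sortrows(M, [1, 2]);

% S is [1 3 20; 1 9 10; 2 5 40; 2 7 30]:
% rows stay inside their group, and each C value travels with its B value.
disp(S);
```

Since whole rows are moved, column 3 stays aligned with the B values it belongs to, which is what makes the later averaging step valid.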

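Create group indices (from 1 to nbins1*nbins2) for the rows of the sorted data. Each index identifies one combination of A group and B group whose C values will be averaged. This assumes the number of rows is evenly divisible by nbins1*nbins2:

groupsC = repmat(1:(nbins1*nbins2), numel(A)/(nbins1*nbins2), 1);
groupsC = groupsC(:);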

Finally, compute the mean of C within each group:

averages = accumarray(groupsC, sorted(:,3), [], @mean);
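
Note that accumarray returns the averages as a single column of length nbins1*nbins2 rather than as the nbins1-by-nbins2 matrix produced by the loop in the question. The following is a sketch, not part of the original answer, that runs the whole pipeline, reshapes the result, and cross-checks it against a loop. It assumes the number of rows is divisible by nbins1*nbins2 and that A and B contain no ties; the names output, reference and maxDiff are chosen here for illustration, and ranks are derived from sort so that no toolbox is needed.

```matlab
% Sketch: vectorized double sort plus a loop-based cross-check.
% Assumes n is divisible by nbins1*nbins2 and A, B have no tied values.
n = 1000;  nbins1 = 5;  nbins2 = 5;
A = 50 + 50*rand(n, 1);
B = 50 + 50*rand(n, 1);
C = 50 + 50*rand(n, 1);

% Vectorized version (the approach described in the answer).
sortedByA = sortrows([A, B, C], 1);
groupsA = repmat(1:nbins1, n/nbins1, 1);
groupsA = groupsA(:);
sorted = sortrows([groupsA, sortedByA(:, [2, 3])], [1, 2]);
groupsC = repmat(1:(nbins1*nbins2), n/(nbins1*nbins2), 1);
groupsC = groupsC(:);
averages = accumarray(groupsC, sorted(:, 3), [], @mean);

% Group k equals (i-1)*nbins2 + j, so reshape column-wise and transpose:
% row i is the A group, column j is the B group within that A group.
output = reshape(averages, nbins2, nbins1).';

% Loop-based reference, using ranks derived from sort.
[~, order] = sort(A);
rankA = zeros(n, 1);
rankA(order) = (1:n)';
bins1 = ceil(nbins1 * rankA / n);
reference = zeros(nbins1, nbins2);
for i = 1:nbins1
    Bi = B(bins1 == i);
    Ci = C(bins1 == i);
    [~, o] = sort(Bi);
    rankB = zeros(numel(Bi), 1);
    rankB(o) = (1:numel(Bi))';
    bins2 = ceil(nbins2 * rankB / numel(Bi));
    for j = 1:nbins2
        reference(i, j) = mean(Ci(bins2 == j));
    end
end

% The two results should agree up to floating-point rounding.
maxDiff = max(abs(output(:) - reference(:)));
```

The comparison uses a tolerance rather than exact equality because the two versions sum the same values in a different order. If the data contain ties or the row count is not divisible by the number of groups, the two approaches can assign rows to groups differently, so the rank-based loop is the safer choice in that case.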
