How to manage a table/matrix to obtain information using conditions

Question

I've been thinking about the best way to solve this. Any advice?

The table/matrix X is:

X <- read.table(text = "    a       b       c       d       e
    0       27        0       28         8
    1       14       24       32        33
    0        4       22       25        27 
    0        3        7       26        34
    0       28       33       31        21
    0       16       17       24        18
    1        3       19        0        12
    0        2       23        5        24
    2       17       22       22        10
    0       35       15       17         2", 
                  header = TRUE, stringsAsFactors = FALSE)

Or, in MATLAB:

X =[
        0       27        0       28         8
        1       14       24       32        33
        0        4       22       25        27 
        0        3        7       26        34
        0       28       33       31        21
        0       16       17       24        18
        1        3       19        0        12
        0        2       23        5        24
        2       17       22       22        10
        0       35       15       17         2];

How can I obtain tables/matrices A and B that contain the following?

The rows and columns are cumulative ranges in steps of 5, up to the maximum value in table X. In this case: 0-5, 0-10, 0-15, 0-20, ..., 0-35 (for both rows and columns).
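The range construction described above can be sketched in MATLAB as follows. This is only a minimal sketch: the names `step`, `upper` and `labels` are illustrative and not from the original post, and rounding the maximum up with `ceil` is an assumption about how the last range should be chosen.

```matlab
% Sample data from the question (columns: a b c d e).
X = [0 27  0 28  8;  1 14 24 32 33;  0  4 22 25 27;  0  3  7 26 34;
     0 28 33 31 21;  0 16 17 24 18;  1  3 19  0 12;  0  2 23  5 24;
     2 17 22 22 10;  0 35 15 17  2];

step  = 5;                             % assumed interval width
X_max = max(max(X(:,2:end)));          % largest value in columns b..e
upper = ceil(X_max / step) * step;     % round up to the next multiple of step
R     = step:step:upper;               % upper bounds of the cumulative ranges

% Text labels for the ranges, e.g. '0-5', '0-10', ...
labels = arrayfun(@(r) sprintf('0-%d', r), R, 'UniformOutput', false);
```

For the sample data the maximum is 35, so `R` holds the seven upper bounds 5, 10, ..., 35, which gives the 7x7 layout used for A and B.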

Table A:

The first step is to select only the rows where a is positive, and then count the number of times these conditions occur:

  1. Entry (1,1) of table/matrix A is the number of times the b values are within 0-5 and the d values are within 0-5. Entry (2,1) is the number of times the b values are within 0-5 and the d values are within 0-10. Entry (3,1) is the number of times the b values are within 0-5 and the d values are within 0-15, and so on until the row is 0-35.

  2. Entry (1,2) of table/matrix A is the number of times the b values are within 0-10 and the d values are within 0-5. Entry (2,2) is the number of times the b values are within 0-10 and the d values are within 0-10. Entry (3,2) is the number of times the b values are within 0-10 and the d values are within 0-15, and so on until the row is 0-35.

This forms a 7x7 table/matrix, and the result is:

   A:
          0-5     0-10     0-15     0-20     0-25     0-30     0-35
   0-5     1        1        1        1         1        1        1
   0-10    1        1        1        1         1        1        1
   0-15    1        1        1        1         1        1        1
   0-20    1        1        1        1         1        1        1
   0-25    1        1        1        2         2        2        2
   0-30    1        1        1        2         2        2        2
   0-35    1        1        2        3         3        3        3

The columns are the intervals of the b values and the rows are the intervals of the d values.

Table/matrix A counts the rows where a is greater than 0 and the b and d values fall within each combination of intervals (in steps of 5). For example, entry (1,1) of A is 1 because only row 7 of table X has a>0, b=3 (within 0-5) and d=0 (within 0-5). Another example is entry (7,4), which is 3 because three rows have a>0, b within 0-20 and d within 0-35.
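The two entries discussed above can be checked directly against the data. The following is a small sketch, assuming inclusive interval bounds; `countIn` is a hypothetical helper introduced here for illustration only.

```matlab
% Sample data from the question (columns: a b c d e).
X = [0 27  0 28  8;  1 14 24 32 33;  0  4 22 25 27;  0  3  7 26 34;
     0 28 33 31 21;  0 16 17 24 18;  1  3 19  0 12;  0  2 23  5 24;
     2 17 22 22 10;  0 35 15 17  2];

pos = X(:,1) > 0;        % keep only rows with a > 0 (rows 2, 7 and 9)
b   = X(pos,2);          % b values of the kept rows
d   = X(pos,4);          % d values of the kept rows

% Hypothetical helper: count rows with b in [0,bMax] and d in [0,dMax].
countIn = @(bMax, dMax) sum(b >= 0 & b <= bMax & d >= 0 & d <= dMax);

A_1_1 = countIn(5, 5);   % row d 0-5,  column b 0-5
A_7_4 = countIn(20, 35); % row d 0-35, column b 0-20
```

With the sample data, `A_1_1` is 1 and `A_7_4` is 3, which agrees with the table in the question.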

Table/matrix B is built the same way as table A, but compares columns c and e. B is:

  B:
          0-5     0-10     0-15     0-20     0-25     0-30     0-35
   0-5     1        1        1        1         1        1        1
   0-10    2        2        2        2         2        2        2
   0-15    2        2        2        2         2        2        2
   0-20    2        2        2        2         2        2        3
   0-25    2        2        2        4         4        4        5
   0-30    4        4        4        6         6        7        8
   0-35    4        4        5        7         7        9       10

Recommended answer

Here is the solution I propose in MATLAB (it only creates the first table, but the code for the second one is basically the same):

% Define the sample data...
X = [
    0       27        0       28         8
    1       14       24       32        33
    0        4       22       25        27 
    0        3        7       26        34
    0       28       33       31        21
    0       16       17       24        18
    1        3       19        0        12
    0        2       23        5        24
    2       17       22       22        10
    0       35       15       17         2
];

% Find the upper bound (as a multiple of 5) and build the ranges...
% ceil (not round) so that the last range always covers the maximum.
X_max = max(max(X(:,2:end)));
X_upper = ceil(X_max / 5) * 5;

% Create the required ranges and their respective string representation...
R = 5:5:X_upper;
R_len = numel(R);
R_str = sprintfc('0-%d',R);

% Filter the table rows keeping only those with A > 0...
X = X(X(:,1) > 0,2:end);

% Apply the criteria to b and d (first and third columns after dropping a)...
b = arrayfun(@(x) sum(X(:,1) > 0 & X(:,1) <= x),R).';
d = arrayfun(@(x) sum(X(:,3) > 0 & X(:,3) <= x),R);

% Cross the results of both computations into a single matrix...
A = num2cell(repmat(b,1,R_len) + repmat(d,R_len,1));
A = [[{''}; R_str.'] [R_str; A]];

The comments in the code should make everything fairly self-explanatory, but if you have any doubts, feel free to ask for more details. As for the type of the final result A, I opted for a cell matrix, because table row and column names are subject to strict naming conventions, but it is very easy to switch from this output to a table-based one. For example:

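% Cross the results of both computations into a single table...
% Names such as '0-5' are not valid variable names in older releases,
% so they are converted with matlab.lang.makeValidName.
A = array2table(repmat(b,1,R_len) + repmat(d,R_len,1),'RowNames',R_str,'VariableNames',matlab.lang.makeValidName(R_str));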

EDIT

This code properly handles the data as per the OP's requirements:

% Find the upper bound (as a multiple of 5) and build the ranges...
% ceil (not round) so that the last range always covers the maximum.
X_max = max(max(X(:,2:end)));
X_upper = ceil(X_max / 5) * 5;

% Create the required ranges and their respective string representation...
R = 5:5:X_upper;
R_len = numel(R);
R_str = sprintfc('R_0_%d',R);

% Adjust the last bound for the exclusive (<) comparison and build all
% (b,d) bound pairs; R_len replaces the hard-coded 7...
R(end) = R(end) + 1;
R_rep = [repelem(R,1,R_len).' repmat(R,1,R_len).'];

% Apply the criteria to b and d (first and third columns after dropping a)...
X_a = X(X(:,1) > 0,2:end);
fun_A = arrayfun(@(b,d) sum(X_a(:,1) >= 0 & X_a(:,1) < b & X_a(:,3) >= 0 & X_a(:,3) < d),R_rep(:,1),R_rep(:,2));
A = array2table(reshape(fun_A,R_len,R_len),'RowNames',R_str,'VariableNames',R_str);

% Apply the criteria to c and e (second and fourth columns after dropping a)...
X_na = X(X(:,1) >= 0,2:end);
fun_B = arrayfun(@(c,e) sum(X_na(:,2) >= 0 & X_na(:,2) < c & X_na(:,4) >= 0 & X_na(:,4) < e),R_rep(:,1),R_rep(:,2));
B = array2table(reshape(fun_B,R_len,R_len),'RowNames',R_str,'VariableNames',R_str);
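As a cross-check, table A can also be computed as a product of two indicator matrices. This is an alternative sketch, not the answer's method. It assumes inclusive upper bounds (`<=`) and relies on all values being non-negative, so the lower bound is not tested. The names `keep`, `inB`, `inD` and `A_expected` are illustrative.

```matlab
% Sample data from the question (columns: a b c d e).
X = [0 27  0 28  8;  1 14 24 32 33;  0  4 22 25 27;  0  3  7 26 34;
     0 28 33 31 21;  0 16 17 24 18;  1  3 19  0 12;  0  2 23  5 24;
     2 17 22 22 10;  0 35 15 17  2];

R    = 5:5:35;                 % upper bounds of the cumulative ranges
keep = X(:,1) > 0;             % rows with a > 0
b    = X(keep,2);
d    = X(keep,4);

% inB(k,j) is true when b(k) <= R(j); inD(k,i) is true when d(k) <= R(i).
inB = bsxfun(@le, b, R);
inD = bsxfun(@le, d, R);

% A(i,j) = number of kept rows with d <= R(i) and b <= R(j).
A = double(inD).' * double(inB);

% Table A as given in the question, for comparison.
A_expected = [1 1 1 1 1 1 1; 1 1 1 1 1 1 1; 1 1 1 1 1 1 1; 1 1 1 1 1 1 1;
              1 1 1 2 2 2 2; 1 1 1 2 2 2 2; 1 1 2 3 3 3 3];
```

Here the rows of `A` correspond to the d ranges and the columns to the b ranges, and `isequal(A, A_expected)` holds for the sample data. `bsxfun` is used instead of implicit expansion so that the sketch also runs on older MATLAB releases.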
