How to manage a table/matrix to obtain information using conditions
Problem description
I've been thinking about the best way to solve this. Any advice?
The table/matrix X is:
X <- read.table(text = " a b c d e
0 27 0 28 8
1 14 24 32 33
0 4 22 25 27
0 3 7 26 34
0 28 33 31 21
0 16 17 24 18
1 3 19 0 12
0 2 23 5 24
2 17 22 22 10
0 35 15 17 2",
header = TRUE, stringsAsFactors = FALSE)
Or, in MATLAB:
X =[
0 27 0 28 8
1 14 24 32 33
0 4 22 25 27
0 3 7 26 34
0 28 33 31 21
0 16 17 24 18
1 3 19 0 12
0 2 23 5 24
2 17 22 22 10
0 35 15 17 2];
How could I obtain tables/matrices A and B, which contain the following?
Both the rows and the columns are cumulative ranges in steps of 5, up to the maximum value in table X. In this case: 0-5, 0-10, 0-15, 0-20, ..., 0-35 (for both rows and columns).
Table A:
First, select only the rows where a is positive, then count the number of times these conditions occur:
Entry (1,1) of table/matrix A is the number of times the b value is in 0-5 and the d value is in 0-5. Entry (2,1) is the number of times the b value is in 0-5 and the d value is in 0-10. Entry (3,1) is the number of times the b value is in 0-5 and the d value is in 0-15, and so on until the row range is 0-35.
Entry (1,2) of table/matrix A is the number of times the b value is in 0-10 and the d value is in 0-5. Entry (2,2) is the number of times the b value is in 0-10 and the d value is in 0-10. Entry (3,2) is the number of times the b value is in 0-10 and the d value is in 0-15, and so on until the row range is 0-35.
This forms a 7x7 table/matrix, and the result is:
A:
0-5 0-10 0-15 0-20 0-25 0-30 0-35
0-5 1 1 1 1 1 1 1
0-10 1 1 1 1 1 1 1
0-15 1 1 1 1 1 1 1
0-20 1 1 1 1 1 1 1
0-25 1 1 1 2 2 2 2
0-30 1 1 1 2 2 2 2
0-35 1 1 2 3 3 3 3
The columns are the intervals of the b values; the rows are the intervals of the d values.
Each entry of table/matrix A is the number of rows of X in which a is greater than 0 and the b and d values fall within a given combination of intervals (in steps of 5). For example, entry (1,1) of result table A is 1 because only row 7 of table X has a>0, b=3 (within 0-5) and d=0 (within 0-5). Another example is entry (7,4), which is 3 because three rows have a>0, b within 0-20 and d within 0-35.
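The two worked entries above can be checked directly. The following is a minimal sketch, assuming both interval bounds are inclusive (the question does not say so explicitly); the variable names `keep`, `A_1_1` and `A_7_4` are illustrative only.

```matlab
% Sample data from the question (columns: a b c d e).
X = [
    0 27  0 28  8
    1 14 24 32 33
    0  4 22 25 27
    0  3  7 26 34
    0 28 33 31 21
    0 16 17 24 18
    1  3 19  0 12
    0  2 23  5 24
    2 17 22 22 10
    0 35 15 17  2
];

% Keep only the rows with a > 0, then pull out columns b and d.
keep = X(:,1) > 0;
b = X(keep,2);
d = X(keep,4);

% Entry (1,1): b within 0-5 and d within 0-5 (bounds assumed inclusive).
A_1_1 = sum(b >= 0 & b <= 5 & d >= 0 & d <= 5);

% Entry (7,4): row 7 is d within 0-35, column 4 is b within 0-20.
A_7_4 = sum(b >= 0 & b <= 20 & d >= 0 & d <= 35);

fprintf('A(1,1) = %d, A(7,4) = %d\n', A_1_1, A_7_4);
```

With the sample data this gives 1 and 3, matching the values quoted in the explanation.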
Table/matrix B would be built the same way as table A, but comparing columns c and e. B is:
B:
0-5 0-10 0-15 0-20 0-25 0-30 0-35
0-5 1 1 1 1 1 1 1
0-10 2 2 2 2 2 2 2
0-15 2 2 2 2 2 2 2
0-20 2 2 2 2 2 2 3
0-25 2 2 2 4 4 4 5
0-30 4 4 4 6 6 7 8
0-35 4 4 5 7 7 9 10
Recommended answer
Here is the solution I propose in MATLAB (it only creates the first table, but the code for the second one is basically the same):
% Define the sample data...
X = [
0 27 0 28 8
1 14 24 32 33
0 4 22 25 27
0 3 7 26 34
0 28 33 31 21
0 16 17 24 18
1 3 19 0 12
0 2 23 5 24
2 17 22 22 10
0 35 15 17 2
];
% Find the upper bound (as a multiple of 5) and build the ranges...
X_max = max(max(X(:,2:end)));
X_upper = ceil(X_max / 5) * 5; % ceil, so the last range always covers X_max
% Create the required ranges and their respective string representation...
R = 5:5:X_upper;
R_len = numel(R);
R_str = sprintfc('0-%d',R);
% Filter the table rows keeping only those with A > 0...
X = X(X(:,1) > 0,2:end);
% Apply the criteria to B and D (first and third columns of the filtered X)...
b = arrayfun(@(x) sum(X(:,1) > 0 & X(:,1) <= x),R).';
d = arrayfun(@(x) sum(X(:,3) > 0 & X(:,3) <= x),R);
% Cross the results of both computations into a single matrix...
A = num2cell(repmat(b,1,R_len) + repmat(d,R_len,1));
A = [[{''}; R_str.'] [R_str; A]];
The comments in the code should make everything fairly self-explanatory, but if you have any doubts, feel free to ask for more details. As for the type of the final result A, I opted for a cell matrix, because table row and column names are subject to strict naming conventions. It's very easy to switch from this output to a table-based one, though. For example:
% Cross the results of both computations into a single table...
A = array2table(repmat(b,1,R_len) + repmat(d,R_len,1),'RowNames',R_str,'VariableNames',R_str);
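Regarding the naming conventions mentioned above: as far as I know, MATLAB releases before R2019b require table variable names to be valid identifiers, so labels such as '0-5' may be rejected as 'VariableNames'. The sketch below is one possible workaround, not part of the original answer. It uses `matlab.lang.makeValidName` to derive identifier-safe column names and keeps the readable labels as row names and variable descriptions. The `zeros` matrix is only a placeholder for the real counts, and `labels`, `names` and `T` are illustrative names.

```matlab
% Readable range labels: '0-5', '0-10', ..., '0-35'.
R = 5:5:35;
labels = arrayfun(@(r) sprintf('0-%d', r), R, 'UniformOutput', false);

% Derive names that are valid MATLAB identifiers from the labels.
names = matlab.lang.makeValidName(labels);

% Placeholder counts; substitute the actual 7x7 result matrix here.
counts = zeros(numel(R));

% Row names may contain '-', so the original labels are kept there.
T = array2table(counts, 'RowNames', labels, 'VariableNames', names);

% Keep the readable labels attached to the columns as descriptions.
T.Properties.VariableDescriptions = labels;
```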
Edit
This code properly handles the data as per the OP's requirements:
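% X must be the original 10x5 sample matrix here (not the filtered one)...
% Find the upper bound (as a multiple of 5) and build the ranges...
X_max = max(max(X(:,2:end)));
X_upper = ceil(X_max / 5) * 5; % ceil, so the last range always covers X_max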
% Create the required ranges and their respective string representation...
R = 5:5:X_upper;
R_len = numel(R);
R_str = sprintfc('R_0_%d',R);
% Adjust the range for exclusive limits and project it...
R(end) = R(end) + 1;
R_rep = [repelem(R,1,R_len).' repmat(R,1,R_len).'];
% Apply the criteria to B and D (first and third columns of X_a)...
X_a = X(X(:,1) > 0,2:end);
fun_A = arrayfun(@(b,d) sum(X_a(:,1) >= 0 & X_a(:,1) < b & X_a(:,3) >= 0 & X_a(:,3) < d),R_rep(:,1),R_rep(:,2));
A = array2table(reshape(fun_A,R_len,R_len),'RowNames',R_str,'VariableNames',R_str);
% Apply the criteria to C and E (second and fourth columns of X_na)...
X_na = X(X(:,1) >= 0,2:end);
fun_B = arrayfun(@(c,e) sum(X_na(:,2) >= 0 & X_na(:,2) < c & X_na(:,4) >= 0 & X_na(:,4) < e),R_rep(:,1),R_rep(:,2));
B = array2table(reshape(fun_B,R_len,R_len),'RowNames',R_str,'VariableNames',R_str);
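As a cross-check, table A can also be computed without `arrayfun`. The following is a sketch under my own assumptions, not the answerer's method: the upper bounds are treated as inclusive, the values are non-negative (so the lower bound of 0 is always met), and implicit expansion is available (MATLAB R2016b or later; `bsxfun(@le, ...)` would be needed on older releases). The names `in_b`, `in_d`, `A_chk` and `A_expected` are illustrative.

```matlab
% Sample data from the question (columns: a b c d e).
X = [
    0 27  0 28  8
    1 14 24 32 33
    0  4 22 25 27
    0  3  7 26 34
    0 28 33 31 21
    0 16 17 24 18
    1  3 19  0 12
    0  2 23  5 24
    2 17 22 22 10
    0 35 15 17  2
];

% Upper bounds of the cumulative ranges: 5, 10, ..., 35.
R = 5:5:ceil(max(max(X(:,2:end))) / 5) * 5;

% Keep only the rows with a > 0.
keep = X(:,1) > 0;

% n-by-7 logical masks: does each kept row fall within each range?
in_b = X(keep,2) <= R;
in_d = X(keep,4) <= R;

% Entry (i,j) counts the rows with d <= R(i) and b <= R(j).
A_chk = double(in_d).' * double(in_b);

% Expected table A, copied from the question.
A_expected = [
    1 1 1 1 1 1 1
    1 1 1 1 1 1 1
    1 1 1 1 1 1 1
    1 1 1 1 1 1 1
    1 1 1 2 2 2 2
    1 1 1 2 2 2 2
    1 1 2 3 3 3 3
];
disp(isequal(A_chk, A_expected));
```

The matrix product works because multiplying two 0/1 masks and summing over the rows is the same as counting the rows where both conditions hold.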