在 SAS 中:如何按组合并行中的非零值 [英] In SAS: How to consolidate non zero values in rows by group
问题描述
我有一个由变量 ObservationNumber
、MeasurementNumber
、SubjectID
和许多虚拟变量组成的数据集.
I have a dataset consisting of variables ObservationNumber
, MeasurementNumber
, SubjectID
, and many dummy variables.
我想通过 SubjectID GroupNumber 将所有非零值合并为一行.
I would like to consolidate all non-zero values into one row by SubjectID GroupNumber.
有:
ObsNum MeasurementNum SubjectID Dummy0 Dummy1 ... Dummy999
----------------------------------------------------...---------------
01 1 1 0 1 ... 0
02 2 1 0 1 ... 0
03 3 1 0 1 ... 0
04 4 1 0 0 ... 0
05 5 1 - - ... -
06 6 1 0 0 ... 0
07 1 2 1 0 ... 0
08 2 2 0 0 ... 0
09 3 2 0 1 ... 0
10 4 2 1 0 ... 0
11 4 2 0 1 ... 0
12 5 2 0 0 ... 1
13 6 2 0 0 ... 0
14 6 2 0 0 ... 1
15 6 2 0 0 ... 0
16 6 2 0 0 ... 0
17 6 2 0 1 ... 0
18 6 2 0 0 ... 0
19 6 2 0 0 ... 0
20 6 2 0 0 ... 0
21 6 2 1 0 ... 0
22 1 3 1 0 ... 0
23 2 3 0 1 ... 0
24 3 3 0 0 ... 1
25 4 3 - - ... -
26 5 3 0 0 ... 0
27 6 3 0 0 ... 0
28 1 4 - - ... -
29 2 4 0 0 ... 0
30 3 4 0 1 ... 0
31 4 4 1 0 ... 0
32 4 4 0 1 ... 0
33 4 4 0 0 ... 1
34 5 4 0 0 ... 1
35 6 4 0 1 ... 0
36 6 4 0 0 ... 1
想要:
MeasurementNum SubjectID Dummy0 Dummy1 ... Dummy999
----------------------------------------------------...---------------
1 1 0 1 ... 0
2 1 0 1 ... 0
3 1 0 1 ... 0
4 1 0 0 ... 0
5 1 - - ... -
6 1 0 0 ... 0
1 2 1 0 ... 0
2 2 0 0 ... 0
3 2 0 1 ... 0
4 2 1 1 ... 0
5 2 0 0 ... 1
6 2 1 1 ... 1
1 3 1 0 ... 0
2 3 0 1 ... 0
3 3 0 0 ... 1
4 3 - - ... -
5 3 0 0 ... 0
6 3 0 0 ... 0
1 4 - - ... -
2 4 0 0 ... 0
3 4 0 1 ... 0
4 4 1 1 ... 1
5 4 0 0 ... 1
6 4 0 1 ... 1
每个 SubjectID 有六个测量,其中测量了一组哑变量,没有结果 0、1 或缺失.如果出现缺失值,则相应观测值的所有虚拟变量都将丢失——并且该MeasurementNumber"的数据集中仅存在一个观测值.
Each SubjectID has six measurement in which a set of dummyvariables are measured without outcome 0, 1 or missing. If a missing value occurs, all dummy variables for the respective observation are missing--and only one observation will be present in the dataset for that `MeasurementNumber.
我曾尝试使用UPDATE
语句,但它似乎无法处理0"和-".
I have tried to use the UPDATE
statement, but it seems to not be able to deal with '0' and '-'.
对于按MeasurementNumber
分组的每个SubjectID
,是否有直接压缩该数据集中所有虚拟变量的方法?
Is there a direct way of condensing all dummyvariables in this dataset for each SubjectID
grouped by MeasurementNumber
?
推荐答案
使用 Proc MEANS
和 BY
和 OUTPUT
语句.
Use Proc MEANS
with BY
and OUTPUT
statements.
data have;
rownum = 0;
do rowid = 1 to 1000;
subjectid + 1;
do measurenum = 1 to 6;
do repeat = 1 to ceil(4 * ranuni(123));
array flags flag1-flag999;
do _n_ = 1 to dim(flags);
flags(_n_) = ranuni(123) < 0.10;
if subjectid < 7 and measurenum = subjectid then flags(_n_) = .;
end;
rownum + 1;
output;
end;
end;
end;
keep rownum measurenum subjectid flag:;
run;
proc means noprint data=have;
by subjectid measurenum;
var flag:;
output max=;
run;
这篇关于在 SAS 中:如何按组合并行中的非零值的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!