How to summarize different combinations of diseases for different age groups?

Views: 17  Posted: 2022/1/8 17:51:12  Tag: sas


Question

We have a study in which people were enrolled at different time points and at different ages. They have been followed up for two decades, and during that time they developed 1-5 diseases, each at a different time point. Here is SAS code that generates example data:

proc format;
  value agegrp
    30-39 = '30-39'
    40-49 = '40-49'
    50-59 = '50-59'
    60-69 = '60-69'
    70-79 = '70-79'
  ;
  invalue agegrp
    '30-39' = 30
    '40-49' = 40
    '50-59' = 50
    '60-69' = 60
    '70-79' = 70
  ;
run;

* generate some sample data;
%macro RandBetween(min, max);
   (&min + floor((1+&max-&min)*rand("uniform")))
%mend;


data have;
  call streaminit(123);
 
  do id = 1 to 10000;
    enrolled = '01jan2000'd + (1 + floor((1+3650-1)*rand("uniform")));
    age = 30 + %RandBetween(0, 49);

    flag1 = rand('uniform') < 0.25;
    date1 = enrolled + %RandBetween(0, 2500);

    flag2 = rand('uniform') < 0.25;
    date2 = date1 + %RandBetween(0,2500);

    flag3 = rand('uniform') < 0.25;
    date3 = date2 + %RandBetween(0,2500);

    flag4 = rand('uniform') < 0.25;
    date4 = date3 + %RandBetween(0,2500);

    flag5 = rand('uniform') < 0.25;
    date5 = date4 + %RandBetween(0,2500);
    output;
  end;
 format enrolled date: yymmdd10. flag: 1.;
run;

I have already summarized the proportion of people with each combination of diseases by their age at baseline. Now I want to find the number of people with each combination of diseases in each age group, e.g. to count the people who had disease1+disease2 while aged 40-49, and so on. The proportion would be their share of all individuals at that age.

The output should look like this:

Disease combination           30-39  40-49  50-59  60-69  70-79
------------------------------------------------------------------
Combinations of length 2       xx%    yy%  ...
flag1+flag2
flag2+flag3
...


length 3

length 4

length 5

Do you have any thoughts on how one could do this?

Recommended answer

The data is somewhat unusual from a diagnostic standpoint. However, if the flags mark diagnoses of a disease or disease family that follows some temporal progression model, the data might make sense.

Notes

The age at each disease flag date needs to be computed separately for each flag.
Double pivoting creates a wide structure with the flags segregated by at_age group.
TABULATE has built-in percentage calculations.
A custom format maps each 'flag asserts the condition is present' state to its corresponding disease name.
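One detail of the first note may be worth checking. With its default (discrete) method, INTCK('year', ...) counts calendar-year boundaries crossed, not whole years elapsed. The sketch below is only an illustration: the data set name age_check and its values are made up, and switching to the 'continuous' method is a suggestion of this note, not something the answer does. Since only an integer age at enrollment is known, either result is an approximation of the true age.

```sas
/* Sketch: compare boundary counting with whole elapsed years.      */
/* The data set name and the values are hypothetical.               */
data age_check;
  enrolled      = '31dec2000'd;
  date1         = '01jan2001'd;   /* one day after enrollment */
  age_at_enroll = 39;

  /* default (discrete) method: counts 1 January boundaries crossed */
  at_age_discrete = age_at_enroll + intck('year', enrolled, date1);

  /* continuous method: counts whole years elapsed since enrolled   */
  at_age_continuous = age_at_enroll
                    + intck('year', enrolled, date1, 'continuous');

  format enrolled date1 yymmdd10.;
  put at_age_discrete= at_age_continuous=;
run;
```

Here the one-day gap crosses a year boundary, so the discrete result moves the person from the 30-39 group into the 40-49 group, while the continuous result leaves them at 39.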

Example:

Consider the flagging of 5 diagnoses related to the dreaded Werewolf progression.

proc format;
  value agegrp
    30-39 = '30-39'
    40-49 = '40-49'
    50-59 = '50-59'
    60-69 = '60-69'
    70-79 = '70-79'
    80-high = '80 +'
  ;
  invalue agegrp
    '30-39' = 30
    '40-49' = 40
    '50-59' = 50
    '60-69' = 60
    '70-79' = 70
  ;

  * flag1 to flag5 are progression of Werewolf!;
  value $flag_state_to_disease
    flag1_1='Animal Bite'
    flag2_1='Hallucination'
    flag3_1='Onychogryphosis'
    flag4_1='Hypertrichosis'
    flag5_1='Hyperdontia'
    other=' '
  ;
run;

* generate some sample data;
%macro RandBetween(min, max);
   (&min + floor((1+&max-&min)*rand("uniform")))
%mend;

data have;
  call streaminit(123);

  do id = 1 to 10000;
    enrolled = '01jan2000'd + (1 + floor((1+3650-1)*rand("uniform")));
    age_at_enroll = 30 + %RandBetween(0, 49);

    flag1 = rand('uniform') < 0.25;
    date1 = enrolled + %RandBetween(0, 2500);

    flag2 = rand('uniform') < 0.25;
    date2 = date1 + %RandBetween(0,2500);

    flag3 = rand('uniform') < 0.25;
    date3 = date2 + %RandBetween(0,2500);

    flag4 = rand('uniform') < 0.25;
    date4 = date3 + %RandBetween(0,2500);

    flag5 = rand('uniform') < 0.25;
    date5 = date4 + %RandBetween(0,2500);
    output;
  end;

  * force a 5 disease situation for each age group;
  enrolled = '01jan2000'd;
  do age_at_enroll = 30 to 70 by 10;
    flag1=1; flag2=1; flag3=1; flag4=1; flag5=1;
    date1=enrolled+10;
    date2=date1+10;
    date3=date2+10;
    date4=date3+10;
    date5=date4+10;
    output;
    id + 1;
  end;

  format enrolled date: yymmdd10. flag: 1.;
run;

* pivot to tall structure;
data tall(keep=id at_age disease);
  set have;
  array dates date1-date5;
  array flags flag1-flag5;

  * row wise transposition of flags as disease names and computed at_age;
  do _n_ = 1 to dim(dates);
    at_age = age_at_enroll + intck('year', enrolled, dates(_n_));
    flag_state = catx('_', vname(flags(_n_)), flags(_n_));
    disease = put(flag_state, flag_state_to_disease.);
    output;
  end;
run;

* pivot back to wide structure, segregating within id the at_age groups;
proc transpose data=tall out=wide1 (label='diseases per id agegroup') prefix=disease;
  by id at_age;
  var disease;
  format at_age agegrp.;
run;

* computed values for tabulation;
data wide2(keep=at_age disease_count disease_list);
  set wide1;
  disease_count = 5 - cmiss(of disease1-disease5);
  length disease_list $100;
  disease_list = coalescec (catx(', ', of disease1-disease5), '* NONE *');
run;

ods html file='tabulation.html' style=plateau;
title;

proc tabulate data=wide2;
  class disease_count disease_list at_age;
  table
    disease_count*disease_list
    ,
    at_age * (n*f=comma9. colpctn)
    / nocellmerge
  ;
run;

ods html close;
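As a rough illustration of what the COLPCTN statistic in the TABLE statement reports, the sketch below tabulates a handful of made-up rows. The demo data set, its values, and the two-group agegrp format are assumptions for illustration only and are not part of the answer's data. It is meant to be run in a fresh session, because it redefines agegrp.

```sas
/* Sketch with made-up rows: one row per person and age group. */
proc format;
  value agegrp
    30-39 = '30-39'
    40-49 = '40-49'
  ;
run;

data demo;
  length disease_list $40;
  input at_age disease_list $;
  datalines;
35 flag1+flag2
37 flag1+flag2
38 flag2+flag3
41 flag1+flag2
44 flag2+flag3
46 flag2+flag3
48 flag2+flag3
;
run;

/* COLPCTN divides each cell count by the total count of its column, */
/* so the percentages add up to 100 within each age group.           */
proc tabulate data=demo;
  class disease_list at_age;
  format at_age agegrp.;
  table disease_list, at_age * (n*f=comma9. colpctn);
run;
```

In the 30-39 column, flag1+flag2 accounts for 2 of the 3 rows (about 67%), and in the 40-49 column for 1 of the 4 rows (25%). The denominator is therefore the number of rows in that age group of the input table, which is worth keeping in mind when reading the percentages as "share of all individuals at that age".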

[Image: HTML output of the tabulation]
