How can I get an identification number within each group?
Problem description
The following is a brief sample of my data sheet:
stnd_y person_id recu_day date
2002 100 20020929 02-09-29
2002 100 20020930 02-09-30
2002 100 20021002 02-10-02
2002 101 20020927 02-09-27
2002 101 20020928 02-09-28
2002 102 20021001 02-10-01
2002 103 20021003 02-10-03
2002 104 20021108 02-11-08
2002 104 20021112 02-11-12
And I want it to look like this:
stnd_y person_id recu_day date Admission
2002 100 20020929 02-09-29 1
2002 100 20020930 02-09-30 2
2002 100 20021002 02-10-02 3
2002 101 20020927 02-09-27 1
2002 101 20020928 02-09-28 2
2002 102 20021001 02-10-01 1
2002 103 20021003 02-10-03 1
2002 104 20021108 02-11-08 1
2002 104 20021112 02-11-12 2
I mean, I want to create a variable that counts admissions per person, using recu_day and date (both of these variables hold the date of hospitalization).
Then I tried the following in SAS:
proc sort data=old out=new;
by person_id recu_day;
data new1;
set new;
retain admission 0;
by person_id recu_day;
if recu_day^=lag(recu_day) and(or) person_id^=lag(person_id) then
admission+1;
run;
and also,
data new1;
set new ;
by person_id recu_day;
retain adm 0;
if first.person_id and(or) first.recu_day then admission=admission+1;
run;
But neither of these works. How can I solve this?
Recommended answer
You're pretty close with the second attempt, but your main problem is that you don't reset admission each time person_id changes.
It's also not necessary to use first.recu_day, as this is 1 for every record in your sample data. first.person_id is sufficient: reset the counter when it is 1 (the first row of a new person_id), then add 1 on every row, so the number keeps increasing for as long as person_id is unchanged from the previous row.
Including recu_day in the by statement is still useful, however, as this will force an error if the data isn't sorted properly.
data have;
input stnd_y person_id recu_day date :yymmdd8.;
format date yymmdd8.;
datalines;
2002 100 20020929 02-09-29
2002 100 20020930 02-09-30
2002 100 20021002 02-10-02
2002 101 20020927 02-09-27
2002 101 20020928 02-09-28
2002 102 20021001 02-10-01
2002 103 20021003 02-10-03
2002 104 20021108 02-11-08
2002 104 20021112 02-11-12
;
run;
data want;
set have;
by person_id recu_day;
if first.person_id then admission=0;
admission+1;
run;
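To see why first.person_id alone is enough, it can help to look at the automatic FIRST. flags directly. The following is a minimal sketch, assuming the have dataset created above; the names have_sorted, flags, first_person and first_day are made up for illustration. It sorts the data explicitly, copies the flags into ordinary variables (the automatic FIRST. variables are not written to the output dataset), and prints them next to the counter.

```sas
/* Sort first so that BY-group processing is valid. */
proc sort data=have out=have_sorted;
  by person_id recu_day;
run;

data flags;
  set have_sorted;
  by person_id recu_day;

  /* FIRST. variables are temporary, so copy them into
     regular variables in order to inspect them. */
  first_person = first.person_id;
  first_day    = first.recu_day;

  /* Reset at the start of each person, then count every row.
     The sum statement (admission + 1) retains its value across rows. */
  if first.person_id then admission = 0;
  admission + 1;
run;

proc print data=flags noobs;
  var person_id recu_day first_person first_day admission;
run;
```

With the sample data, first_day should be 1 on every row, because each person_id and recu_day combination occurs only once, while first_person is 1 only on the first row of each person.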