比较并保持相同的角色观察 [英] Compare and keep character observations that are the same

查看:105
本文介绍了比较并保持相同的角色观察的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有以下数据集:

Date    ID    Company        
Jan05   1     Coca-Cola      
Jan05   2     Coca-Cola      
Jan05   3     Coca-Cola          
Jan05   4     Apple          
Jan05   5     Apple          
Jan05   6     Apple
Jan05   7     Microsoft     
Feb05   1     McDonald       
Feb05   2     McDonald       
Feb05   3     McDonald
Feb05   4     McDonald       
Feb05   5     McDonald       
Feb05   6     Microsoft        
 .
 .
 .
Jan06   1     Apple      
Jan06   2     Apple     
Jan06   3     Apple          
Jan06   4     Apple          
Jan06   5     Apple          
Jan06   6     Apple
Jan06   7     Apple     
Feb06   1     McDonald       
Feb06   2     McDonald       
Feb06   3     McDonald
Feb06   4     McDonald       
Feb06   5     McDonald       
Feb06   6     Lenova  
Feb06   7     Lenova       
 .
 .
Jan07   1     Apple      
Jan07   2     Apple     
Jan07   3     Apple          
Jan07   4     Microsoft          
Jan07   5     Lenovo          
Jan07   6     Apple
Jan07   7     Apple     
Feb07   1     TJmax       
Feb07   2     TJMax       
Feb07   3     TJMax
Feb07   4     TJMax       
Feb07   5     TJMax       
Feb07   6     TJMax  
Feb07   7     TJMax          
.
.
.
.
until July15

我想要做的是以下几点: 1:比较1月05日与1月06日,然后比较1月06日与1月7日... 2月05日与2月06日,2月06日与2月07日...对于两个日期. 2:每次计算ID的中间值时,我都不需要新的数据集.我只想确保在Jan05和Jan06中都存在两家公司,然后计算ID的中位数.

What I want to do are the following: 1: Compare January 05 with January 06, then January 06 with January 07...February 05 with February 06, February 06 with February 07....so on for each month get compute a median for ID when the same companies are present for both dates. 2: I don't want a new dataset each time I compute a median for ID. I merely want to make sure that both companies are present for lets say in Jan05 and Jan06, then compute a median for ID.

在SAS中执行此操作的最佳方法是什么?

Whats the best way to do this in SAS?

我的最终结果将如下所示:

My end result will look like this:

Date    Median_ID    
Jan05      2         
Jan06      4

Jan06      4     
Jan07      3

Feb05      3     
Feb06      3

Feb06      0
Feb07      0

从结果中可以看到:在1月5日和06日,唯一匹配的公司是Apple.在1月6日和1月7日,唯一匹配的公司是Apple. 因此,我们以公司匹配时间的ID中位数为准.

As you can see from the result: In Jan05 and 06, the only company that matches is Apple. In Jan06 and Jan07, the only company that matches again is Apple. So we take the median of ID for the time the companies match.

推荐答案

目前尚不清楚如何根据示例数据计算出最终结果-如果您包括所有中间步骤,则更容易按照您的解释进行操作一个月,例如Jan05.但是,您似乎可以使用类似于以下内容的SQL进行处理:

It isn't clear how you've calculated the end results from your sample data - it would be easier to follow your explanation if you included all the intermediate steps for one month, e.g. Jan05. However, this seems like something that you could approach with some SQL similar to the following:

data have;
input Date monyy5. ID Company $32.;
format Date monyy5.;
cards;
Jan05   1     Coca-Cola      
Jan05   2     Coca-Cola      
Jan05   3     Coca-Cola          
Jan05   4     Apple          
Jan05   5     Apple          
Jan05   6     Apple
Jan05   7     Microsoft     
Feb05   1     McDonald       
Feb05   2     McDonald       
Feb05   3     McDonald
Feb05   4     McDonald       
Feb05   5     McDonald       
Feb05   6     Microsoft        
Jan06   1     Apple      
Jan06   2     Apple     
Jan06   3     Apple          
Jan06   4     Apple          
Jan06   5     Apple          
Jan06   6     Apple
Jan06   7     Apple     
Feb06   1     McDonald       
Feb06   2     McDonald       
Feb06   3     McDonald
Feb06   4     McDonald       
Feb06   5     McDonald       
Feb06   6     Lenova  
Feb06   7     Lenova       
Jan07   1     Apple      
Jan07   2     Apple     
Jan07   3     Apple          
Jan07   4     Microsoft          
Jan07   5     Lenovo          
Jan07   6     Apple
Jan07   7     Apple     
Feb07   1     TJmax       
Feb07   2     TJMax       
Feb07   3     TJMax
Feb07   4     TJMax       
Feb07   5     TJMax       
Feb07   6     TJMax  
Feb07   7     TJMax   
;
run;

proc sql;
    create table want as
        select a.date, median(a.ID) as Median_ID from have a inner join have b 
            on  month(a.date)= month(b.date) 
            and year(a.date) = year(b.date) - 1
            and a.ID         = b.ID
            and a.company    = b.company
        group by a.date
        ;
quit;   

这篇关于比较并保持相同的角色观察的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆