SAS哈希表(右联接/联合) [英] SAS Hash Table (Right Join/Union)

查看:161
本文介绍了SAS哈希表(右联接/联合)的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在寻找一种在联接和联合之间混合的查找方式.我的主要数据集中有大量记录,因此我希望做的事情不是多对多矩阵的强力"方法.

I'm looking to do a lookup that is sort of a hybrid between a join and union. I have a large number of records in my main dataset, so I'm looking to do something that wouldn't be a "brute force" method of a many-to-many matrix.

这是我的主要数据集,称为全部",其中已经包含列出的每种产品的价格.

Here is my main dataset, called 'All', which already contains price for each of the products listed.

product date        price   
apple   1/1/2011    1.05    
apple   1/3/2011    1.02
apple   1/4/2011    1.07

pepper  1/2/2011    0.73
pepper  1/3/2011    0.75
pepper  1/6/2011    0.79

我的其他数据集(价格"-此处未显示,但包含相同的两个键,产品和日期)包含所有产品在每个可能日期的价格.我想创建的哈希表查找实际上将在所有"表中查找每个日期,并在该日期输出 ALL 产品的价格,从而得到一个这样的表:

My other data dataset ('Prices' - not shown here, but contains the same two keys, product and date) contains prices for all products, on each possible date. The hash table look up I would like to create would essentially look up every date in the 'All' table, and output prices for ALL products for that date, resulting in a table such as this:

product date        price
apple   1/1/2011    1.05    
pepper  1/1/2011    0.71 *
apple   1/2/2011    1.04 *
pepper  1/2/2011    0.73
apple   1/3/2011    1.02
pepper  1/3/2011    0.75
apple   1/4/2011    1.07
pepper  1/4/2011    0.76 *
apple   1/6/2011    1.10 *
pepper  1/6/2011    0.79

也就是说,只要一种产品的日期和价格指定为全部"表,其他所有产品都应将其从查找表中提取出来. 星号表示价格是从价格表中查找的,包含产品价格的新行实际上已插入到新表中.

That is, as long as one product has a date and price specified 'All' table, all other products should pull that in from the lookup table as welll. The asterisks indicate that the price was looked up from the prices table, and new rows containing prices for products were essentially inserted into the new table.

如果哈希表不是解决此问题的好方法,请让我知道替代方法.

If hash table are not a great way to go about this, please let me know alternative methods.

推荐答案

这远非优雅,但很好奇下面的内容是否能为您带来理想的结果?由于您在ALL中每个键都有多个记录(我假设您要维护),因此我基本上将ALL与PRICES中所有日期中有日期的记录合并在一起,但是我添加了一个Except以便排除ALL中已经存在的记录.不知道这是否有意义,或者正在做您想要的事情.当然不符合优雅"的条件.

Well this is far from elegant, but curious if the below gives you the desired result? Since you have multiple records per key in ALL (which I assume you want to maintain), I basically unioned ALL with the records in PRICES that have a date in All, but I added an Except so as to excluded records that were already in ALL. No idea if this makes sense, or is doing what you want. Certainly doesn't qualify as 'elegant'.

data all;
  input product $7. date mmddyy10. price;
  Y=1;
  format date mmddyy10.;
  cards;
apple  01/01/2011  1.05
apple  01/01/2011  1.05
apple  01/03/2011  1.02
pepper 01/02/2011  0.73
pepper 01/03/2011  0.75
pepper 01/06/2011  0.79
;
run;
data prices;
  input product $7. date mmddyy10. price;
  format date mmddyy10.;
  cards;
apple  01/01/2011  1.05
apple  01/02/2011  1.04
apple  01/03/2011  1.02
apple  01/04/2011  1.07
apple  01/05/2011  1.01
pepper 01/01/2011  0.70
pepper 01/02/2011  0.73
pepper 01/03/2011  0.75
pepper 01/04/2011  0.76
pepper 01/05/2011  0.77
pepper 01/06/2011  0.79
;
run;

proc sql;
  create table want as 
  select * from all 
  union corr all
  ( (select product,date,price from
      prices
      where date IN (select distinct date from all)
    )
    except corr
    select product,date,price from all
  )
  ;
quit;

这篇关于SAS哈希表(右联接/联合)的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆