How to check percentage overlap in SAS

Question

I'm new to SAS and need to find a way to do the following:

I have two datasets:

  • Users (user_id, friends) (friends is a list of user_ids separated by ",")
  • Reviews (user_id, review_id, business_id, text)

I've merged both on user_id. Now I need to know what percentage of the reviews written by a user's friends are about the same business(es) that the user has reviewed.

I guess I need a stored procedure for this (but I'm also new to SQL). Any tips on how to start?

Recommended answer

I would start by restructuring your users table so that each friend is stored as a separate record rather than as part of a single list value:

data users(drop=friends i);
  set users;
  /* remove blanks, then write one row per entry in the comma-separated list */
  do i=1 to countw(compress(friends),',');
    friend=scan(compress(friends),i,',');
    output;
  end;
run;
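To see what this reshaping step produces, here is a minimal sketch on made-up data. The dataset names `users_demo` and `users_long`, the sample ids, and the variable lengths are assumptions for illustration only; the column name `friends` follows the question.

```sas
/* Hypothetical toy input: one row per user, friends as a comma-separated list */
data users_demo;
  infile datalines dlm='|' truncover;
  input user_id :$8. friends :$40.;
  datalines;
u1|u2, u3
u2|u1
;
run;

/* One output row per (user_id, friend) pair */
data users_long(drop=friends i);
  set users_demo;
  do i = 1 to countw(compress(friends), ',');
    friend = scan(compress(friends), i, ',');
    output;
  end;
run;

/* users_long should now hold three rows:
   u1 u2
   u1 u3
   u2 u1                                   */
proc print data=users_long noobs;
run;
```

Note that a user whose `friends` value is empty yields a word count of 0, so the loop never runs and that user does not appear in the output at all.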

You can then calculate the percentage by joining that table to reviews twice, once for the user and once for the friend:

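proc sql;
create table want as
select t1.user_id
      /* share of (user review, friend review) pairs that name the same business */
      ,sum(case when t3.business_id=t2.business_id then 1 else 0 end)/count(*) as percentage
from users t1
/* t2: the user's own reviews */
inner join reviews t2
  on t1.user_id=t2.user_id
/* t3: the reviews written by each friend */
inner join reviews t3
  on t1.friend=t3.user_id
group by t1.user_id
;
quit;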
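One thing to be aware of: because the query above joins every review of the user to every review of each friend, its denominator is the number of review pairs, not the number of friend reviews. If what you want is the share of friend reviews whose business the user has also reviewed, a variant along the following lines may be closer. This is only a sketch on made-up data, not part of the original answer; the names `users_demo`, `reviews_demo`, `want_demo` and the aliases are hypothetical.

```sas
/* Hypothetical toy data: one row per (user, friend) pair */
data users_demo;
  input user_id $ friend $;
  datalines;
u1 u2
u1 u3
;
run;

data reviews_demo;
  input user_id $ review_id $ business_id $;
  datalines;
u1 r1 A
u1 r2 B
u2 r3 A
u3 r4 C
;
run;

proc sql;
  create table want_demo as
  select f.user_id
         /* friend reviews that hit a business the user reviewed / all friend reviews */
        ,sum(case when ub.business_id is not null then 1 else 0 end)/count(*) as percentage
  from users_demo f
  /* r: every review written by a friend of the user */
  inner join reviews_demo r
    on f.friend = r.user_id
  /* ub: distinct businesses the user reviewed, so each friend review is counted once */
  left join (select distinct user_id, business_id from reviews_demo) ub
    on  ub.user_id = f.user_id
    and ub.business_id = r.business_id
  group by f.user_id;
quit;
```

On this toy data, u1's friends wrote two reviews (businesses A and C) and u1 reviewed A and B, so the sketch gives 0.5 for u1. The pairwise query would give 0.25 on the same data, because it counts four review pairs of which one matches. Which definition is right depends on what "percentage" should mean for your analysis.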