SAS 中的 PROC SQL - 所有项目对 [英] PROC SQL in SAS - All Pairs of Items

查看:28
本文介绍了SAS 中的 PROC SQL - 所有项目对的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一个数据集,我需要在其中查看来自另一组的所有项目对.我在下面创建了一个玩具示例来进一步解释.

I have a dataset in which I need to look at all pairs of items that are together from within another group. I've created a toy example below to further explain.

BUNCH    FRUITS
1        apples
1        bananas
1        mangos
2        apples
3        bananas
3        apples
4        bananas
4        apples

我想要的是所有可能对的列表,并将它们在一组中一起出现的频率相加.理想情况下,我的输出如下所示:

What I want is a listing of all possible pairs and sum the frequency they occur together within a bunch. My output would ideally look like this:

FRUIT1    FRUIT2     FREQUENCY
APPLES    BANANAS    3
APPLES    MANGOS     1

我的最终目标是制作一些我最终能够导入 Gephi 进行网络分析的东西.为此,我需要一个 Source 和 Target 列(也就是上面的 FRUIT1 和 FRUIT2).

My end goal is to make something that I'll eventually be able to import into Gephi for a network analysis. For this I need a Source and Target column (aka FRUIT1 and FRUIT2 above).

我认为还有其他一些方法可以在不使用 PROC SQL(也许使用 PROC TRANSPOSE)的情况下解决这个问题,但这是我开始的地方.

I think there are a few other ways to approach this as well without using PROC SQL (Maybe using PROC TRANSPOSE) but this is where I've started.

解决方案

感谢您的帮助.以下示例代码适用于对类似内容感兴趣的任何人:

Thanks for the help. Sample code below for anyone interested in something similar:

proc sql;
    create table fruit_combo as
    select a.FRUIT as FRUIT1, b.FRUIT as FRUIT2, count(*) as FREQUENCY
    from FRUITS a, FRUITS b
    where a.BUNCH=b.BUNCH and and not a.FRUIT= b.FRUIT
    group by FRUIT1, FRUIT2;
    quit;

推荐答案

最简单的方法是在 t1.ID=t2.ID 和 t1.FRUIT ne t2.FRUIT 上对表进行笛卡尔(完全)联接.这将生成完整的组合集,然后您可以对其进行汇总.

Simplest approach is to do a cartesian (full) join of the table to itself, on t1.ID=t2.ID and t1.FRUIT ne t2.FRUIT. That will generate the full combination set, which you could then summarize.

这篇关于SAS 中的 PROC SQL - 所有项目对的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆