Identifying groups/networks of customers

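Question

I am trying to create unique customer groups, determined by how customers interact with each other across transactions.

Here is a sample of the data: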

Transaction ID   Primary Customer   Cosigner   WANT: Customer Group
1                1                  2          A
2                1                  3          A
3                1                  4          A
4                1                  2          A
5                2                  5          A
6                3                  6          A
7                2                  1          A
8                3                  1          A
9                7                  8          B
10               9                             C
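In this example, customer 1 is directly or indirectly connected to customers 2-6, so all transactions associated with customers 1-6 belong to group "A". Customers 7 and 8 are directly connected and are labeled as group "B". Customer 9 has no connections and is the sole member of group "C".

Any suggestions would be greatly appreciated!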

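Recommended answer

Your data can be viewed as the edges of a graph, so what you are asking for is the connected subgraphs of that graph. This question has answers on Stack Overflow and on SAS Communities, but this question is more on topic than that older SO question. So let's post the subnet SAS macro answer from SAS Communities here, where it is easier to find.

This simple macro uses repeated PROC SQL queries to build up the list of connected subgraphs until all of the original records have been assigned to a subgraph.

The macro is set up so that you can pass in the name of the source dataset and the names of the two variables that hold the node ids.

So first let's convert your printout into an actual SAS dataset.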

data have;
  input id primary cosign want $;
cards;
1 1 2 A
2 1 3 A
3 1 4 A
4 1 2 A
5 2 5 A
6 3 6 A
7 2 1 A
8 3 1 A
9 7 8 B
10 9 . C
;
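Now we can call the macro and tell it that PRIMARY and COSIGN are the variables holding the node ids, and that SUBNET is the name of the new variable that will hold the connected subgraph ids. NOTE: by default this version treats the graph as directed.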

%subnet(in=have,out=want,from=primary,to=cosign,subnet=subnet);

Results:

Obs    id    primary    cosign    want    subnet

  1     1       1          2        A        1
  2     2       1          3        A        1
  3     3       1          4        A        1
  4     4       1          2        A        1
  5     5       2          5        A        1
  6     6       3          6        A        1
  7     7       2          1        A        1
  8     8       3          1        A        1
  9     9       7          8        B        2
 10    10       9          .        C        3
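
As a quick check on the result above, the new SUBNET variable can be cross-tabulated against the hand-assigned WANT group. This is only a minimal sketch, assuming the output dataset is named WANT as in the macro call; the column alias n_transactions is illustrative and not part of the original answer.

```sas
proc sql;
  /* One row per (want, subnet) combination.                       */
  /* If the grouping matches, each WANT letter pairs with exactly  */
  /* one SUBNET id, and each SUBNET id with exactly one letter.    */
  select want, subnet, count(*) as n_transactions
    from want
    group by want, subnet
  ;
quit;
```

With the output printed above, this should list three combinations (A with 1, B with 2, C with 3). Any letter appearing with two different subnet ids would indicate a group that was split.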

Here is the code for the %SUBNET() macro.

%macro subnet(in=,out=,from=from,to=to,subnet=subnet,directed=1);
/*----------------------------------------------------------------------
SUBNET - Build connected subnets from pairs of nodes.
Input Table :FROM TO pairs of rows
Output Table:input data with &subnet added
Work Tables:
  NODES - List of all nodes in input.
  NEW - List of new nodes to assign to current subnet.

Algorithm:
Pick next unassigned node and grow the subnet by adding all connected
nodes. Repeat until all unassigned nodes are put into a subnet.

To treat the graph as undirected set the DIRECTED parameter to 0.
----------------------------------------------------------------------*/
%local subnetid next getnext ;
%*----------------------------------------------------------------------
Put code to get next unassigned node into a macro variable. This query 
is used in two places in the program.
-----------------------------------------------------------------------;
%let getnext= select node into :next from nodes where subnet=.;
%*----------------------------------------------------------------------
Initialize subnet id counter.
-----------------------------------------------------------------------;
%let subnetid=0;
proc sql noprint;
*----------------------------------------------------------------------;
* Get list of all nodes ;
*----------------------------------------------------------------------;
  create table nodes as
    select . as subnet, &from as node from &in where &from is not null
    union
    select . as subnet, &to as node from &in where &to is not null
  ;
*----------------------------------------------------------------------;
* Get next unassigned node ;
*----------------------------------------------------------------------;
  &getnext;
%do %while (&sqlobs) ;
*----------------------------------------------------------------------;
* Set subnet to next id ;
*----------------------------------------------------------------------;
  %let subnetid=%eval(&subnetid+1);
  update nodes set subnet=&subnetid where node=&next;
  %do %while (&sqlobs) ;
*----------------------------------------------------------------------;
* Get list of connected nodes for this subnet ;
*----------------------------------------------------------------------;
    create table new as
      select distinct a.&to as node
        from &in a, nodes b, nodes c
        where a.&from= b.node
          and a.&to= c.node
          and b.subnet = &subnetid
          and c.subnet = .
    ;
%if "&directed" ne "1" %then %do;
    insert into new 
      select distinct a.&from as node
        from &in a, nodes b, nodes c
        where a.&to= b.node
          and a.&from= c.node
          and b.subnet = &subnetid
          and c.subnet = .
    ;
%end;
*----------------------------------------------------------------------;
* Update subnet for these nodes ;
*----------------------------------------------------------------------;
    update nodes set subnet=&subnetid
      where node in (select node from new )
    ;
  %end;
*----------------------------------------------------------------------;
* Get next unassigned node ;
*----------------------------------------------------------------------;
  &getnext;
%end;
*----------------------------------------------------------------------;
* Create output dataset by adding subnet number. ;
*----------------------------------------------------------------------;
  create table &out as
    select distinct a.*,b.subnet as &subnet
      from &in a , nodes b
      where a.&from = b.node
  ;
quit;
%mend subnet ;
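
The header comment of the macro says that setting DIRECTED to 0 treats the graph as undirected. For customer groups this may be the safer choice, because with the default setting links are only followed from the FROM variable to the TO variable. The sketch below illustrates the difference on a small made-up dataset; the names HAVE2, WANT_DIRECTED and WANT_UNDIRECTED are hypothetical and do not come from the original answer.

```sas
/* Hypothetical two-transaction example: 5 -> 1 and 1 -> 2 */
data have2;
  input id primary cosign;
cards;
1 5 1
2 1 2
;

/* Default (directed=1): links are only followed from PRIMARY to COSIGN */
%subnet(in=have2,out=want_directed,from=primary,to=cosign,subnet=subnet);

/* directed=0: links are followed in both directions */
%subnet(in=have2,out=want_undirected,from=primary,to=cosign,subnet=subnet,directed=0);

proc print data=want_directed; run;
proc print data=want_undirected; run;
```

If the macro happens to start from customer 1, the directed run cannot travel backwards along the 5 -> 1 link, so customer 5 may end up in a subnet of its own even though it is connected to customers 1 and 2. The undirected run should place all three customers in one subnet regardless of which node is picked first.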
