Identifying groups/networks of customers
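Question
I am trying to create unique customer groups, where a group is determined by how customers interact with each other across transactions.
Here is a sample of the data: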
Transaction ID | Primary Customer | Cosigner | WANT: Customer Group |
---|---|---|---|
1 | 1 | 2 | A |
2 | 1 | 3 | A |
3 | 1 | 4 | A |
4 | 1 | 2 | A |
5 | 2 | 5 | A |
6 | 3 | 6 | A |
7 | 2 | 1 | A |
8 | 3 | 1 | A |
9 | 7 | 8 | B |
10 | 9 | | C |
Any suggestions would be much appreciated!
Recommended answer
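Your data can be treated as the edges of a graph, so what you are asking for is the set of connected subgraphs of that graph. This problem already has answers on Stack Overflow and on SAS Communities, but this question is more on topic than the older SO question. So let's post the subnet SAS macro answer from SAS Communities here, where it is easier to find.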
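To make "connected subgraphs" concrete, here is a minimal Python sketch, offered only as an illustration and not as the SAS answer itself. It labels the sample transactions with a small union-find structure. The names `edges`, `find`, `union` and `components` are made up for this example, and `None` stands in for the missing cosigner on transaction 10.

```python
# Sketch only: label connected components with a small union-find.
# `edges` reproduces the PRIMARY/COSIGN pairs from the question;
# None stands in for the missing cosigner on transaction 10.
edges = [(1, 2), (1, 3), (1, 4), (1, 2), (2, 5),
         (3, 6), (2, 1), (3, 1), (7, 8), (9, None)]

parent = {}


def find(x):
    """Return the representative of x's set, shortening the path as we go."""
    parent.setdefault(x, x)
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def union(a, b):
    """Merge the sets that contain a and b."""
    ra, rb = find(a), find(b)
    if ra != rb:
        parent[rb] = ra


for primary, cosign in edges:
    find(primary)                 # register the customer even without a partner
    if cosign is not None:
        union(primary, cosign)

# Collect customers under their representative.
groups = {}
for customer in parent:
    groups.setdefault(find(customer), set()).add(customer)

components = sorted(sorted(g) for g in groups.values())
print(components)
```

Each inner list is one customer group: customers 1 to 6 correspond to group A in the question, 7 and 8 to group B, and 9 to group C.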
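The %SUBNET macro is simple: it runs repeated PROC SQL queries to build the list of connected subgraphs until every original record has been assigned to a subgraph.
The macro is set up so that you pass in the name of the source dataset and the names of the two variables that hold the node IDs.
So first, let's convert your printout into an actual SAS dataset.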
data have;
input id primary cosign want $;
cards;
1 1 2 A
2 1 3 A
3 1 4 A
4 1 2 A
5 2 5 A
6 3 6 A
7 2 1 A
8 3 1 A
9 7 8 B
10 9 . C
;
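Now we can call the macro, telling it that PRIMARY and COSIGN are the variables holding the node IDs and that SUBNET is the name of the new variable that will hold the connected subgraph IDs. NOTE: by default this version treats the graph as directed.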
%subnet(in=have,out=want,from=primary,to=cosign,subnet=subnet);
Results:
Obs id primary cosign want subnet
1 1 1 2 A 1
2 2 1 3 A 1
3 3 1 4 A 1
4 4 1 2 A 1
5 5 2 5 A 1
6 6 3 6 A 1
7 7 2 1 A 1
8 8 3 1 A 1
9 9 7 8 B 2
10 10 9 . C 3
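The NOTE about directed graphs deserves a closer look. The following Python sketch mirrors the macro's grow-the-subnet loop under two assumptions that are mine, not the source's: unassigned nodes are picked in sorted order, and the function name `grow_subnets` and the second edge list are hypothetical. It suggests that the sample data comes out right with the directed default because node 1 reaches the rest of its group along PRIMARY to COSIGN, while a different edge list may be split unless the graph is treated as undirected (DIRECTED=0 in the macro).

```python
def grow_subnets(edges, directed=True):
    """Assign a subnet number to every node by repeatedly growing a frontier."""
    nodes = sorted({n for pair in edges for n in pair if n is not None})
    # Only complete pairs can connect two nodes.
    pairs = [(f, t) for f, t in edges if f is not None and t is not None]
    subnet = {n: None for n in nodes}
    subnet_id = 0
    for start in nodes:                       # "get next unassigned node"
        if subnet[start] is not None:
            continue
        subnet_id += 1
        subnet[start] = subnet_id
        while True:
            # Unassigned nodes one step away from the current subnet.
            new = {t for f, t in pairs
                   if subnet[f] == subnet_id and subnet[t] is None}
            if not directed:
                # Also follow edges backwards when the graph is undirected.
                new |= {f for f, t in pairs
                        if subnet[t] == subnet_id and subnet[f] is None}
            if not new:
                break
            for n in new:
                subnet[n] = subnet_id
    return subnet


# Sample data from the question: the directed default is enough here.
sample = [(1, 2), (1, 3), (1, 4), (1, 2), (2, 5),
          (3, 6), (2, 1), (3, 1), (7, 8), (9, None)]
sample_directed = grow_subnets(sample, directed=True)

# Hypothetical data where node 1 only appears as a cosigner.
reversed_only = [(2, 1), (3, 1)]
split = grow_subnets(reversed_only, directed=True)
joined = grow_subnets(reversed_only, directed=False)
print(sample_directed, split, joined)
```

If customer groups should not depend on who is listed as primary and who as cosigner, the undirected setting is the safer choice.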
Here is the code for the %SUBNET() macro.
%macro subnet(in=,out=,from=from,to=to,subnet=subnet,directed=1);
/*----------------------------------------------------------------------
SUBNET - Build connected subnets from pairs of nodes.
Input Table : FROM TO pairs of rows
Output Table: input data with &subnet added
Work Tables:
  NODES - List of all nodes in input.
  NEW   - List of new nodes to assign to current subnet.
Algorithm:
  Pick next unassigned node and grow the subnet by adding all connected
  nodes. Repeat until all unassigned nodes are put into a subnet.
To treat the graph as undirected set the DIRECTED parameter to 0.
----------------------------------------------------------------------*/
%local subnetid next getnext ;
%*----------------------------------------------------------------------
Put code to get next unassigned node into a macro variable. This query
is used in two places in the program.
-----------------------------------------------------------------------;
%let getnext= select node into :next from nodes where subnet=.;
%*----------------------------------------------------------------------
Initialize subnet id counter.
-----------------------------------------------------------------------;
%let subnetid=0;
proc sql noprint;
*----------------------------------------------------------------------;
* Get list of all nodes ;
*----------------------------------------------------------------------;
  create table nodes as
    select . as subnet, &from as node from &in where &from is not null
    union
    select . as subnet, &to as node from &in where &to is not null
  ;
*----------------------------------------------------------------------;
* Get next unassigned node ;
*----------------------------------------------------------------------;
  &getnext;
%do %while (&sqlobs) ;
*----------------------------------------------------------------------;
* Set subnet to next id ;
*----------------------------------------------------------------------;
  %let subnetid=%eval(&subnetid+1);
  update nodes set subnet=&subnetid where node=&next;
  %do %while (&sqlobs) ;
*----------------------------------------------------------------------;
* Get list of connected nodes for this subnet ;
*----------------------------------------------------------------------;
    create table new as
      select distinct a.&to as node
        from &in a, nodes b, nodes c
        where a.&from= b.node
          and a.&to= c.node
          and b.subnet = &subnetid
          and c.subnet = .
    ;
    %if "&directed" ne "1" %then %do;
    insert into new
      select distinct a.&from as node
        from &in a, nodes b, nodes c
        where a.&to= b.node
          and a.&from= c.node
          and b.subnet = &subnetid
          and c.subnet = .
    ;
    %end;
*----------------------------------------------------------------------;
* Update subnet for these nodes ;
*----------------------------------------------------------------------;
    update nodes set subnet=&subnetid
      where node in (select node from new )
    ;
  %end;
*----------------------------------------------------------------------;
* Get next unassigned node ;
*----------------------------------------------------------------------;
  &getnext;
%end;
*----------------------------------------------------------------------;
* Create output dataset by adding subnet number. ;
*----------------------------------------------------------------------;
  create table &out as
    select distinct a.*,b.subnet as &subnet
      from &in a , nodes b
      where a.&from = b.node
  ;
quit;
%mend subnet ;