cSPADE data mining in R using arulesSequences - Error while converting to "transactions" format

Problem description

I'm having trouble converting my data into a cSPADE-compatible format.

My data frame looks like this:

 key type1 type2 type3 
 A-1  A     B     C
 B-2  P     Q    NA
 C-3  X     NA   NA

When I use dataset1 <- as(dataset, "transactions") and then run

rules <- cspade(dataset1, parameter = list(support = 0.4), control = list(verbose = TRUE))

it throws an error:

Error in cspade(dataset1, parameter = list(support = 0.4), control = list(verbose = TRUE)) : slot transactionInfo: missing 'sequenceID' or 'eventID'

How can the above dataset be converted into a cSPADE-compatible format?

Solution

Try this.

Put the source dataset in this format:

1 3 A B C
2 2 P Q    
3 1 X

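The first column is the sequence ID and the second is the length of the sequence, followed by the elements of the sequence. Note that read_baskets takes one leading column per name in info, so with the three names used below the third field of each line (A, P, X) is read as SIZE rather than as an item; to keep every item, each line needs three leading columns (sequenceID, eventID, SIZE), for example 1 1 3 A B C. Then: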

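library(arulesSequences)
# Each name in info labels one leading column of the file; the remaining fields are items
data <- read_baskets(con = "./input_file.txt", info = c("sequenceID", "eventID", "SIZE"))
rules <- cspade(data, parameter = list(support = 0.4), control = list(verbose = TRUE))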

Let me know if this works.
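The input file does not have to be typed by hand; it can be generated from the data frame in the question. The following is a minimal sketch, not a definitive implementation. It assumes that each row is one sequence containing a single event (so eventID is always 1) and it writes three leading fields per line, because read_baskets assigns one leading field to each name in info. The names item_cols, basket_lines, and input_file, the temporary path, and the support of 0.3 are illustrative choices, not part of the original answer.

```r
# The question's data frame, rebuilt here so the sketch is self-contained.
dataset <- data.frame(
  key   = c("A-1", "B-2", "C-3"),
  type1 = c("A", "P", "X"),
  type2 = c("B", "Q", NA),
  type3 = c("C", NA, NA),
  stringsAsFactors = FALSE
)

item_cols <- c("type1", "type2", "type3")

# One text line per row: sequenceID, eventID, SIZE, then the non-NA items.
# Assumption: every row is a sequence with exactly one event (eventID = 1).
basket_lines <- vapply(seq_len(nrow(dataset)), function(i) {
  items <- unlist(dataset[i, item_cols], use.names = FALSE)
  items <- items[!is.na(items)]
  paste(c(i, 1L, length(items), items), collapse = " ")
}, character(1))

input_file <- file.path(tempdir(), "input_file.txt")  # hypothetical location
writeLines(basket_lines, input_file)

# Mining step, run only when arulesSequences is installed.
if (requireNamespace("arulesSequences", quietly = TRUE)) {
  library(arulesSequences)
  data <- read_baskets(con = input_file,
                       info = c("sequenceID", "eventID", "SIZE"))
  # 0.3 is an illustrative threshold: each item occurs in 1 of 3 sequences.
  seqs <- cspade(data, parameter = list(support = 0.3))
  inspect(seqs)
}
```

With this layout the first item of each row stays among the items instead of being consumed as SIZE.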

This is my output:

parameter specification:
support : 0.4
maxsize :  10
maxlen  :  10

algorithmic control:
bfstype  : FALSE
verbose  :  TRUE
summary  : FALSE
tidLists : FALSE

preprocessing ... 1 partition(s), 0 MB [0.1s]
mining transactions ... 0 MB [0.06s]
reading sequences ... [0s]

total elapsed time: 0.16s

 > inspect(rules)
items   support 
1 <{B}> 0.3333333 
2 <{C}> 0.3333333 
3 <{Q}> 0.3333333 
4 <{B,   
 C}> 0.3333333
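An alternative that skips the intermediate file is to attach sequenceID and eventID to the transactions object directly, since those are the two columns the error message reports as missing. This is only a sketch under the same assumption as before: each row of the data frame is one sequence with a single event. The names item_cols, item_list, and info are illustrative, and the support of 0.3 is chosen only so that the three-row example returns something.

```r
# The question's data frame, rebuilt here so the sketch is self-contained.
dataset <- data.frame(
  key   = c("A-1", "B-2", "C-3"),
  type1 = c("A", "P", "X"),
  type2 = c("B", "Q", NA),
  type3 = c("C", NA, NA),
  stringsAsFactors = FALSE
)

item_cols <- c("type1", "type2", "type3")

# One character vector of non-NA items per row.
item_list <- lapply(seq_len(nrow(dataset)), function(i) {
  items <- unlist(dataset[i, item_cols], use.names = FALSE)
  items[!is.na(items)]
})

# The two columns cspade() looks for in transactionInfo.
# Assumption: one event per sequence, rows already ordered by sequenceID.
info <- data.frame(
  sequenceID = seq_len(nrow(dataset)),
  eventID    = rep(1L, nrow(dataset))
)

# Mining step, run only when arulesSequences (which attaches arules) is installed.
if (requireNamespace("arulesSequences", quietly = TRUE)) {
  library(arulesSequences)
  trans <- as(item_list, "transactions")
  transactionInfo(trans) <- info
  seqs <- cspade(trans, parameter = list(support = 0.3))
  inspect(seqs)
}
```

If the columns type1, type2, type3 are meant to be ordered steps rather than one itemset, each non-NA cell would instead become its own transaction with an increasing eventID within the same sequenceID.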
