Adding item information to transaction object in arules
Problem description
I am using the arules package to find association rules in point-of-sale retail data. I am extracting transaction detail from a database, then placing it in a transactions object. I'm new to arules and am trying to figure out how to populate the itemInfo data frame in the transactions object. Right now, I'm just bringing in the transaction and item IDs (both numeric), which provide little context. I would like to be able to add an item description, as well as product hierarchy levels.
Below is the process I'm using today:
Data comes through from the database in the below format:
Transaction_ID  Item_ID
--------------  -------
100             1
100             2
100             3
101             2
101             3
102             1
102             2
To create the transactions object, I'm using the command below, as described in the arules documentation:
txdata <- as(split(txdata[, "Item_ID"], txdata[, "Transaction_ID"]), "transactions")
Note: I've found that I need to have a numeric value for the Item_ID, otherwise I run into major performance issues using a string (due to poor performance of split when using factored strings).
Create and review the association rules:
rules <- apriori(txdata, parameter = list(support=0.00015, confidence=0.5))
inspect(head((sort(rules, by="confidence")), n=5))
When the rules come back, they are listed by Item_ID, which is not helpful to me. I want to be able to display them by the ID and/or description. I would also like to take advantage of the aggregation features built into the arules package.
Recommended answer
You can change the names of items using itemInfo. Here is an example:
R> df <- data.frame(
TID = c(1,1,2,2,2,3),
item=c("a","b","a","b","c", "b")
)
R> trans <- as(split(df[,"item"], df[,"TID"]), "transactions")
### this is how you replace item labels and set a hierarchy (here level1)
R> myLabels <- c("milk", "butter", "beer")
R> myLevel1 <- c("dairy", "dairy", "beverage")
R> itemInfo(trans) <- data.frame(labels = myLabels, level1 = myLevel1)
R> inspect(trans)
  items     transactionID
1 {milk,
   butter}  1
2 {milk,
   butter,
   beer}    2
3 {butter}  3
### now you can use aggregate()
R> inspect(aggregate(trans, itemInfo(trans)[["level1"]]))
  items       transactionID
1 {dairy}     1
2 {beverage,
   dairy}     2
3 {dairy}     3
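For data like that in the question, where the items are numeric IDs, the labels would more likely come from a lookup table than from hand-typed vectors. The following is a minimal sketch under that assumption: the item_lookup data frame, its column names, and its contents are hypothetical and only stand in for whatever the database provides. The point it illustrates is matching the lookup rows against itemLabels() so that each description lands on the right item instead of relying on row order.

```r
library(arules)

# Transaction detail in the layout shown in the question
tx <- data.frame(
  Transaction_ID = c(100, 100, 100, 101, 101, 102, 102),
  Item_ID        = c(1, 2, 3, 2, 3, 1, 2)
)
txdata <- as(split(tx[, "Item_ID"], tx[, "Transaction_ID"]), "transactions")

# Hypothetical lookup table (assumed to come from the product database);
# the rows are deliberately not in the same order as the item labels
item_lookup <- data.frame(
  Item_ID     = c(3, 1, 2),
  Description = c("beer", "milk", "butter"),
  Level1      = c("beverage", "dairy", "dairy")
)

# Item labels are the IDs as character strings; find the lookup row for each
idx <- match(itemLabels(txdata), as.character(item_lookup$Item_ID))
stopifnot(!anyNA(idx))  # every item must have a lookup row

# Replace the labels and keep the original ID and hierarchy level alongside
itemInfo(txdata) <- data.frame(
  labels  = item_lookup$Description[idx],
  item_id = item_lookup$Item_ID[idx],
  level1  = item_lookup$Level1[idx]
)

inspect(txdata)
```

Keeping the original ID as an extra itemInfo column means the descriptions can be shown in output while the ID stays available for joining results back to the database.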
You can find more info using class? transactions and ? aggregate.
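As a sketch of how this plays out for the rules themselves: rules mined from transactions whose itemInfo has been set should be reported with the new labels rather than the raw item codes. The support and confidence thresholds below are arbitrary values chosen for this toy data set, not recommendations.

```r
library(arules)

# Same toy data as in the example above
df <- data.frame(
  TID  = c(1, 1, 2, 2, 2, 3),
  item = c("a", "b", "a", "b", "c", "b")
)
trans <- as(split(df[, "item"], df[, "TID"]), "transactions")

# Set labels and one hierarchy level before mining
itemInfo(trans) <- data.frame(
  labels = c("milk", "butter", "beer"),
  level1 = c("dairy", "dairy", "beverage")
)

# Thresholds are arbitrary for this tiny example
rules <- apriori(trans,
                 parameter = list(support = 0.3, confidence = 0.5),
                 control   = list(verbose = FALSE))

# Rules are displayed with the item labels set above
inspect(head(sort(rules, by = "confidence"), n = 5))
```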
Hope this helps, Michael