Adding item information to transaction object in arules


Problem description

I am using the arules package to find association rules in point-of-sale retail data. I extract transaction detail from a database and place it in a transactions object. I'm new to arules and am trying to figure out how to populate the itemInfo data frame in that object. Right now I only bring in the transaction and item IDs (both numeric), which provide little context. I would like to add an item description as well as product hierarchy levels.

Below is the process I'm using today:

  1. Data comes from the database in the format below:

Transaction_ID     Item_ID
--------------     ----------- 
100                1
100                2
100                3
101                2
101                3
102                1
102                2

  • To create the transactions object, I use the command below, as described in the arules documentation:

    txdata <- as(split(txdata[, "Item_ID"], txdata[, "Transaction_ID"]), "transactions")
    

    Note: I've found that Item_ID needs to be numeric. With string IDs I run into major performance problems, because split performs poorly on strings stored as factors.

    Create and review the association rules:

    rules <- apriori(txdata, parameter = list(support = 0.00015, confidence = 0.5))
    inspect(head(sort(rules, by = "confidence"), n = 5))
    

  • When the rules come back, they are listed by Item_ID, which is not helpful to me. I want to display them by ID and/or description. I would also like to take advantage of the aggregation features built into the arules package.

    Recommended answer

    You can change the names of items using itemInfo. Here is an example:

    R> df <- data.frame(
       TID = c(1,1,2,2,2,3), 
       item=c("a","b","a","b","c", "b")
     )
    R> trans <- as(split(df[,"item"], df[,"TID"]), "transactions")
    
    ### this is how you replace item labels and set a hierarchy (here level1);
    ### the new labels follow the order of itemLabels(trans), here a, b, c
    R> myLabels <- c("milk", "butter", "beer")
    R> myLevel1 <- c("dairy", "dairy", "beverage")
    R> itemInfo(trans) <- data.frame(labels = myLabels, level1 = myLevel1)
    
    R> inspect(trans)
         items    transactionID
      1 {milk,                
         butter}             1
      2 {milk,                
         butter,              
         beer}               2
      3 {butter}             3
    
    ### now you can use aggregate()
    R> inspect(aggregate(trans, itemInfo(trans)[["level1"]]))
         items      transactionID
      1 {dairy}                1
      2 {beverage,              
         dairy}                2
      3 {dairy}                3
    

    You can find more information with class? transactions and ?aggregate.
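
    With numeric IDs pulled from a database, as in the question, the descriptions have to be supplied in the same order in which the transactions object stores its items. The following is a minimal sketch of one way to do that with match(). The lookup data frame item_lookup and its columns description and category are hypothetical names invented for this illustration, not part of the original question or of arules.

```r
library(arules)

# Transaction detail in the format shown in the question
txdata <- data.frame(
  Transaction_ID = c(100, 100, 100, 101, 101, 102, 102),
  Item_ID        = c(1, 2, 3, 2, 3, 1, 2)
)
trans <- as(split(txdata[, "Item_ID"], txdata[, "Transaction_ID"]), "transactions")

# Hypothetical lookup table, deliberately not sorted by Item_ID
item_lookup <- data.frame(
  Item_ID     = c(3, 1, 2),
  description = c("beer", "milk", "butter"),
  category    = c("beverage", "dairy", "dairy"),
  stringsAsFactors = FALSE
)

# Align the lookup rows with the item order inside the transactions object
idx <- match(itemLabels(trans), as.character(item_lookup$Item_ID))
stopifnot(!anyNA(idx))  # every item must have a lookup row

# Keep the ID in the label and add the category as a hierarchy level
itemInfo(trans) <- data.frame(
  labels = paste(item_lookup$Item_ID[idx], item_lookup$description[idx]),
  level1 = item_lookup$category[idx],
  stringsAsFactors = FALSE
)

inspect(trans)
```

    Matching on itemLabels(trans) rather than assuming the lookup table is already sorted avoids silently attaching a description to the wrong item.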

    Hope this helps, Michael
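
    To connect the answer back to the rule-mining step in the question, here is a small sketch that mines rules from the relabeled toy transactions and again from the transactions aggregated to level1. The support and confidence thresholds are arbitrary values chosen only so that the three toy transactions produce some rules; they are assumptions, not recommendations.

```r
library(arules)

# Toy data from the answer
df <- data.frame(
  TID  = c(1, 1, 2, 2, 2, 3),
  item = c("a", "b", "a", "b", "c", "b"),
  stringsAsFactors = FALSE
)
trans <- as(split(df[, "item"], df[, "TID"]), "transactions")

# Replace labels and add one hierarchy level (order follows a, b, c)
itemInfo(trans) <- data.frame(
  labels = c("milk", "butter", "beer"),
  level1 = c("dairy", "dairy", "beverage"),
  stringsAsFactors = FALSE
)

# Item-level rules: they are reported with the new labels
rules <- apriori(
  trans,
  parameter = list(support = 0.3, confidence = 0.5, minlen = 2),
  control   = list(verbose = FALSE)
)
inspect(sort(rules, by = "confidence"))

# Category-level rules: aggregate items to level1 first, then mine again
trans_l1 <- aggregate(trans, factor(itemInfo(trans)[["level1"]]))
rules_l1 <- apriori(
  trans_l1,
  parameter = list(support = 0.3, confidence = 0.5, minlen = 2),
  control   = list(verbose = FALSE)
)
inspect(rules_l1)
```

    Because the labels live in itemInfo, the rules inherit them without any post-processing of the inspect() output.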
