R data.table subsetting on multiple conditions

Views: 95  Published: 2017/3/12 11:54:59  Tags: r, data.table


Problem description

With the data set below, how do I write a data.table call that subsets the table and returns all customer IDs (cid), along with every order for those customers, if the customer has ever purchased SKU 1?

The expected result is a table that excludes cid 3 and 5, who never bought SKU 1, and keeps every row for the customers who have at least one row with sku == 1.

I am stuck because I don't know how to write a "contains" statement: a literal == comparison returns only the rows whose sku matches the condition. I am sure there is a better way.

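library(data.table)

# Sample data: one row per customer (cid), order, and SKU
df <- data.frame(
    cid   = c(1,1,1,1,1,2,2,2,2,2,3,4,5,5,6,6),
    order = c(1,1,1,2,3,4,4,4,5,5,6,7,8,8,9,9),
    sku   = c(1,2,3,2,3,1,2,3,1,3,2,1,2,3,1,2))

# Convert to a data.table for the subsetting below
dt <- as.data.table(df)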
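
To make the difficulty concrete, here is a minimal sketch of the literal filter described above, rebuilt on the same sample data. The name `naive` is only illustrative and is not part of the original question.

```r
library(data.table)

# Same sample data as above, built directly as a data.table
dt <- data.table(
  cid   = c(1,1,1,1,1,2,2,2,2,2,3,4,5,5,6,6),
  order = c(1,1,1,2,3,4,4,4,5,5,6,7,8,8,9,9),
  sku   = c(1,2,3,2,3,1,2,3,1,3,2,1,2,3,1,2)
)

# Literal equality keeps only the rows where sku is exactly 1,
# so the other SKUs and orders of the same customers are dropped.
naive <- dt[sku == 1]
print(naive)
```

This returns only the SKU 1 rows, not the full order history of the customers who bought SKU 1, which is what the question is actually asking for.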

Recommended answer

This is similar to a previous answer, but here the subsetting is done in a more data.table-like manner.

First, let's take the cids that meet our condition:

match_cids <- dt[sku == 1, cid]

The %in% operator lets us keep only those rows whose cid appears in that vector. So, using the above:

dt[cid %in% match_cids]

Or on one line:

> dt[cid %in% dt[sku==1, cid]]
     cid order sku
  1:   1     1   1
  2:   1     1   2
  3:   1     1   3
  4:   1     2   2
  5:   1     3   3
  6:   2     4   1
  7:   2     4   2
  8:   2     4   3
  9:   2     5   1
 10:   2     5   3
 11:   4     7   1
 12:   6     9   1
 13:   6     9   2
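
As a cross-check, the %in% approach can be compared with a grouped filter. The grouped version is a different technique and is not part of the original answer; the names `by_in` and `by_group` are illustrative. A minimal sketch, assuming the same sample data as in the question:

```r
library(data.table)

# Same sample data as in the question
dt <- data.table(
  cid   = c(1,1,1,1,1,2,2,2,2,2,3,4,5,5,6,6),
  order = c(1,1,1,2,3,4,4,4,5,5,6,7,8,8,9,9),
  sku   = c(1,2,3,2,3,1,2,3,1,3,2,1,2,3,1,2)
)

# Approach from the answer: keep rows whose cid is among the SKU 1 buyers.
# unique() is optional; %in% ignores duplicates on its right-hand side.
by_in <- dt[cid %in% unique(dt[sku == 1, cid])]

# Alternative (not from the original answer): filter per customer.
# For each cid, return that customer's rows (.SD) only if any row has sku == 1;
# groups where the condition is FALSE yield NULL and are dropped.
by_group <- dt[, if (any(sku == 1)) .SD, by = cid]

print(by_in)
print(by_group)
```

On this data both forms keep the same customers. The %in% form reads more directly as "customers who are in the set of SKU 1 buyers", while the grouped form states the condition per customer, which may be easier to extend if the rule later depends on more than one column.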
