R中带有data.table的Vlookup [英] Vlookup with data.table in R

查看:123
本文介绍了R中带有data.table的Vlookup的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

首先,让我们从DataTable 1(DF1)开始:

First, let's start with DataTable 1 (DF1) :

          date id sales cost city  name
 1: 06/19/2016  1   149  101  MTL Bank1
 2: 06/20/2016  1   150  102  MTL Bank1
 3: 06/21/2016  1   151  104  MTL Bank1
 4: 06/22/2016  1   152  107  MTL Bank1
 5: 06/23/2016  1   155   99  MTL Bank1
 6: 06/19/2016  2    84   55   NY Bank2
 7: 06/20/2016  2    83   55   NY Bank2
 8: 06/21/2016  2    80   56   NY Bank2
 9: 06/22/2016  2    81   57   NY Bank2
10: 06/23/2016  2    97   58   NY Bank2

library(data.table)
DF1 <- data.table(c("06/19/2016", "06/20/2016", "06/21/2016", "06/22/2016", 
                    "06/23/2016", "06/19/2016", "06/20/2016", "06/21/2016",
                    "06/22/2016", "06/23/2016"),
                  c(1, 1, 1, 1, 1, 2, 2, 2, 2, 2),
                  c(149, 150, 151, 152, 155, 84, 83, 80, 81, 97),
                  c(101, 102, 104, 107, 99, 55, 55, 56, 57, 58),
                  c("MTL", "MTL", "MTL", "MTL", "MTL", "NY", "NY", 
                    "NY", "NY", "NY"))
colnames(DF1) <- c("date", "id", "sales", "cost", "city")

我想使用查找表添加上面显示的列name:

I want to add the column name shown above using the lookup table:

   id  name start_date   end_date status
1:  1 Bank1 06/19/2016 06/25/2016      0
2:  2 Bank2 06/27/2016 06/27/2017      0
3:  3 Bank3 06/22/2016 06/24/2017      1
4:  4 Bank3 06/23/2016 12/23/2016      1

lookup <- data.table(c(1, 2, 3, 4),
                     c("Bank1", "Bank2", "Bank3", "Bank3"),
                     c("06/19/2016", "06/27/2016", "06/22/2016", "06/23/2016"),
                     c("06/25/2016", "06/27/2017", "06/24/2017", "12/23/2016"),
                     c("0", "0", "1", "1"))
colnames(lookup) <- c("id", "name", "start_date", "end_date", "status")

在这种情况下,我将使用ID查找名称.当我尝试merge时,DF1中总是有包含NA的新行.

In that case, I would use the id to find the name. When I try merge, I always have new rows in DF1 that contains NA.

推荐答案

DF1<-merge(DF1, lookup[,.(id, name)], by='id', all.x=TRUE, all.y=FALSE)

我认为您在这里寻找合并命令,但是您错过了all.y = FALSE位.我们在这里所做的是将DF1与查找数据表合并,并告诉R包括x中的所有行,但是只有y中的行与x中的行匹配,其中x是DF1并且y是查找. lookup [,.((id,name)]]表示我们只希望列ID(与DF1匹配)和列"name".如果DF1中存在在查询中没有匹配行的行,由于all.x = TRUE,它们将显示为NA.

I think the merge command is what you are looking for here, but you were missing the all.y = FALSE bit. What we are doing here is merging DF1 with the lookup data table and we are telling R to include all of the rows in x, but only the rows in y that match up with the rows in x where x is DF1 and y is lookup. The lookup[,.(id, name)] means we only want the column id (to match with DF1) and the column 'name'. If there were rows in DF1 that did not have a matching row in lookup, they would show up as NA's because of the all.x=TRUE.

这篇关于R中带有data.table的Vlookup的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆