Subsetting and Merging from 2 Related Data Frames in R

Question

I have searched the archives to no avail for a solution to this problem, which involves subsetting 2 related data frames. One data frame is a key and the other is an annual list. I'd like to use the key to create a subset and an index. I have tried using the subset formulas, but my code does not meet my criteria. Here is the data:

players <- c('Albert Belle','Reggie Jackson', 'Reggie Jackson')
contract_start_season <- c(1999,1977,1982)
contract_end_season <- c(2003, 1981, 1985)
key <- data.frame (player = players, contract_start_season, contract_end_season)
player_data <- data.frame( season = c(seq(1975,1985),seq(1997,2003)), player = c(rep('Reggie Jackson',times=11),rep('Albert Belle', times=7)))

I want to use the key to subset the player data to the contract years: 1977 to 1981 and then 1982 to 1985 for Reggie Jackson, and 1999 to 2003 for Albert Belle. I'd also like to create an index so that, for example, Reggie Jackson's 1977 season would be year 1, 1978 would be year 2, etc.

The code I have tried without merging looks like this, and it isn't working:

player_data[player_data$season >= key$contract_start_season&player_data$season <= key$contract_end_season,]

I am also running into problems when merging, because Reggie Jackson has 2 different contracts and the merge tries to match both.

Any help or advice on this would be super appreciated.

Recommended answer

Are you trying to do something along the following lines?

library(data.table)

key <- data.table(key)
player_data <- data.table(player_data)

#Adding another column called season to help in the merge later
key[,season := contract_start_season]

# Index on which to merge
setkeyv(key, c("player","season"))
setkeyv(player_data, c("player","season"))

# roll = Inf carries the most recent earlier contract start forward to each
# season (a rolling join), instead of requiring an exact match on season
key[player_data, roll = Inf]

Output:

> key[player_data, roll = Inf]
            player season contract_start_season contract_end_season
 1:   Albert Belle   1997                    NA                  NA
 2:   Albert Belle   1998                    NA                  NA
 3:   Albert Belle   1999                  1999                2003
 4:   Albert Belle   2000                  1999                2003
 5:   Albert Belle   2001                  1999                2003
 6:   Albert Belle   2002                  1999                2003
 7:   Albert Belle   2003                  1999                2003
 8: Reggie Jackson   1975                    NA                  NA
 9: Reggie Jackson   1976                    NA                  NA
10: Reggie Jackson   1977                  1977                1981
11: Reggie Jackson   1978                  1977                1981
12: Reggie Jackson   1979                  1977                1981
13: Reggie Jackson   1980                  1977                1981
14: Reggie Jackson   1981                  1977                1981
15: Reggie Jackson   1982                  1982                1985
16: Reggie Jackson   1983                  1982                1985
17: Reggie Jackson   1984                  1982                1985
18: Reggie Jackson   1985                  1982                1985
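
The rolling join above still keeps the pre-contract seasons (the NA rows) and does not yet add the year index the question asks for. A minimal sketch of how those two remaining steps might look is below. It rebuilds the same data so that it runs on its own, uses an explicit `on =` instead of `setkeyv`, and the names `joined`, `in_contract` and `contract_year` are my own, not from the original answer. The check against `contract_end_season` is there because a rolling join would otherwise carry a contract forward past its end when no later contract follows.

```r
library(data.table)

# Same data as in the question, with seasons stored as integers throughout
key <- data.table(
  player = c("Albert Belle", "Reggie Jackson", "Reggie Jackson"),
  contract_start_season = c(1999L, 1977L, 1982L),
  contract_end_season = c(2003L, 1981L, 1985L)
)
player_data <- data.table(
  season = c(1975:1985, 1997:2003),
  player = c(rep("Reggie Jackson", 11), rep("Albert Belle", 7))
)

# Join column: each contract is looked up by its starting season
key[, season := contract_start_season]

# Rolling join: every season picks up the latest contract that started
# on or before it, for the same player
joined <- key[player_data, on = c("player", "season"), roll = Inf]

# Keep only seasons that fall inside the matched contract
in_contract <- joined[!is.na(contract_start_season) &
                        season <= contract_end_season]

# Index within each contract: first contract season is year 1
in_contract[, contract_year := season - contract_start_season + 1L]
setorder(in_contract, player, season)

print(in_contract)
```

With this data, `in_contract` should hold 14 rows, and the index restarts at 1 for Reggie Jackson's second contract in 1982.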
