How to rbind new rows from one data frame to an existing data frame in R


Question

I would like to know how to append new rows from one data frame, df2, to an existing data frame, df1, based on a unique value in each table. The existing data frame, df1, holds historical data, and each row has a unique value. I then pull data from the web and put it into a new data frame, df2. The new data frame also has a unique value in each row, which may or may not match a unique value in df1.

I would like to take all rows in df2 whose unique value does not exist in df1 and append those rows to df1. My initial thought was to use code similar to this:

ifelse(any(df1$unique_val == df2$unique_val), df1 <- df1, df1 <- rbind(df2, df1))

But then I realized that I need a row-by-row match rather than an "any" match. I know how I would do this in SQL with a UNION and a WHERE clause, but I'm not sure how to make it work in R. The only related material I could find was about appending all rows from two data frames or appending a new column to an existing data frame.
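The row-by-row match described here can be sketched in base R with `%in%`, which returns one logical value per row of df2 rather than a single TRUE/FALSE. This is a minimal sketch using the question's example data; the names `is_new` and `df1_updated` are introduced here for illustration only.

```r
# Example data, built the same way as in the question
df1 <- data.frame(numb = 1:6, rand = rep("Toaster", 6), stringsAsFactors = FALSE)
df1$unique_val <- paste0(df1$numb, df1$rand)

df2 <- data.frame(numb = 5:7, rand = c("Toaster", "Toaster", "Radio"),
                  stringsAsFactors = FALSE)
df2$unique_val <- paste0(df2$numb, df2$rand)

# any(df1$unique_val == df2$unique_val) compares values position by position
# (recycling the shorter vector) and collapses the result to one value,
# so it cannot select individual rows.
# %in% instead tests each key in df2 for membership in df1's keys.
is_new <- !(df2$unique_val %in% df1$unique_val)

# Append only the rows of df2 whose key is not already present in df1
df1_updated <- rbind(df1, df2[is_new, ])

# Reset row names so they run 1..n after the append
rownames(df1_updated) <- NULL
```

Because the filter is applied to df2 before binding, the existing rows of df1 are never modified or reordered.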

The following example shows what I am looking for and why I do not want to "join" these two data frames:

df1 <- data.frame(numb = 1:6, rand = rep("Toaster", 6))
df1$unique_val <- paste0(df1$numb, df1$rand)

> df1
  numb    rand unique_val
1    1 Toaster   1Toaster
2    2 Toaster   2Toaster
3    3 Toaster   3Toaster
4    4 Toaster   4Toaster
5    5 Toaster   5Toaster
6    6 Toaster   6Toaster

df2 <- data.frame(numb = 5:7, rand = c(rep("Toaster", 2), "Radio"))
df2$unique_val <- paste0(df2$numb, df2$rand)

> df2
  numb    rand unique_val
1    5 Toaster   5Toaster
2    6 Toaster   6Toaster
3    7   Radio     7Radio

As you can see, row 3 in df2 is the only new row (a row that does not have a matching unique_val in df1). I would like to add this new row to df1. Note: the new row is not always in the same position in df2.

I tried each of the joins from this post (merge/join data frames), as follows:

merge(df1, df2, by = "unique_val")

merge(df1, df2, by = "unique_val", all = TRUE)

merge(df1, df2, by = "unique_val", all.x = TRUE)

merge(df1, df2, by = "unique_val", all.y = TRUE)

I also tried anti_join from dplyr:

anti_join(df1, df2, by = "unique_val")
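One reason this call does not help: `anti_join(x, y)` keeps the rows of `x` that have no match in `y`, so with df1 first it returns the old rows that are absent from df2. A possible sketch, assuming dplyr is installed, swaps the arguments and then appends the result with `bind_rows`; the names `new_rows` and `df1_updated` are illustrative and not part of the original post.

```r
library(dplyr)

# Example data, built the same way as in the question
df1 <- data.frame(numb = 1:6, rand = rep("Toaster", 6), stringsAsFactors = FALSE)
df1$unique_val <- paste0(df1$numb, df1$rand)

df2 <- data.frame(numb = 5:7, rand = c("Toaster", "Toaster", "Radio"),
                  stringsAsFactors = FALSE)
df2$unique_val <- paste0(df2$numb, df2$rand)

# Rows of df2 whose unique_val does not appear in df1
new_rows <- anti_join(df2, df1, by = "unique_val")

# Append them to the historical data; the rows of df1 stay as they are
df1_updated <- bind_rows(df1, new_rows)
```

Here the join is only used as a filter on df2; the final step is still a row bind, which matches the goal stated in the question.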

rbind gives me the following:

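> rbind(df1, df2)
  numb    rand unique_val
1    1 Toaster   1Toaster
2    2 Toaster   2Toaster
3    3 Toaster   3Toaster
4    4 Toaster   4Toaster
5    5 Toaster   5Toaster
6    6 Toaster   6Toaster
7    5 Toaster   5Toaster
8    6 Toaster   6Toaster
9    7   Radio     7Radio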

None of these gives me the desired output, which is:

  numb    rand unique_val
1    1 Toaster   1Toaster
2    2 Toaster   2Toaster
3    3 Toaster   3Toaster
4    4 Toaster   4Toaster
5    5 Toaster   5Toaster
6    6 Toaster   6Toaster
7    7   Radio     7Radio

I'm looking to rbind these data frames, not join them.

Recommended answer

We can use rbindlist and unique from data.table. Place the data frames in a list, use rbindlist to bind them into a single data.table, and then call unique to keep only the unique rows. The data.table method for unique has a by argument that specifies which columns define uniqueness.

library(data.table)
unique(rbindlist(list(df1, df2)), by = "numb")
#   numb    rand unique_val
#1:    1 Toaster   1Toaster
#2:    2 Toaster   2Toaster
#3:    3 Toaster   3Toaster
#4:    4 Toaster   4Toaster
#5:    5 Toaster   5Toaster
#6:    6 Toaster   6Toaster
#7:    7   Radio     7Radio
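The call above deduplicates on numb, which happens to work for this sample because numb and unique_val coincide. As a hedged variation, the sketch below keys on unique_val instead, as the question describes, and shows a base R equivalent using `duplicated`. The names `combined_dt`, `combined`, and `combined_base` are illustrative only.

```r
library(data.table)

# Example data, built the same way as in the question
df1 <- data.frame(numb = 1:6, rand = rep("Toaster", 6), stringsAsFactors = FALSE)
df1$unique_val <- paste0(df1$numb, df1$rand)

df2 <- data.frame(numb = 5:7, rand = c("Toaster", "Toaster", "Radio"),
                  stringsAsFactors = FALSE)
df2$unique_val <- paste0(df2$numb, df2$rand)

# data.table: bind everything, then keep the first row for each unique_val.
# df1 comes first in the list, so its rows win over duplicates from df2.
combined_dt <- unique(rbindlist(list(df1, df2)), by = "unique_val")

# Base R equivalent: duplicated() flags the second and later occurrences
combined <- rbind(df1, df2)
combined_base <- combined[!duplicated(combined$unique_val), ]
```

Both versions bind first and deduplicate afterwards, so they also drop any duplicate keys that already exist within df1 itself; filtering df2 before binding avoids that side effect if it matters.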
