根据R中的条件合并2个数据帧 [英] Merge 2 dataframes based on condition in R
问题描述
我有以下两个要合并的数据框:
I have the following 2 data frames that I want to merge:
x <- data.frame(a= 1:11, b =3:13, c=2:12, d=7:17, invoice = 1:11)
x =
a b c d invoice
1 3 2 7 1
2 4 3 8 2
3 5 4 9 3
4 6 5 10 4
5 7 6 11 5
6 8 7 12 6
7 9 8 13 7
8 10 9 14 8
9 11 10 15 9
10 12 11 16 10
11 13 12 17 11
y <- data.frame(nr = 100:125, invoice = 1)
y$invoice[12:26] <- 2
> y
nr invoice
100 1
101 1
102 1
103 1
104 1
105 1
106 1
107 1
108 1
109 1
110 1
111 2
112 2
113 2
114 2
115 2
116 2
117 2
我想在发票编号相同时将数据框X中的字母与数据框Y中的字母合并.它应该首先合并字母A中的值,然后再合并B ect中的值.这应该一直发生,直到发票编号不再相同为止,然后从发票编号2中选择编号.
I want to merge the letters from dataframe X with dataframe Y when the invoice number is the same. It should start with merging the value from letter A, then B ect. This should be happening until the invoice number is not the same anymore and then choose the numbers from invoice nr 2.
输出应如下所示:
> output
nr invoice letter_count
100 1 1
101 1 3
102 1 2
103 1 7
104 1 1
105 1 3
106 1 2
107 1 7
108 1 1
109 1 2
110 1 7
111 2 2
112 2 4
113 2 3
114 2 8
115 2 2
116 2 4
我尝试将merge
函数与by
参数一起使用,但是这会产生一个错误,即行数不相同.任何帮助,我将不胜感激.
I tried to use the merge
function with the by
argument but this created an error that the number of rows is not the same. Any help I will appreciate.
推荐答案
以下是使用purrr
软件包的解决方案.
Here is a solution using the purrr
package.
# Prepare the data frames
x <- data.frame(a = 1:11, b = 3:13, c = 2:12, d = 7:17, invoice = 1:11)
y <- data.frame(nr = 100:125, invoice = 1)
y$invoice[12:26] <- 2
# Load package
library(purrr)
# Split the data based on invoice
y_list <- split(y, f = y$invoice)
# Design a function to transfer data
trans_fun <- function(main_df, letter_df = x){
# Get the invoice number
temp_num<- unique(main_df$invoice)
# Extract letter_count information from x
add_vec <- unlist(letter_df[letter_df$invoice == temp_num, 1:4])
# Get the remainder of nrow(main_df) and length(add_vec)
reamin_num <- nrow(main_df) %% length(add_vec)
# Get the multiple of nrow(main_df) and length(add_vec)
multiple_num <- nrow(main_df) %/% length(add_vec)
# Create the entire sequence to add
add_seq <- rep(add_vec, multiple_num + 1)
add_seq2 <- add_seq[1:(length(add_seq) - (length(add_vec) - reamin_num))]
# Add new column, add_seq2, to main_df
main_df$letter_count <- add_seq2
return(main_df)
}
# Apply the trans_fun function using map_df
output <- map_df(y_list, .f = trans_fun)
# See the result
output
nr invoice letter_count
1 100 1 1
2 101 1 3
3 102 1 2
4 103 1 7
5 104 1 1
6 105 1 3
7 106 1 2
8 107 1 7
9 108 1 1
10 109 1 3
11 110 1 2
12 111 2 2
13 112 2 4
14 113 2 3
15 114 2 8
16 115 2 2
17 116 2 4
18 117 2 3
19 118 2 8
20 119 2 2
21 120 2 4
22 121 2 3
23 122 2 8
24 123 2 2
25 124 2 4
26 125 2 3
这篇关于根据R中的条件合并2个数据帧的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!