Self Joining in R
Problem description
Here is an example tibble:
library(tibble)
test <- tibble(a = c("dd1", "dd2", "dd3", "dd4", "dd5"),
               name = c("a", "b", "c", "d", "e"),
               b = c("dd3", "dd4", "dd1", "dd5", "dd2"))
I want to add a new column b_name by self-joining test, using:
dplyr::inner_join(test, test, by = c("a" = "b"))
My table is far too large (2.7M rows with 4 columns), and I get the following error:
Error: std::bad_alloc
Please advise on the right way to do this, or on best practice.
My final goal is to get the following structure:
a name b b_name
dd1 a dd3 c
dd2 b dd4 d
dd3 c dd1 a
dd4 d dd5 e
dd5 e dd2 b
Recommended answer
Another option is fastmatch:
library(fastmatch)
test$b_name <- with(test, name[fmatch(b, a)])
test$b_name
#[1] "c" "d" "a" "e" "b"
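Since fmatch follows the interface of base R's match(), the same lookup can be tried without the extra package. The following is a minimal sketch on the example data, not a benchmark. The names idx and unmatched are introduced here for illustration only.

```r
# Rebuild the example data with base R only (no tibble needed for this sketch).
test <- data.frame(
  a    = c("dd1", "dd2", "dd3", "dd4", "dd5"),
  name = c("a", "b", "c", "d", "e"),
  b    = c("dd3", "dd4", "dd1", "dd5", "dd2"),
  stringsAsFactors = FALSE
)

# match(b, a) gives, for each value of b, the position of its first
# occurrence in a (NA when the value is absent). Indexing name by that
# position looks up the name of the row that b points to.
idx <- match(test$b, test$a)
test$b_name <- test$name[idx]

# Illustrative check: rows whose b has no counterpart in a appear as NA.
unmatched <- sum(is.na(idx))
```

Because this approach only builds one index vector and one new column, it never creates the intermediate joined table, which is likely why it stays within memory where the join did not.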
According to ?fmatch:
Description
fmatch is a faster version of the built-in match() function.
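If staying within dplyr is preferable, one possible sketch is to join against a narrow lookup table instead of the full table. This assumes (the question does not confirm it) that the memory error comes from duplicated key values multiplying rows in the join. The names lookup and result are hypothetical.

```r
library(dplyr)
library(tibble)

test <- tibble(a = c("dd1", "dd2", "dd3", "dd4", "dd5"),
               name = c("a", "b", "c", "d", "e"),
               b = c("dd3", "dd4", "dd1", "dd5", "dd2"))

# Two-column lookup: key a is renamed to b so it joins onto test$b,
# and name is renamed to b_name. distinct() keeps one row per key,
# so the join cannot multiply rows (assumption: the first name per key
# is the one wanted, which mirrors what match() would return).
lookup <- test %>%
  select(b = a, b_name = name) %>%
  distinct(b, .keep_all = TRUE)

# left_join keeps every row of test, in order, and adds b_name.
result <- left_join(test, lookup, by = "b")
```

On the real 2.7M-row table, comparing nrow(lookup) with nrow(test) would show whether the keys in a are actually unique.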