R: Union two data frames with different column names and no match
I want to combine two distinct data frames (with completely different columns) in R into one inclusive data frame.
Let's say data frame "df_a" has columns A & B:
df_a <- read.table(header=TRUE, text='
A B
1 x1
2 y1
3 z1
')
And data frame "df_b" has columns C & D:
df_b <- read.table(header=TRUE, text='
C D
1 6.7
1 4.5
1 3.7
2 3.3
2 4.1
2 5.2
')
The resulting data frame "df_c" should therefore have columns A, B, C, D, as shown below:
df_c
A B C D
1 1 x1 1 6.7
2 2 y1 1 4.5
3 3 z1 1 3.7
4 NA NA 2 3.3
5 NA NA 2 4.1
6 NA NA 2 5.2
Approach #1:
I first tried rbind(), but that function requires matching column names, which is not what I'm looking for.
Approach #2:
I used df_c <- merge(df_a,df_b), but merge seems to produce a Cartesian product, see below:
df_c <- merge(df_a,df_b)
df_c
A B C D
1 1 x1 1 6.7
2 2 y1 1 6.7
3 3 z1 1 6.7
4 1 x1 1 4.5
5 2 y1 1 4.5
6 3 z1 1 4.5
7 1 x1 1 3.7
8 2 y1 1 3.7
9 3 z1 1 3.7
10 1 x1 2 3.3
11 2 y1 2 3.3
12 3 z1 2 3.3
13 1 x1 2 4.1
14 2 y1 2 4.1
15 3 z1 2 4.1
16 1 x1 2 5.2
17 2 y1 2 5.2
18 3 z1 2 5.2
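For context, merge() joins by default on the columns the two data frames have in common. When there are none, the by argument ends up empty and, per the merge documentation, the result is the Cartesian product. A small check of that, re-creating the example data so it runs on its own (the names shared and crossed are introduced here for illustration):

```r
df_a <- read.table(header = TRUE, text = '
A B
1 x1
2 y1
3 z1
')
df_b <- read.table(header = TRUE, text = '
C D
1 6.7
1 4.5
1 3.7
2 3.3
2 4.1
2 5.2
')

# Columns the two data frames share; merge() uses these as the default key
shared <- intersect(names(df_a), names(df_b))
length(shared)

# With no shared columns, every row of df_a is paired with every row of df_b
crossed <- merge(df_a, df_b)
nrow(crossed) == nrow(df_a) * nrow(df_b)
```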
Approach #3:
Then I used df_c <- union(df_a,df_b), but the result is no longer a data frame. It turned into a list of vectors, see below:
[[1]]
[1] 1 2 3
[[2]]
[1] x1 y1 z1
Levels: x1 y1 z1
[[3]]
[1] 1 1 1 2 2 2
[[4]]
[1] 6.7 4.5 3.7 3.3 4.1 5.2
Approach #4
I created my own function called unionNoMatch(), which attempts to append the columns of df_2 to df_1 (the input parameters):
unionNoMatch <- function(df_1, df_2)
{
df_3 <- df_1;
for (name in names(df_2))
{
cbind(df_2$name,df_3)
}
return (df_3);
}
df_c <- unionNoMatch (df_a,df_b)
However, I got this error:
Error in data.frame(..., check.names = FALSE) : arguments imply differing number of rows: 0, 3
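The message is consistent with two details in the function above: df_2$name looks for a column literally called "name" rather than using the loop variable, so it returns NULL, and the result of cbind() is never assigned back to df_3. A minimal sketch of a corrected version follows, assuming the goal is to pad the shorter data frame with NA rows and then bind the columns side by side. The inner helper pad is a name introduced here for illustration.

```r
df_a <- read.table(header = TRUE, text = '
A B
1 x1
2 y1
3 z1
')
df_b <- read.table(header = TRUE, text = '
C D
1 6.7
1 4.5
1 3.7
2 3.3
2 4.1
2 5.2
')

unionNoMatch <- function(df_1, df_2) {
  n <- max(nrow(df_1), nrow(df_2))
  # Hypothetical helper: indexing past the last row yields all-NA rows
  pad <- function(df) {
    out <- df[seq_len(n), , drop = FALSE]
    rownames(out) <- NULL
    out
  }
  # Both pieces now have n rows, so cbind() no longer complains
  cbind(pad(df_1), pad(df_2))
}

df_c <- unionNoMatch(df_a, df_b)
df_c
```

Note that this pairs rows purely by position, so it only makes sense when the row order of both inputs is meaningful.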
How can I combine two data frames with non-matching columns into a single data frame?
Thanks
It seems like you're trying to do something that's probably not recommended, but here's what I'd do in data.table:
library(data.table) # 1.9.5+ to get the on argument to [.data.table
# keep.rownames = TRUE stores the row names in a column called "rn"
setDT(df_a, keep.rownames = TRUE); setDT(df_b, keep.rownames = TRUE)
> df_a[df_b, on = "rn"]
rn A B C D
1: 1 1 x1 1 6.7
2: 2 2 y1 1 4.5
3: 3 3 z1 1 3.7
4: 4 NA NA 2 3.3
5: 5 NA NA 2 4.1
6: 6 NA NA 2 5.2
(basically, we find something to merge on, namely the row number, then merge on that)
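The same idea can also be expressed in base R, since merge() accepts by = "row.names" to join on row names. A minimal sketch, assuming both data frames still carry their default row names (that is, before the setDT() calls above, which convert them in place). The intermediate name merged is introduced here for illustration.

```r
df_a <- read.table(header = TRUE, text = '
A B
1 x1
2 y1
3 z1
')
df_b <- read.table(header = TRUE, text = '
C D
1 6.7
1 4.5
1 3.7
2 3.3
2 4.1
2 5.2
')

# all = TRUE keeps rows that exist in only one of the two data frames
merged <- merge(df_a, df_b, by = "row.names", all = TRUE)

# Row names are compared as text, so restore numeric order explicitly
merged <- merged[order(as.integer(as.character(merged$Row.names))), ]

# Drop the helper key column that merge() added
df_c <- merged[, c("A", "B", "C", "D")]
rownames(df_c) <- NULL
df_c
```

The explicit reordering matters once there are ten or more rows, because as text "10" sorts before "2".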