R programming: Combining Two Data Frames


Question

Folks,

I would like to concatenate (or merge, if you will) two data frames, df1 and df2. My goal is simply to make a new data frame whose columns are the union of those of df1 and df2.

Example

product=c("p1","p1","p1","p1","p1","p1","p1","p1","p2","p2","p2","p2","p2","p2","p2","p2","p3","p3","p3","p3","p3","p3","p3","p3","p4","p4","p4","p4","p4","p4","p4","p4")
skew=c("b","b","b","b","a","a","a","a","b","b","b","b","a","a","a","a","b","b","b","b","a","a","a","a","b","b","b","b","a","a","a","a")
version=c(0.1,0.1,0.2,0.2,0.1,0.1,0.2,0.2,0.1,0.1,0.2,0.2,0.1,0.1,0.2,0.2,0.1,0.1,0.2,0.2,0.1,0.1,0.2,0.2,0.1,0.1,0.2,0.2,0.1,0.1,0.2,0.2)
color=c("C1","C2","C1","C2","C1","C2","C1","C2","C1","C2","C1","C2","C1","C2","C1","C2","C1","C2","C1","C2","C1","C2","C1","C2","C1","C2","C1","C2","C1","C2","C1","C2")
price=c(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32)

df1 = data.frame(product, skew, version)
df2 = data.frame(product, skew, color, price)

I would like to get the result shown at the end of this post.

I have tried a few options:

#option 1 with cbind
df <- cbind(df1,df2)

This returns a data frame with duplicated columns "product" and "skew".

# Option 2, use data.frame
df <- data.frame(df1,df2)

This gave me pretty much what I want, except that it had extra columns for "product" and "skew". They are suffixed with ".1" though, so there are no duplicate column names.

# option 3, use merge which seems to be the way to go
df <- merge(df1,df2) 

I think I am missing something with merge, because this actually paired up every matching row of the two data sets, producing a total of 128 observations from the 32 provided. I guess that's how merge works. I have run "?merge" and tried a few options but could not get it to spit out what I want.

So my question is:

What is the best way to get my desired data frame out of df1 and df2 as defined above?

Thx in advance for your help ! Riad.

     product skew  version color price
1       p1    b     0.1    C1     1
2       p1    b     0.1    C2     2
3       p1    b     0.2    C1     3
4       p1    b     0.2    C2     4
5       p1    a     0.1    C1     5
6       p1    a     0.1    C2     6
7       p1    a     0.2    C1     7
8       p1    a     0.2    C2     8
9       p2    b     0.1    C1     9
10      p2    b     0.1    C2    10
11      p2    b     0.2    C1    11
12      p2    b     0.2    C2    12
13      p2    a     0.1    C1    13
14      p2    a     0.1    C2    14
15      p2    a     0.2    C1    15
16      p2    a     0.2    C2    16
17      p3    b     0.1    C1    17
18      p3    b     0.1    C2    18
19      p3    b     0.2    C1    19
20      p3    b     0.2    C2    20
21      p3    a     0.1    C1    21
22      p3    a     0.1    C2    22
23      p3    a     0.2    C1    23
24      p3    a     0.2    C2    24
25      p4    b     0.1    C1    25
26      p4    b     0.1    C2    26
27      p4    b     0.2    C1    27
28      p4    b     0.2    C2    28
29      p4    a     0.1    C1    29
30      p4    a     0.1    C2    30
31      p4    a     0.2    C1    31
32      p4    a     0.2    C2    32

Solution

merge() does not work the way you want because your columns "product" and "skew" are not unique identifiers: each combination occurs multiple times, so merge() returns every possible pairing of the matching rows. You can either include a third column as an id:

product=c("p1","p1","p1","p1","p1","p1","p1","p1","p2","p2","p2","p2","p2","p2","p2","p2","p3","p3","p3","p3","p3","p3","p3","p3","p4","p4","p4","p4","p4","p4","p4","p4")
skew=c("b","b","b","b","a","a","a","a","b","b","b","b","a","a","a","a","b","b","b","b","a","a","a","a","b","b","b","b","a","a","a","a")
version=c(0.1,0.1,0.2,0.2,0.1,0.1,0.2,0.2,0.1,0.1,0.2,0.2,0.1,0.1,0.2,0.2,0.1,0.1,0.2,0.2,0.1,0.1,0.2,0.2,0.1,0.1,0.2,0.2,0.1,0.1,0.2,0.2)
color=c("C1","C2","C1","C2","C1","C2","C1","C2","C1","C2","C1","C2","C1","C2","C1","C2","C1","C2","C1","C2","C1","C2","C1","C2","C1","C2","C1","C2","C1","C2","C1","C2")
price=c(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32)
id = 1:32

df1 = data.frame(product, skew, id, version)
df2 = data.frame(product, skew, id, color, price)
merge(df1, df2)
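
To see why the id column matters, here is a small sketch that compares the row counts with and without it. The data are rebuilt with rep() only to keep the example short (the values are the same as above); the names no_id and with_id are just illustrative.

```r
# Same example data as above, written compactly with rep().
product <- rep(c("p1", "p2", "p3", "p4"), each = 8)
skew    <- rep(rep(c("b", "a"), each = 4), times = 4)
version <- rep(rep(c(0.1, 0.2), each = 2), times = 8)
color   <- rep(c("C1", "C2"), times = 16)
price   <- 1:32
id      <- 1:32

# Without an id, merge() joins on the shared columns product and skew.
# Each (product, skew) pair occurs 4 times in both frames,
# so the result has 8 pairs * 4 * 4 = 128 rows.
no_id <- merge(data.frame(product, skew, version),
               data.frame(product, skew, color, price))
nrow(no_id)    # 128

# With id as an additional shared column, every key is unique: 32 rows.
with_id <- merge(data.frame(product, skew, id, version),
                 data.frame(product, skew, id, color, price))
nrow(with_id)  # 32

# merge() sorts on the key columns by default; reorder by id
# to get the original row order back.
with_id <- with_id[order(with_id$id), ]
```

Note that merge() sorts its result on the key columns by default, which is why the last line reorders by id.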

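Or you can combine your data frames manually:

# select the extra columns by name, so this works with or without the id column
cbind(df1, df2[, c("color", "price")])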
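
The manual approach works only because both frames list their rows in the same order, since cbind() pairs rows purely by position. The sketch below first checks that assumption, and then shows one possible alternative when no ready-made id exists: a within-group counter built with ave(). The column name n is hypothetical, and the counter still assumes that the k-th occurrence of a (product, skew) pair in df1 belongs with the k-th occurrence in df2.

```r
# Original frames from the question (no id column), built compactly.
product <- rep(c("p1", "p2", "p3", "p4"), each = 8)
skew    <- rep(rep(c("b", "a"), each = 4), times = 4)
version <- rep(rep(c(0.1, 0.2), each = 2), times = 8)
color   <- rep(c("C1", "C2"), times = 16)
price   <- 1:32

df1 <- data.frame(product, skew, version)
df2 <- data.frame(product, skew, color, price)

# cbind() matches rows by position, so confirm the shared columns agree.
stopifnot(identical(df1$product, df2$product),
          identical(df1$skew, df2$skew))
combined <- cbind(df1, df2[, c("color", "price")])

# Alternative: number the occurrences within each (product, skew) pair
# and let merge() use that counter as part of the key.
# `n` is a hypothetical helper column, not part of the original data.
df1$n <- ave(seq_len(nrow(df1)), df1$product, df1$skew, FUN = seq_along)
df2$n <- ave(seq_len(nrow(df2)), df2$product, df2$skew, FUN = seq_along)
result <- merge(df1, df2)  # joins on product, skew and n
nrow(result)               # 32
```

If the two frames might be ordered differently, neither variant is safe, and a real identifier shared by both frames is needed.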
