Mixed Merge in R - Subscript solution?

Problem description

Note: I changed the example after first posting. My first example was too simplified to capture the real problem.

I have two data frames whose shared column is sorted differently. I want to match on that column and then merge in the value from the second data frame. Within each state, the values from the second data frame need to stay in their original order.

So I have this:

state<-c("IA","IA","IA","IL","IL","IL")
value1<-c(1,2,3,4,5,6)
s1<-data.frame(state,value1)
state<-c("IL","IL","IL","IA","IA","IA")
value2<-c(3,4,5,6,7,8)
s2<-data.frame(state,value2)

s1
s2

which returns the following:

> s1
  state value1
1    IA      1
2    IA      2
3    IA      3
4    IL      4
5    IL      5
6    IL      6
> s2
  state value2
1    IL      3
2    IL      4
3    IL      5
4    IA      6
5    IA      7
6    IA      8

I want this:

  state value1 value2
1    IA      1      6
2    IA      2      7
3    IA      3      8
4    IL      4      3
5    IL      5      4
6    IL      6      5

I'm about to drive myself silly trying to solve this. It seems like it should be a simple subscript problem.

Recommended answer

There are several ways to do this (it is R, after all), but I think the clearest is to create an index. We need a function that creates a sequential index (starting at one and ending with the number of observations).

> seq_len(3)
[1] 1 2 3

But we need to calculate this index within each level of the grouping variable (state). For this we can use R's ave function. It takes a numeric vector as the first argument, then the grouping factors, and finally the function to be applied within each group.

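# ave() passes each group's values (a vector) to FUN, so use seq_along();
# seq_len() expects a single length and would only use the first element.
s1$index <- with(s1,ave(value1,state,FUN=seq_along))
s2$index <- with(s2,ave(value2,state,FUN=seq_along))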

(Note the use of with, which tells R to look up the variables inside the data frame. This is better practice than writing s1$value1, s2$value2, etc.)
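To make the per-group numbering concrete, here is a minimal sketch on a small made-up data frame (the name demo and its values are hypothetical, not from the question). It assumes seq_along as the function handed to ave, since ave passes each group's values as a vector, and it shows that the index follows row order within each state even when the states are interleaved.

```r
# Hypothetical data: the two states are interleaved rather than sorted.
demo <- data.frame(state = c("IA", "IL", "IA", "IL"),
                   value = c(10, 20, 30, 40))

# ave() splits value by state, applies FUN to each piece, and puts the
# results back in the original row positions.
demo$index <- with(demo, ave(value, state, FUN = seq_along))

print(demo)
```

Each row gets its position within its own state, so the first IA row and the first IL row both receive index 1.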

Now we can simply merge (join) the two data frames. By default, merge joins on the variables present in both data frames, here state and index.

merge(s1,s2)

which gives

  state index value1 value2
1    IA     1      1      6
2    IA     2      2      7
3    IA     3      3      8
4    IL     1      4      3
5    IL     2      5      4
6    IL     3      6      5
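The merged result still carries the helper index column, while the question asked for only state, value1 and value2. The following sketch, which rebuilds the question's data so it runs on its own, names the join keys explicitly through merge's by argument and then drops the helper column. The names merged and result are introduced here for illustration, and seq_along is assumed as the per-group function.

```r
# Data from the question.
s1 <- data.frame(state = c("IA", "IA", "IA", "IL", "IL", "IL"),
                 value1 = c(1, 2, 3, 4, 5, 6))
s2 <- data.frame(state = c("IL", "IL", "IL", "IA", "IA", "IA"),
                 value2 = c(3, 4, 5, 6, 7, 8))

# Number the rows within each state.
s1$index <- with(s1, ave(value1, state, FUN = seq_along))
s2$index <- with(s2, ave(value2, state, FUN = seq_along))

# Spell out the join keys instead of relying on the shared-column default.
merged <- merge(s1, s2, by = c("state", "index"))

# Keep only the columns the question asked for.
result <- merged[, c("state", "value1", "value2")]
print(result)
```

Naming the keys in by guards against an accidental join on some other column that happens to appear in both data frames.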

For this to work, each state must have the same number of observations in both data frames.
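One way to check that requirement before merging, and to see what happens when it fails, is sketched below. The variant s2_short (the question's s2 with its last IA row removed) and the names same_counts, inner and outer are made up for this illustration. By default merge keeps only rows that match in both data frames, so a missing observation silently drops a row; all = TRUE keeps it and fills the gap with NA.

```r
s1 <- data.frame(state = c("IA", "IA", "IA", "IL", "IL", "IL"),
                 value1 = c(1, 2, 3, 4, 5, 6))
# Hypothetical variant of s2 with one IA observation missing.
s2_short <- data.frame(state = c("IL", "IL", "IL", "IA", "IA"),
                       value2 = c(3, 4, 5, 6, 7))

# Compare the per-state row counts of the two data frames.
counts1 <- table(s1$state)
counts2 <- table(s2_short$state)
same_counts <- identical(names(counts1), names(counts2)) &&
  all(counts1 == counts2)
print(same_counts)

s1$index <- with(s1, ave(value1, state, FUN = seq_along))
s2_short$index <- with(s2_short, ave(value2, state, FUN = seq_along))

# Default merge: the unmatched third IA row of s1 is dropped.
inner <- merge(s1, s2_short)
# all = TRUE: the row is kept and value2 is NA for it.
outer <- merge(s1, s2_short, all = TRUE)
print(nrow(inner))
print(outer)
```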
