Using ifelse() to replace NAs in one data frame by referencing another data frame of different length

Question

I already reviewed the following two posts and think they might answer my question, although I'm struggling to see how:

1) Conditionally replacing values in a data.frame
2) Creating a function that replaces NAs in one data frame with values from another data frame

With that said, I'm trying to replace NAs in one data frame by referencing another data frame of a different (shorter) length and pulling in replacement values from column "B" where the values for column "A" in each data frame match.

I've modified the data, below, for simplicity and illustration, although the concept is the same in the actual data. FYI, in the real second data frame, there are also no duplicates in column "A".

Here's the first data frame (df1):

> df1
    B          C  A
1  NA 2012-10-01  0
2  NA 2012-10-01  5
3   4 2012-10-01 10
4  NA 2012-10-01 15
5  NA 2012-10-01 20
6  20 2012-10-01 25
7  NA 2012-10-01  0
8  NA 2012-10-01  5
9   5 2012-10-01 10
10  5 2012-10-01 15

> str(df1)
'data.frame':   10 obs. of  3 variables:
 $ B: num  NA NA 4 NA NA 20 NA NA 5 5
 $ C: Factor w/ 1 level "2012-10-01": 1 1 1 1 1 1 1 1 1 1
 $ A: num  0 5 10 15 20 25 0 5 10 15

And the second data frame (df2).

> df2
   A         B
1  0 1.7169811
2  5 0.3396226
3 10 0.1320755
4 15 0.1509434
5 20 0.0754717
6 25 2.0943396

> str(df2)
'data.frame':   6 obs. of  2 variables:
 $ A: int  0 5 10 15 20 25
 $ B: num  1.717 0.3396 0.1321 0.1509 0.0755 ...

I think I'm pretty close with the following code:

> ifelse(is.na(df1$B) == TRUE, df2$B[df2$A == df1$A], df1$B)
 [1]  1.7169811  0.3396226  4.0000000  0.1509434  0.0754717 20.0000000         NA         NA
 [9]  5.0000000  5.0000000
Warning message:
In df2$A == df1$A :
  longer object length is not a multiple of shorter object length

Obviously, I want the 7th and 8th output elements to be 1.7169811 and 0.3396226, rather than NAs . . .

Thanks, in advance, for any help, and, once again, thanks for your patience!

Recommended answer

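Try the following code, which takes your original statement and makes a small tweak to the TRUE argument of the ifelse function. Be aware that df2$B[df2$A %in% df1$A] returns the matching df2 values in df2's own row order, and ifelse() then recycles them to the length of df1, so the result is correct for this data only because the values of df1$A repeat in the same order as df2$A: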

> df1$B <- ifelse(is.na(df1$B) == TRUE, df2$B[df2$A %in% df1$A], df1$B)   
#                         Switched '==' to '%in%' ---^
> df1
            B          C  A
1   1.7169811 2012-10-01  0
2   0.3396226 2012-10-01  5
3   4.0000000 2012-10-01 10
4   0.1509434 2012-10-01 15
5   0.0754717 2012-10-01 20
6  20.0000000 2012-10-01 25
7   1.7169811 2012-10-01  0
8   0.3396226 2012-10-01  5
9   5.0000000 2012-10-01 10
10  5.0000000 2012-10-01 15
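
If the rows of df1 were in a different order, the recycled `%in%` lookup would no longer line up. One way to make the lookup independent of row order is `match()`, which returns, for each value of df1$A, the position of the same value in df2$A. The following is a minimal sketch under that assumption, not part of the original answer; the names `idx`, `filled`, `shuffled`, and `filled_shuffled` are made up for illustration, and the data frames are rebuilt from the printouts in the question.

```r
# Rebuild the example data shown in the question.
df1 <- data.frame(
  B = c(NA, NA, 4, NA, NA, 20, NA, NA, 5, 5),
  C = factor(rep("2012-10-01", 10)),
  A = c(0, 5, 10, 15, 20, 25, 0, 5, 10, 15)
)
df2 <- data.frame(
  A = c(0L, 5L, 10L, 15L, 20L, 25L),
  B = c(1.7169811, 0.3396226, 0.1320755, 0.1509434, 0.0754717, 2.0943396)
)

# For each row of df1, find the row of df2 with the same key in column A.
# match() gives NA where a key has no counterpart in df2.
idx <- match(df1$A, df2$A)

# Replace only the missing values of B; existing values are kept.
filled <- ifelse(is.na(df1$B), df2$B[idx], df1$B)

# The same lookup on a reordered subset of df1 (hypothetical row order),
# to show that the result does not depend on how the rows are arranged.
shuffled <- df1[c(8, 3, 1, 10, 5), ]
filled_shuffled <- ifelse(
  is.na(shuffled$B),
  df2$B[match(shuffled$A, df2$A)],
  shuffled$B
)
```

Assigning `filled` back with `df1$B <- filled` gives the same data frame as above. Because column A in df2 has no duplicates, each key maps to exactly one replacement value; any key missing from df2 would simply stay NA.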
