Tidyverse: Replacing NAs with latest non-NA values using tidyverse tools

Views: 86 | Posted: 2021/5/2 20:57:16 | Tags: r dplyr tidyverse coalesce


Problem description

My question has been answered before using zoo:: and data.table::. I'm curious what the best solution with tidyverse/dplyr would be.

Previous answers (non-tidyverse): "Forward and backward fill data frame in R" and "Replacing NAs with latest non-NA value".

My data looks like this, where the earliest two years (2015, 2016) in each country (usa, aus) have missing data (code for data input at the bottom):

#>   country year value
#> 1     usa 2015    NA
#> 2     usa 2016    NA
#> 3     usa 2017   100
#> 4     usa 2018    NA
#> 5     aus 2015    NA
#> 6     aus 2016    NA
#> 7     aus 2017    50
#> 8     aus 2018    60

I would like to fill the missing values, within each country, with the value available in 2017.

The fill should apply only to years before 2017, so an NA in 2018 should not be filled in by anything. It should remain NA.

So my desired output is:

#>   country year value
#> 1     usa 2015   100
#> 2     usa 2016   100
#> 3     usa 2017   100
#> 4     usa 2018    NA
#> 5     aus 2015    50
#> 6     aus 2016    50
#> 7     aus 2017    50
#> 8     aus 2018    60

I tried group_by(country), and I suspect I'm meant to use coalesce() next, but I normally use coalesce across vectors, not along them.

library(tidyverse)
df %>% group_by(country) %>%
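
For context, here is a minimal sketch of what "across vectors" means for coalesce(): it takes the first non-missing value at each position across its arguments. The vectors a and b are hypothetical and invented only for this illustration; they are not part of the question's data.

```r
library(dplyr)

# Hypothetical vectors, invented for illustration only
a <- c(1L, NA, 3L, NA)
b <- c(10L, 20L, NA, NA)

# At each position, take the first non-missing value among a and b
across_vectors <- coalesce(a, b)
across_vectors

# A length-1 second argument is recycled, so every NA gets the same value
with_constant <- coalesce(a, 0L)
with_constant
```

coalesce() never looks at neighbouring positions within a single vector, which is why it does not fill "along" a column on its own.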

What's the easiest way to do this using tidyverse tools?

#install.packages("datapasta")
df <- data.frame(
  stringsAsFactors = FALSE,
           country = c("usa", "usa", "usa", "usa", "aus", "aus", "aus", "aus"),
              year = c(2015L, 2016L, 2017L, 2018L, 2015L, 2016L, 2017L, 2018L),
             value = c(NA, NA, 100L, NA, NA, NA, 50L, 60L)
)
df

Recommended answer

We can replace the NAs before 2017 with the value available in 2017 for each country.

library(dplyr)

df %>% 
  group_by(country) %>% 
  mutate(value = replace(value, is.na(value) & year < 2017, value[year == 2017]))
  #Similarly with ifelse
  #mutate(value = ifelse(is.na(value) & year < 2017, value[year == 2017], value))

#  country  year value
#  <chr>   <int> <int>
#1 usa      2015   100
#2 usa      2016   100
#3 usa      2017   100
#4 usa      2018    NA
#5 aus      2015    50
#6 aus      2016    50
#7 aus      2017    50
#8 aus      2018    60
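
Another possible approach, not part of the answer above, is tidyr's fill() with .direction = "up", which copies the next non-missing value upward within each group. The sketch below assumes the rows are already sorted by year within each country, as they are in the question's data. The name filled is introduced here only for illustration.

```r
library(dplyr)
library(tidyr)

# Same data as in the question
df <- data.frame(
  country = c("usa", "usa", "usa", "usa", "aus", "aus", "aus", "aus"),
  year = c(2015L, 2016L, 2017L, 2018L, 2015L, 2016L, 2017L, 2018L),
  value = c(NA, NA, 100L, NA, NA, NA, 50L, 60L)
)

filled <- df %>%
  group_by(country) %>%
  # "up" fills each NA with the next non-missing value below it in the group
  fill(value, .direction = "up") %>%
  ungroup()

filled
```

This is a looser rule than the one in the answer: it fills any NA that has a later non-missing value in the same country, not only NAs before 2017. On this data the two agree, because the 2018 NA for usa has nothing after it to copy from, but on data with gaps in other years the results could differ.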
