Using approx in dplyr
Question
I'm trying to do a linear approximation for each id in the data frame between year using point x. dplyr seems like an appropriate option for this, but I can't get it to work because of an error:
Error: incompatible size (9), expecting 3 (the group size) or 1
Sample code:
library(dplyr)
dat <- data.frame(id = c(1,1,1,2,2,2,3,3,3), year = c(1,2,3,1,2,3,1,2,3), x = c(1,NA,2, 3, NA, 4, 5, NA, 6))
# Linear Interpolation
dat %>%
group_by(id) %>%
mutate(x2 = as.numeric(unlist(approx(x = dat$year, y = dat$x, xout = dat$x)[2])))
Sample data:
  id year  x
1  1    1  1
2  1    2 NA
3  1    3  2
4  2    1  3
5  2    2 NA
6  2    3  4
7  3    1  5
8  3    2 NA
9  3    3  6
Recommended answer
Here are a couple of approaches (transferred from comments):
1) na.approx / ave
library(zoo)
transform(dat, x2 = ave(x, id, FUN = na.approx))
With year being 1, 2, 3 we did not need to specify it, but if it were needed then:
nr <- nrow(dat)
transform(dat, x2 = ave(1:nr, id, FUN = function(i) with(dat[i, ], na.approx(x, year))))
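One caveat that seems worth noting for either approach: by default na.approx drops leading and trailing NAs that it cannot interpolate, so a group that starts or ends with NA comes back shorter than the group itself. The following is a minimal sketch, assuming a hypothetical data frame dat2 (not part of the question) whose first group starts with NA. Passing na.rm = FALSE keeps each group's length and leaves those positions as NA:

```r
library(zoo)

# Hypothetical data (not from the question): group 1 starts with NA
dat2 <- data.frame(id   = c(1, 1, 1, 2, 2, 2),
                   year = c(1, 2, 3, 1, 2, 3),
                   x    = c(NA, 1, 2, 3, NA, 4))

# na.rm = FALSE keeps the result the same length as the input;
# the leading NA cannot be interpolated and stays NA
res2 <- transform(dat2,
                  x2 = ave(x, id, FUN = function(v) na.approx(v, na.rm = FALSE)))
res2
```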
2) na.approx / dplyr
library(dplyr)
library(zoo)
dat %>%
group_by(id) %>%
mutate(x2 = na.approx(x, year)) %>%
ungroup()
If year is not needed then omit the second argument to na.approx.
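As a side note on the error in the question: inside a grouped mutate, dat$year and dat$x refer to all 9 rows of the full data frame rather than the 3 rows of the current group, which is what the size mismatch is complaining about. A sketch of how base approx itself might be used instead, assuming the intent was to interpolate x at each year within an id (this is not part of the original answer):

```r
library(dplyr)

dat <- data.frame(id   = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
                  year = c(1, 2, 3, 1, 2, 3, 1, 2, 3),
                  x    = c(1, NA, 2, 3, NA, 4, 5, NA, 6))

res <- dat %>%
  group_by(id) %>%
  # Bare column names refer to the current group's rows only.
  # approx() ignores the NA points and interpolates at each year.
  mutate(x2 = approx(x = year, y = x, xout = year)$y) %>%
  ungroup()
res
```

Note that approx needs at least two non-NA points per group to interpolate, which holds for the sample data.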
Note: zoo also has other NA filling functions, notably na.spline and na.locf.
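For illustration, these can be dropped into the same grouped pipeline. The sketch below assumes the question's dat; the column names x_locf and x_spline are made up for this example:

```r
library(dplyr)
library(zoo)

dat <- data.frame(id   = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
                  year = c(1, 2, 3, 1, 2, 3, 1, 2, 3),
                  x    = c(1, NA, 2, 3, NA, 4, 5, NA, 6))

filled <- dat %>%
  group_by(id) %>%
  mutate(x_locf   = na.locf(x, na.rm = FALSE),  # carry last observation forward
         x_spline = na.spline(x, year)) %>%     # spline interpolation over year
  ungroup()
filled
```

Here na.locf fills each NA with the previous value in its group, so the choice between it and na.approx depends on whether a step or a straight line is the better assumption for the data.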