Filling NA using linear regression in R
Question
I have a data set with one time column and two variables (example below).
df <- structure(list(time = c(15, 16, 17, 18, 19, 20, 21, 22, 23, 24,
25, 26), var1 = c(20.4, 31.5, NA, 53.7, 64.8, NA, NA, NA, NA,
120.3, NA, 142.5), var2 = c(30.6, 47.25, 63.9, 80.55, 97.2, 113.85,
130.5, 147.15, 163.8, 180.45, 197.1, 213.75)), .Names = c("time",
"var1", "var2"), row.names = c(NA, -12L), class = c("tbl_df",
"tbl", "data.frame"))
var1 has a few NA values, and I want to fill them using a linear regression between the remaining values of var1 and the corresponding values of var2.
Please help, and let me know if you need more information.
Recommended answer
Here is an example that uses lm to fit the model on the complete rows and predict to fill in the missing values in R.
library(dplyr)
# Construct linear model based on non-NA pairs
df2 <- df %>% filter(!is.na(var1))
fit <- lm(var1 ~ var2, data = df2)
# See the result
summary(fit)
# Call:
# lm(formula = var1 ~ var2, data = df2)
#
# Residuals:
# 1 2 3 4 5 6
# 8.627e-15 -2.388e-15 1.546e-16 -9.658e-15 -2.322e-15 5.587e-15
#
# Coefficients:
# Estimate Std. Error t value Pr(>|t|)
# (Intercept) 2.321e-14 5.619e-15 4.130e+00 0.0145 *
# var2 6.667e-01 4.411e-17 1.511e+16 <2e-16 ***
# ---
# Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#
# Residual standard error: 7.246e-15 on 4 degrees of freedom
# Multiple R-squared: 1, Adjusted R-squared: 1
# F-statistic: 2.284e+32 on 1 and 4 DF, p-value: < 2.2e-16
#
# Warning message:
# In summary.lm(fit) : essentially perfect fit: summary may be unreliable

# Use fit to predict the value
df3 <- df %>%
  mutate(pred = predict(fit, .)) %>%
  # Replace NA with pred in var1
  mutate(var1 = ifelse(is.na(var1), pred, var1))
# See the result
df3 %>% as.data.frame()
# time var1 var2 pred
# 1 15 20.4 30.60 20.4
# 2 16 31.5 47.25 31.5
# 3 17 42.6 63.90 42.6
# 4 18 53.7 80.55 53.7
# 5 19 64.8 97.20 64.8
# 6 20 75.9 113.85 75.9
# 7 21 87.0 130.50 87.0
# 8 22 98.1 147.15 98.1
# 9 23 109.2 163.80 109.2
# 10 24 120.3 180.45 120.3
# 11 25 131.4 197.10 131.4
# 12 26 142.5 213.75 142.5
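
The same idea can also be sketched in base R without dplyr. This is only an illustration under a few assumptions: the sample data is rebuilt as a plain data.frame (with time as an integer sequence), the names `missing`, `filled`, and `imputed` are made up for this example, and the fill relies on lm dropping rows with NA in var1 under the default na.action setting. Predictions are computed only for the rows where var1 is missing, so the observed values are left untouched.

```r
# Sample data from the question, rebuilt as a plain data.frame
df <- data.frame(
  time = 15:26,
  var1 = c(20.4, 31.5, NA, 53.7, 64.8, NA, NA, NA, NA, 120.3, NA, 142.5),
  var2 = c(30.6, 47.25, 63.9, 80.55, 97.2, 113.85,
           130.5, 147.15, 163.8, 180.45, 197.1, 213.75)
)

# With the default na.action (na.omit), lm() fits on the complete rows only
fit <- lm(var1 ~ var2, data = df)

# Flag the rows where var1 is missing (hypothetical helper name)
missing <- is.na(df$var1)

# Fill only the missing rows with the model's predictions
filled <- df
filled$var1[missing] <- predict(fit, newdata = df[missing, , drop = FALSE])

# Optional: record which values were imputed rather than observed
filled$imputed <- missing

print(filled)
```

Keeping an `imputed` flag is a design choice, not a requirement; it can help later when you want to tell observed values from model-based ones. In this sample var1 equals two thirds of var2 in every complete row, which is why the fit is essentially perfect; with real data the filled values would carry the model's error.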