In R: How to replace NA in a vector found between two integers
Question
I have the following vector:
A:(NA NA NA NA 1 NA NA 4 NA NA 1 NA NA NA NA NA 4 NA 1 NA 4)
I would like to replace all the NAs between 1 and 4 with 2 (but not the NAs between 4 and 1).
Are there any approaches you would recommend for this task?
It may also be managed as a data frame:
A
----
NA
NA
NA
NA
1
NA
NA
4
NA
NA
1
NA
NA
NA
NA
NA
4
NA
1
NA
4
----
Edit: 1. I changed the string "Na" to NA.
SOLUTION/UPDATE: Thank you all for your insights. Building on them, I came up with the following solution for my case. I hope it is useful to someone else:
A <- c(df$A)
index.1  <- which(df$A %in% c(1))     # positions of the 1s in A
index.14 <- which(df$A %in% c(1, 4))  # positions of the 1s and 4s in A
loc.1 <- which(index.14 %in% index.1) # where the 1s sit within index.14
loc.4 <- loc.1 + 1                    # the marker after each 1 (assumed to be a 4)
start.i <- index.14[loc.1] + 1        # first index to replace with 2
end.i   <- index.14[loc.4] - 1        # last index to replace with 2
fill.v  <- sort(c(start.i, end.i))    # start/end indexes, in ascending order
# create matrix with one row per run: start index, end index
fill.m <- matrix(fill.v, nrow = length(fill.v) / 2, ncol = 2, byrow = TRUE)
# create a list with the indexes to replace
list.1 <- apply(fill.m, MARGIN = 1, FUN = function(x) seq(x[1], x[2]))
# unlist to get a single vector of indexes
list.2 <- unlist(list.1)
df$A[list.2] <- 2                     # replace the indexed locations with 2
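The same index-pair idea can be sketched more compactly. This is only an illustration, not the asker's code: the names pos, is_one, starts, ends and runs are made up here, and the data frame is rebuilt from the vector in the question. Like the solution above, it assumes every 1 is eventually followed by a 4.

```r
# Data frame rebuilt from the vector shown in the question
df <- data.frame(A = c(NA, NA, NA, NA, 1, NA, NA, 4, NA, NA, 1,
                       NA, NA, NA, NA, NA, 4, NA, 1, NA, 4))

pos    <- which(df$A %in% c(1, 4))    # positions of all 1s and 4s
is_one <- df$A[pos] == 1              # TRUE where that marker is a 1
starts <- pos[is_one] + 1             # first index after each 1
ends   <- pos[which(is_one) + 1] - 1  # last index before the next marker

# Build each run from its length; seq_len() gives an empty run when a 1 is
# immediately followed by a 4, where seq(start, end) would count backwards.
runs <- Map(function(s, e) s + seq_len(e - s + 1) - 1, starts, ends)

df$A[unlist(runs)] <- 2
print(df$A)
```

Using Map instead of apply over a matrix also avoids apply returning a matrix rather than a list when all runs happen to have the same length.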
Recommended answer
Assuming A is as shown reproducibly in the Note at the end, the difference of the two cumsum's is 1 from each 1 up to the element just before the following 4, and 0 elsewhere. Combining it with & coerces it to logical, and the condition A == "Na" removes the 1 itself. Finally, replace sets the positions that are still TRUE to 2.
replace(A, (cumsum(A == 1) - cumsum(A == 4)) & (A == "Na"), 2)
giving:
[1] "Na" "Na" "Na" "Na" "1" "2" "2" "4" "Na" "Na" "1" "2" "2" "2" "2"
[16] "2" "4" "Na" "1" "2" "4"
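To see why the one-liner works, it may help to split it into its intermediate vectors. The following is a sketch using the character vector A from the Note; the names ones, fours, open, fill and result are introduced here purely for illustration.

```r
# Sample data copied from the Note at the end
A <- c("Na", "Na", "Na", "Na", "1", "Na", "Na", "4", "Na", "Na",
       "1", "Na", "Na", "Na", "Na", "Na", "4", "Na", "1", "Na", "4")

ones  <- cumsum(A == 1)      # number of 1s seen so far
fours <- cumsum(A == 4)      # number of 4s seen so far
open  <- ones - fours        # 1 from each 1 up to just before its 4, else 0
fill  <- open & (A == "Na")  # `&` coerces open to logical; drops the 1 itself

result <- replace(A, fill, 2)

print(open)
print(which(fill))
```

Because A is a character vector, the replacement value 2 is stored as the string "2", which is why the output above is quoted.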
NA values
R is case sensitive, and Na is not the same as NA. The sample data in the question showed Na values rather than NA values, but if what was actually meant was a numeric vector with NA values, as in AA in the Note below, then modify the expression as shown here:
replace(AA, cumsum(!is.na(AA) & AA == 1) - cumsum(!is.na(AA) & AA == 4) & is.na(AA), 2)
giving:
[1] NA NA NA NA 1 2 2 4 NA NA 1 2 2 2 2 2 4 NA 1 2 4
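If the same replacement is needed for a data frame column or for other marker values, the expression could be wrapped in a small function. This is a minimal sketch, not part of the original answer: fill_between and its arguments start, end and value are hypothetical names. It tests the cumsum difference with > 0 rather than relying on coercion by &, and it assumes the start and end markers alternate, beginning with a start marker, as in the sample data.

```r
# Hypothetical helper: fill NAs lying between a start marker and the next
# end marker. Assumes markers alternate start, end, start, end, ...
fill_between <- function(x, start = 1, end = 4, value = 2) {
  is_start <- !is.na(x) & x == start
  is_end   <- !is.na(x) & x == end
  inside   <- (cumsum(is_start) - cumsum(is_end)) > 0
  replace(x, inside & is.na(x), value)
}

# Numeric vector from the question, held as a data frame column
df <- data.frame(A = c(NA, NA, NA, NA, 1, NA, NA, 4, NA, NA, 1,
                       NA, NA, NA, NA, NA, 4, NA, 1, NA, 4))
df$A <- fill_between(df$A)
print(df$A)
```

NAs that precede the first start marker or follow an end marker are left untouched, which matches the "but not between 4 and 1" requirement in the question.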
Note
A <- c("Na", "Na", "Na", "Na", "1", "Na", "Na", "4", "Na", "Na",
"1", "Na", "Na", "Na", "Na", "Na", "4", "Na", "1", "Na", "4")
AA <- as.numeric(replace(A, A == "Na", NA))