Combine columns to remove NAs yet prioritize specific replacements
Question
I'm learning to update column data by following a previous post. Is there a trick for specifying which column should provide the final value when there is a conflict? For example, I can combine columns of data as long as only one non-NA value exists per row:
data <- data.frame('a' = c('A','B','C','D','E'),
                   'x' = c(NA,NA,3,NA,NA),
                   'y' = c(1,2,NA,NA,NA),
                   'z' = c(NA,NA,NA,4,5))
# Flatten the value columns row by row and drop the NAs (one value per row)
cbind.data.frame(data[1], mycol=c(na.omit(c(t(data[, -1])))))
How would I force the value to come from newVal in the following case?
data <- data.frame('a' = c('A','B','C','D','E','F'),
                   'x' = c(NA,NA,NA,3,NA,NA),
                   'y' = c(1,2,8,NA,NA,NA),
                   'z' = c(99,NA,4,NA,4,5))
Recommended answer
Use max.col and some matrix indexing (specifying which row/column combination to take):
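# Row number paired with the position of the last non-NA column in each row
cbind(1:nrow(data), max.col(!is.na(data[-1]), "last"))
#     [,1] [,2]
#[1,]    1    3
#[2,]    2    2
#[3,]    3    3
#[4,]    4    1
#[5,]    5    3
#[6,]    6    3
# Use that two-column matrix to pull one cell per row
data[-1][cbind(1:nrow(data), max.col(!is.na(data[-1]), "last"))]
#[1] 99  2  4  3  4  5
# Attach the combined values to the id column
cbind(data[1], result=data[-1][cbind(1:nrow(data), max.col(!is.na(data[-1]), "last"))])
#  a result
#1 A     99
#2 B      2
#3 C      4
#4 D      3
#5 E      4
#6 F      5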
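To see why the "last" argument matters, here is a minimal sketch of what max.col does with a logical matrix such as the one !is.na() produces. The matrix m and the names last_idx and first_idx are made up for illustration. TRUE is treated as 1 and FALSE as 0, so every TRUE in a row ties for the maximum and ties.method decides which of them is picked.

```r
# Toy logical matrix: TRUE marks a non-missing cell (values are made up).
m <- rbind(c(FALSE, TRUE,  TRUE),   # two candidates: columns 2 and 3
           c(TRUE,  FALSE, FALSE),  # a single candidate: column 1
           c(FALSE, FALSE, FALSE))  # no candidate at all (an all-NA row)

last_idx  <- max.col(m, ties.method = "last")   # rightmost TRUE wins
first_idx <- max.col(m, ties.method = "first")  # leftmost TRUE wins

print(last_idx)   # 3 1 3
print(first_idx)  # 2 1 1
```

Note the third row: when every cell is NA, an index is still returned (the first or last column), so the value looked up for that row is simply NA.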
If you need a particular column to always take precedence, make a temporary object with that column placed first, and then process it with "first" instead of "last":
tmp <- data[-1][c("z", setdiff(names(data[-1]), "z"))]
tmp[cbind(1:nrow(tmp), max.col(!is.na(tmp), "first"))]
#[1] 99 2 4 3 4 5
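The reordering step can be wrapped in a small function if it is needed for different priority columns. The following is only a sketch: combine_prioritized and its arguments are hypothetical names, not part of base R, and it assumes the value columns share one type (matrix indexing on a data frame goes through as.matrix, so mixed types would be coerced to character).

```r
# Hypothetical helper: pick one value per row from `cols`, preferring the
# columns named in `priority`, then the remaining columns from left to right.
combine_prioritized <- function(df, cols, priority = character(0)) {
  ordered <- c(priority, setdiff(cols, priority))   # priority columns first
  tmp <- df[ordered]
  # First non-NA column per row, expressed as (row, column) index pairs
  idx <- cbind(seq_len(nrow(tmp)), max.col(!is.na(tmp), ties.method = "first"))
  tmp[idx]
}

data <- data.frame(a = c("A", "B", "C", "D", "E", "F"),
                   x = c(NA, NA, NA, 3, NA, NA),
                   y = c(1, 2, 8, NA, NA, NA),
                   z = c(99, NA, 4, NA, 4, 5))

z_first <- combine_prioritized(data, c("x", "y", "z"), priority = "z")
y_first <- combine_prioritized(data, c("x", "y", "z"), priority = "y")

print(z_first)  # 99 2 4 3 4 5
print(y_first)  # 1 2 8 3 4 5
```

With z given priority, rows A and C take 99 and 4; with y given priority, the same rows take 1 and 8 instead.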