How to simplify a leading-NA count function, and generalize it to work on matrix, dataframe
Problem description
I wrote a leading-NA count function that works on vectors. However:
a) Can you simplify my version?
b) Can you also generalize it to work directly on a matrix or data frame (it must still work on an individual vector), so that I don't need apply()? Try to avoid all *apply functions and fully vectorize it; it must still work on a vector, with no special-casing if at all possible.
leading_NA_count <- function(x) { max(cumsum((1:length(x)) == cumsum(is.na(x)))) }
# v0.1: works but seems clunky, slow and unlikely to be generalizable to two-dimensional objects
leading_NA_count <- function(x) { max(which(1:(length(x)) == cumsum(is.na(x))), 0) }
# v0.2: maybe simpler; needs max(..., 0) because max() of an empty which(...) returns -Inf (with a warning) in the no-leading-NAs case, e.g. c(1,2,3)
# (Seems impossible to figure out how to use which.max/which.min on this)
set.seed(1234)
mm <- matrix(sample(c(NA,NA,NA,NA,NA,0,1,2), 6*5, replace=TRUE), nrow=6, ncol=5)
mm
     [,1] [,2] [,3] [,4] [,5]
[1,]   NA   NA   NA   NA   NA
[2,]   NA   NA    2   NA    1
[3,]   NA    0   NA   NA   NA
[4,]   NA   NA    1   NA    2
[5,]    1    0   NA   NA    1
[6,]    0   NA   NA   NA   NA
leading_NA_count(mm)
[1] 4 # WRONG, obviously (looks like it tried to operate on the entire matrix by-column or by-row)
apply(mm,1,leading_NA_count)
[1] 5 2 1 2 0 0 # RIGHT
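The 4 above is what you would expect if R treats the matrix as one long vector, filled column by column. Here is a small check. It rebuilds mm by hand from the printed values so that it does not depend on the random seed, and it uses the v0.2 definition. The name flat is only for illustration:

```r
# Rebuild the printed matrix explicitly (row by row) instead of relying on sample().
mm <- matrix(c(NA, NA, NA, NA, NA,
               NA, NA,  2, NA,  1,
               NA,  0, NA, NA, NA,
               NA, NA,  1, NA,  2,
                1,  0, NA, NA,  1,
                0, NA, NA, NA, NA),
             nrow = 6, ncol = 5, byrow = TRUE)

# v0.2 from the question, unchanged.
leading_NA_count <- function(x) { max(which(1:(length(x)) == cumsum(is.na(x))), 0) }

# length() and cumsum() ignore the dimensions: they see 30 values in column-major order.
flat <- as.vector(mm)
flat[1:6]                  # the first column: NA NA NA NA 1 0
leading_NA_count(flat)     # 4
leading_NA_count(mm)       # 4 as well, i.e. the leading NAs of column 1 only
```

So the function is neither row-wise nor column-wise on a matrix. It counts the leading NAs of the flattened vector, which here is the start of the first column.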
Recommended answer
This works whether mm is a matrix, vector or data.frame. See ?max.col for more info:
max.col(cbind(!is.na(rbind(NA, mm)), TRUE), ties = "first")[-1] - 1
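To see why the one-liner works, it may help to unpack it into steps. The following is only a sketch: the wrapper name leading_NA_count2 and the intermediate variable names are made up for illustration, and mm is rebuilt by hand from the matrix printed in the question rather than from the random seed.

```r
# Hypothetical wrapper around the one-liner above; the names are illustrative only.
leading_NA_count2 <- function(x) {
  padded   <- rbind(NA, x)                 # extra all-NA first row; a plain vector becomes a 2-row matrix
  not_na   <- cbind(!is.na(padded), TRUE)  # trailing TRUE column acts as a sentinel for all-NA rows
  first_ok <- max.col(not_na, ties.method = "first")  # column index of the first TRUE in each row
  first_ok[-1] - 1                         # drop the padding row; (index - 1) = number of leading NAs
}

# The matrix from the question, entered row by row.
mm <- matrix(c(NA, NA, NA, NA, NA,
               NA, NA,  2, NA,  1,
               NA,  0, NA, NA, NA,
               NA, NA,  1, NA,  2,
                1,  0, NA, NA,  1,
                0, NA, NA, NA, NA),
             nrow = 6, ncol = 5, byrow = TRUE)

leading_NA_count2(mm)                 # per row: 5 2 1 2 0 0
leading_NA_count2(as.data.frame(mm))  # same counts for a data frame
leading_NA_count2(c(NA, NA, 1))       # 2
leading_NA_count2(c(1, 2, 3))         # 0
```

The padding row is what makes the vector case work without special-casing: rbind(NA, x) always yields at least two rows, and [-1] discards the extra one. For a matrix or data frame the counts are per row; if you want them per column, passing t(mm) instead should do it.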