Count NAs appearing in between non-missing values


Problem description

I have a data frame with five time columns (first year, second year, and so on), and in some rows there are NAs in between non-missing values. Sample below:

df = structure(list(FirstYStage = c(NA, 3.2, 3.1, NA, NA, 2, 1, 3.2, 
3.1, 1, 2, 5, 2, NA, NA, NA, NA, 2, 3.1, 1), SecondYStage = c(NA, 
3.1, 3.1, NA, NA, 2, 1, 4, 3.1, 1, NA, 5, 3.1, 3.2, 2, 3.1, NA, 
2, 3.1, 1), ThirdYStage = c(NA, NA, 3.1, NA, NA, 3.2, 1, 4, NA, 
1, NA, NA, 3.2, NA, 2, 3.2, NA, NA, 2, 1), FourthYStage = c(NA, 
NA, 3.1, NA, NA, NA, 1, 4, NA, 1, NA, NA, NA, 4, 2, NA, NA, NA, 
2, 1), FifthYStage = c(NA, NA, 2, NA, NA, NA, 1, 5, NA, NA, NA, 
NA, 3.2, NA, 2, 3.2, NA, NA, 2, 1)), class = c("tbl_df", "tbl", 
"data.frame"), row.names = c(NA, -20L))

I would like to count, using dplyr, the number of rows that have missing values in between non-missing values. Rows 13, 14, and 16 are examples of this.

How can I achieve this? I have a feeling this has to do with pasting the entire row and looking to the left and right of each NA, but it is not clear to me how to proceed.

Recommended answer

sum(grepl("[[:digit:]]+[NA]+[[:digit:]]", apply(df, 1, paste, collapse = "")))

[1] 3

In short:

  1. concatenate the rows

  2. check which rows return TRUE for having a match to the pattern "number-NA-number" (in this case, rows 13, 14 and 16)

  3. sum the values that evaluate to TRUE
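The three steps above can be run one at a time to see what each contributes. The sketch below is only an illustration: the data frame `toy` and its values are made up for this example (they are not the question's data), and the intermediate names `row_strings`, `has_gap` and `n_gap_rows` are hypothetical.

```r
# Hypothetical four-row data frame with the same column layout as the question
toy <- data.frame(
  FirstYStage  = c(2,   NA,  NA, 1),
  SecondYStage = c(3.1, 3.2, 2,  1),
  ThirdYStage  = c(3.2, NA,  2,  1),
  FourthYStage = c(NA,  4,   2,  1),
  FifthYStage  = c(3.2, NA,  2,  NA)
)

# Step 1: concatenate each row into one string, e.g. "23.13.2NA3.2"
row_strings <- unname(apply(toy, 1, paste, collapse = ""))

# Step 2: TRUE where a digit is followed by a run of N/A letters and then a digit
has_gap <- grepl("[[:digit:]]+[NA]+[[:digit:]]", row_strings)

# Step 3: TRUE counts as 1, so sum() is the number of matching rows
n_gap_rows <- sum(has_gap)

print(row_strings)
print(which(has_gap))
print(n_gap_rows)
```

Rows that only start or end with NA (rows 3 and 4 of `toy`) are not counted, because the pattern needs a digit on both sides. Note that `[NA]+` is a bracket expression matching any run of the letters N and A, not the literal string "NA"; that is enough here because the pasted numbers contain no other letters.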

Based on @Sotos's comment, this is even better, since it avoids using apply with margin = 1:

sum(grepl("[[:digit:]]+[NA]+[[:digit:]]", do.call(paste0, df)))
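To see why the two versions give the same count, it may help to compare the strings they build. The sketch below assumes an all-numeric data frame like the one in the question; `toy` and its values are invented for illustration, and `strings_by_row` / `strings_by_col` are hypothetical names.

```r
# Hypothetical four-row data frame with the same column layout as the question
toy <- data.frame(
  FirstYStage  = c(2,   NA,  NA, 1),
  SecondYStage = c(3.1, 3.2, 2,  1),
  ThirdYStage  = c(3.2, NA,  2,  1),
  FourthYStage = c(NA,  4,   2,  1),
  FifthYStage  = c(3.2, NA,  2,  NA)
)

# Row-wise: apply() converts the data frame to a matrix and pastes each row
strings_by_row <- unname(apply(toy, 1, paste, collapse = ""))

# Column-wise: a data frame is a list of columns, so do.call() passes each
# column to paste0() as a separate argument and paste0() joins them element-wise
strings_by_col <- do.call(paste0, toy)

# Both approaches should produce the same strings for numeric columns
print(identical(strings_by_row, strings_by_col))

# The count is therefore the same with either input
n_gap_rows <- sum(grepl("[[:digit:]]+[NA]+[[:digit:]]", strings_by_col))
print(n_gap_rows)
```

The column-wise form makes a single vectorised `paste0()` call instead of one `paste()` call per row, which is the reason given above for preferring it.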
