Number of consecutive NA


Question

The data looks like this:

subject x1   x2   x3   x4   x5   x6   x7        
a       0.1  NA   0.2  0.1  0.1  NA   0.9        
b       NA   NA  -0.01 NA   0.3  0.8  0.01
c       NA   NA   NA   NA   NA   0.9  0.4
d       NA   NA  0.01  NA   NA   NA   0.05

How can I append a new variable, the maximum number of consecutive NA values, to this data.frame?

subject x1   x2   x3   x4   x5   x6   x7    NA_consecutive    
a       0.1  NA   0.2  0.1  0.1  NA   0.9        1
b       NA   NA  -0.01 NA   0.3  0.8  0.01       2 (max NA, not 1!!)
c       NA   NA   NA   NA   NA   0.9  0.4        5
d       NA   NA  0.01  NA   NA   NA   0.05       3 (max NA, not 2!!)

I want to calculate the number of consecutive NA values per subject (i.e., per row). I tried to use duplicate, but it flags anything duplicated, including normal values, not just NA.

If I transform this data set to "long" format with df %>% gather(variable, value, -subject), I get:

   subject variable  value
 1 a       x1         0.1 
 2 a       x2         NA   
 3 a       x3         0.2 
 4 a       x4         0.1 
 5 a       x5         0.1 
 6 a       x6         NA   
 7 a       x7         0.9 
 8 b       x1         NA   
 9 b       x2         NA   
10 b       x3        -0.01
..

Is this form easier to work with?

I don't care about the shape of the result; I just need the new information (the maximum number of consecutive NA values).

If possible, avoid for loops (though not necessarily completely), because the data is very large.

Recommended answer

Here is a tidyverse option:

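library(tidyverse)

df %>%
    gather(k, v, -subject) %>%                      # wide -> long: one row per (subject, column)
    arrange(subject, k) %>%                         # order rows by subject, then by column name
    group_by(subject) %>%
    # start a new run id whenever the NA status flips between neighbouring values
    mutate(grp = cumsum(c(0, abs(diff(!is.na(v))) == 1))) %>%
    add_count(subject, grp) %>%                     # n = length of the run each value belongs to
    mutate(NA_consecutive = max(n[is.na(v)])) %>%   # longest NA run within each subject
    select(-grp, -n) %>%
    spread(k, v)                                    # long -> wide again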
## A tibble: 4 x 9
## Groups:   subject [4]
#  subject NA_consecutive     x1    x2       x3     x4     x5     x6     x7
#  <fct>            <int>  <dbl> <dbl>    <dbl>  <dbl>  <dbl>  <dbl>  <dbl>
#1 a                    1  0.100    NA   0.200   0.100  0.100 NA     0.900
#2 b                    2 NA        NA  -0.0100 NA      0.300  0.800 0.0100
#3 c                    5 NA        NA  NA      NA     NA      0.900 0.400
#4 d                    3 NA        NA   0.0100 NA     NA     NA     0.0500
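
As a possible alternative that skips the reshape, the same quantity can be sketched in base R with rle(), which gives the lengths of runs of equal values. This is not part of the answer above. The helper name max_na_run is hypothetical, the data frame is rebuilt by hand from the table in the question, and rows without any NA are assumed to get 0. The per-row apply() still loops internally, so its speed on very large data would need to be measured.

```r
# Example data reconstructed from the table in the question
df <- data.frame(
  subject = c("a", "b", "c", "d"),
  x1 = c(0.1, NA, NA, NA),
  x2 = rep(NA_real_, 4),
  x3 = c(0.2, -0.01, NA, 0.01),
  x4 = c(0.1, NA, NA, NA),
  x5 = c(0.1, 0.3, NA, NA),
  x6 = c(NA, 0.8, 0.9, NA),
  x7 = c(0.9, 0.01, 0.4, 0.05)
)

# Hypothetical helper: length of the longest run of NA in a vector.
# Returns 0 when the vector contains no NA (an assumption, not from the source).
max_na_run <- function(v) {
  r <- rle(is.na(v))              # runs of TRUE (NA) and FALSE (non-NA)
  na_runs <- r$lengths[r$values]  # keep only the lengths of the NA runs
  if (length(na_runs) == 0L) 0L else max(na_runs)
}

# Apply the helper to every row, leaving out the subject column
df$NA_consecutive <- unname(apply(df[, -1], 1, max_na_run))
df$NA_consecutive
# 1 2 5 3
```

Unlike the pipeline above, this keeps the original column order and does not depend on column names sorting correctly.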
