R: find first non-NA observation in data.table column by group

Problem description

I have a data.table with many missing values, and I want a variable that is 1 for the first non-missing value in each group.

Say I have a data.table like this:

library(data.table)
DT <- data.table(iris)[,.(Petal.Width,Species)]
DT[c(1:10,15,45:50,51:70,101:134),Petal.Width:=NA]

which now has missing values at the beginning, at the end, and in between. I have tried two versions. The first is:

DT[min(which(!is.na(Petal.Width))),first_available:=1,by=Species]

but it only finds the global minimum (in this case, setosa gets the correct 1), not the minimum per group. I think this is because data.table first subsets by i and only then groups, correct? So the assignment only applies to the row that is the global minimum of which(!is.na(Petal.Width)), i.e. the first non-NA value in the whole table.
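A small sketch of that evaluation order, assuming the same DT as above. The names global_first and first_pos are introduced here for illustration only. The expression gives a single row number when evaluated on the whole table (which is what happens in i), but one position per group when evaluated in j with by:

```r
library(data.table)

DT <- data.table(iris)[, .(Petal.Width, Species)]
DT[c(1:10, 15, 45:50, 51:70, 101:134), Petal.Width := NA]

# Evaluated on the whole table, as an expression in i would be:
# a single global row number.
global_first <- DT[, min(which(!is.na(Petal.Width)))]
print(global_first)

# Evaluated in j with `by`: one value per group, counted from the
# start of each group rather than from the start of the table.
first_pos <- DT[, .(pos = min(which(!is.na(Petal.Width)))), by = Species]
print(first_pos)
```

Note that the per-group positions are relative to each group, so they cannot be used directly as row numbers of DT.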

A second attempt, with the test in j:

DT[,first_available:= ifelse(min(which(!is.na(Petal.Width))),1,0),by=Species]

which just returns a column of 1s. Here, I don't have a good explanation for why it doesn't work.
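One plausible explanation, sketched on a plain vector (x, first_idx, as_test and flag are illustrative names, not from the original post): min(which(...)) returns a position, not a logical test. ifelse() coerces that positive integer to TRUE, so it yields a single 1, which := then recycles over every row of the group.

```r
x <- c(NA, NA, 0.2, 0.4)

# Position of the first non-NA value: a positive integer, not a condition.
first_idx <- min(which(!is.na(x)))

# ifelse() coerces its test to logical; any non-zero number becomes TRUE,
# so the result is a single 1 regardless of where the first value sits.
as_test <- ifelse(first_idx, 1, 0)
print(as_test)

# Comparing every position against first_idx gives the intended 0/1 flag.
flag <- as.integer(seq_along(x) == first_idx)
print(flag)
```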

What I am aiming for is:

DT[,first_available:=0]
DT[c(11,71,135),first_available:=1]

but in reality I have hundreds of groups. Any help would be appreciated!

This question comes close, but it is not targeted at NAs and, if I understand it correctly, does not solve the issue here. I tried:

DT <- data.table(DT, key = c('Species'))
DT[unique(DT[,key(DT), with = FALSE]), mult = 'first']

Recommended answer

Here's one way:

DT[!is.na(Petal.Width), first := as.integer(seq_len(.N) == 1L), by = Species]
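The line above leaves NA in the new column for rows where Petal.Width is missing. If a strict 0/1 column like the one in the question is wanted, a possible variant (a sketch, not the answerer's code; idx is an illustrative name) collects the row number of the first non-NA value per group with .I and then assigns by row:

```r
library(data.table)

DT <- data.table(iris)[, .(Petal.Width, Species)]
DT[c(1:10, 15, 45:50, 51:70, 101:134), Petal.Width := NA]

# .I holds the row numbers of DT for the current group; pick the one
# at the first non-NA position (NA if the whole group is missing).
idx <- DT[, .I[which(!is.na(Petal.Width))[1L]], by = Species]$V1
idx <- idx[!is.na(idx)]  # drop groups that have no non-NA value

# Default everything to 0, then flag the selected rows.
DT[, first_available := 0L]
DT[idx, first_available := 1L]

print(DT[first_available == 1L])
```

Groups that consist entirely of NA simply keep 0 everywhere under this approach.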
