Skip NA values using "FUN=first"

Question

There's probably a really simple explanation for what I'm doing wrong, but I've been working on this for quite some time today and I still can't get it to work. I thought this would be a walk in the park, but my code isn't working as expected.

For this example, let's say I have a data frame as follows.

df
Row#   user      columnB    
1        1          NA        
2        1          NA        
3        1          NA        
4        1          31        
5        2          NA        
6        2          NA        
7        2          15        
8        3          18        
9        3          16       
10       3          NA

Basically, I would like to create a new column that uses the first (as well as the last) function (from the TTR package) to obtain the first non-NA value for each user. So my desired data frame would be this.

df
Row#   user      columnB    firstValue
1        1          NA        31
2        1          NA        31 
3        1          NA        31
4        1          31        31
5        2          NA        15
6        2          NA        15 
7        2          15        15
8        3          18        18
9        3          16        18
10       3          NA        18

I've looked around, mainly using Google, but I couldn't find an answer to my exact problem.

Here's some of the code I've tried, but I didn't get the results I wanted (note: I'm writing this from memory, and there were quite a few more variations, but these are the general forms I've been trying).

    df$firstValue<-ave(df$columnB,df$user,FUN=first,na.rm=True)
    df$firstValue<-ave(df$columnB,df$user,FUN=function(x){x,first,na.rm=True})
    df$firstValue<-ave(df$columnB,df$user,FUN=function(x){first(x,na.rm=True)})
    df$firstValue<-by(df,df$user,FUN=function(x){x,first,na.rm=True})

These all failed: they just give the first value of each group, which is NA for users 1 and 2.

Again, these are just a few examples off the top of my head. I also played around with na.rm, na.exclude, na.omit, na.action(na.omit), etc.

Any help would be greatly appreciated. Thanks.

Recommended answer

A data.table solution:

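library(data.table)
# Convert df to a data.table keyed (and sorted) by user
DT <- data.table(df, key="user")
# Per user: drop NAs from columnB, take the first remaining value,
# and assign it by reference to every row of that user's group
DT[, firstValue := na.omit(columnB)[1], by=user]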
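
For readers who would rather stay close to the `ave()` attempts in the question, the same idea (drop the NAs first, then take the first or last remaining element) can be sketched in base R. This is only an illustrative sketch: `first_non_na` and `last_non_na` are hypothetical helper names introduced here, not functions from TTR or any other package, and the `lastValue` column is an assumption based on the question's mention of `last`. The sample data is reconstructed from the table in the question.

```r
# Sample data reconstructed from the question's table
df <- data.frame(
  user    = c(1, 1, 1, 1, 2, 2, 2, 3, 3, 3),
  columnB = c(NA, NA, NA, 31, NA, NA, 15, 18, 16, NA)
)

# Hypothetical helpers (not part of any package):
# return the first / last non-NA element of x, or NA if every value is NA
first_non_na <- function(x) x[!is.na(x)][1]
last_non_na  <- function(x) rev(x[!is.na(x)])[1]

# ave() applies FUN within each user group and recycles the
# length-one result across all rows of that group
df$firstValue <- ave(df$columnB, df$user, FUN = first_non_na)
df$lastValue  <- ave(df$columnB, df$user, FUN = last_non_na)

print(df)
```

The filtering happens inside the function passed to `FUN`, so nothing here depends on the function honoring an `na.rm` argument.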
