Remove leading NAs to align data


Problem description

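I have a large data.frame with 'staggered' data and would like to align it. What I mean is I would like to take a data frame like the one built in the example code below and remove the leading (top) NAs from all columns, so that the first non-NA value of every column ends up in the first row.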

I know about the na.trim function from the zoo package, but this didn't work on either the initial data.frame presented above or its transpose. For this I used, with transposed dataframe t.df,

t.df <- na.trim(t.df, sides = 'left')

This only returned an empty data.frame, and wouldn't work the way I wanted anyway since it would create vectors of different lengths. Can anyone point me to a package or function that might be more helpful?

Here is the code for my example used above:

# example of what I have

var1 <- c(1,2,3,4,5,6,7,8,9,10)
var2 <- c(6,2,4,7,3,NA,NA,NA,NA,NA)
var3 <- c(NA,NA,8,6,3,7,NA,NA,NA,NA)
var4 <- c(NA,NA,NA,NA,5,NA,2,6,2,9)

df <- data.frame(var1, var2, var3, var4)


# transpose and (unsuccessful) attempt to remove leading NAs

t.df <- t(df)

t.df <-  na.trim(t.df, sides = 'left')

Solution

We can loop over the columns with lapply(...) and apply na.trim to each one. Then pad NAs at the end of each list element by setting its length to the maximum length among the list elements.

library(zoo)
lst <- lapply(df, na.trim)
df[] <- lapply(lst, `length<-`, max(lengths(lst)))
df
#   var1 var2 var3 var4
#1     1    6    8    5
#2     2    2    6   NA
#3     3    4    3    2
#4     4    7    7    6
#5     5    3   NA    2
#6     6   NA   NA    9
#7     7   NA   NA   NA
#8     8   NA   NA   NA
#9     9   NA   NA   NA
#10   10   NA   NA   NA
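
The padding step relies on the replacement function `length<-`, which extends a vector with NAs when the new length is larger than the current one. A minimal sketch of that behaviour on a single trimmed column (the names x and padded are only illustrative):

```r
# A trimmed column, e.g. var3 after na.trim
x <- c(8, 6, 3, 7)

# Functional form, as used inside lapply() above
padded <- `length<-`(x, 6)
padded
# 8 6 3 7 NA NA

# Equivalent assignment form
length(x) <- 6
```

Because every column is padded to the same length, the result can be assigned back into the data frame with df[] <- ... without a length mismatch.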

Alternatively, as @G.Grothendieck mentioned in the comments, the trimmed columns can be merged as zoo objects:

replace(df, TRUE, do.call("merge", lapply(lst, zoo)))
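
If depending on zoo is not desirable, the same alignment can probably be done in base R. The following is only a sketch and not part of the answer above: drop_leading_na is a hypothetical helper introduced here, and it assumes that interior NAs (such as the one in var4) should be kept in place.

```r
# Example data from the question
var1 <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
var2 <- c(6, 2, 4, 7, 3, NA, NA, NA, NA, NA)
var3 <- c(NA, NA, 8, 6, 3, 7, NA, NA, NA, NA)
var4 <- c(NA, NA, NA, NA, 5, NA, 2, 6, 2, 9)
df <- data.frame(var1, var2, var3, var4)

# Hypothetical helper: move the leading NAs of a vector to its end
drop_leading_na <- function(x) {
  first <- which(!is.na(x))[1]        # position of the first non-NA value
  if (is.na(first)) return(x)         # all-NA column: leave it unchanged
  c(x[first:length(x)], rep(NA, first - 1))
}

# Apply the helper column by column, keeping the data.frame structure
aligned <- df
aligned[] <- lapply(df, drop_leading_na)
aligned
```

Each column keeps its original length, so no separate padding step is needed; only the leading NAs are moved to the bottom.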
