Remove leading NAs to align data
Problem description
I have a large data.frame with 'staggered' data and would like to align it. What I mean is that I would like to take something like the example data frame built in the code below and remove the leading (top) NAs from all columns, so that every column starts in the first row.

I know about the na.trim function from the zoo package, but it didn't work on either the initial data.frame or its transpose. With the transposed data frame t.df, I used

    t.df <- na.trim(t.df, sides = 'left')

This only returned an empty data.frame, and it wouldn't work the way I wanted anyway, since it would create vectors of different lengths. Can anyone point me to a package or function that might be more helpful?

Here is the code for the example used above:

    library(zoo)  # provides na.trim

    # example of what I have
    var1 <- c(1,2,3,4,5,6,7,8,9,10)
    var2 <- c(6,2,4,7,3,NA,NA,NA,NA,NA)
    var3 <- c(NA,NA,8,6,3,7,NA,NA,NA,NA)
    var4 <- c(NA,NA,NA,NA,5,NA,2,6,2,9)
    df <- data.frame(var1, var2, var3, var4)

    # transpose and (unsuccessful) attempt to remove leading NAs
    t.df <- t(df)
    t.df <- na.trim(t.df, sides = 'left')
Solution

We can loop over the columns with lapply() and apply na.trim to each one. Then we pad NAs at the end of each list element by assigning it a length equal to the maximum length among the list elements.

    library(zoo)

    # trim the NAs from each column (na.trim trims both ends by default)
    lst <- lapply(df, na.trim)

    # pad every element with trailing NAs up to the longest length
    df[] <- lapply(lst, `length<-`, max(lengths(lst)))
    df
    #    var1 var2 var3 var4
    #1      1    6    8    5
    #2      2    2    6   NA
    #3      3    4    3    2
    #4      4    7    7    6
    #5      5    3   NA    2
    #6      6   NA   NA    9
    #7      7   NA   NA   NA
    #8      8   NA   NA   NA
    #9      9   NA   NA   NA
    #10    10   NA   NA   NA
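The padding step relies on the replacement form of length(), which extends a vector with NA when the new length is larger than the current one. A minimal base R sketch of just that step (the vector x is an illustrative value, not taken from the question's data frame):

```r
# A trimmed column, as na.trim might return it (illustrative values)
x <- c(8, 6, 3, 7)

# Calling the replacement function directly returns a padded copy
padded <- `length<-`(x, 6)
padded
# [1]  8  6  3  7 NA NA

# Equivalent assignment form, which modifies x in place
length(x) <- 6
```

Passing `length<-` to lapply() works because the extra argument (the target length) is forwarded to it for every list element.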
Or, as @G.Grothendieck mentioned in the comments:

    replace(df, TRUE, do.call("merge", lapply(lst, zoo)))
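If adding a package dependency is not desirable, the same alignment can probably be done in base R. The following is only a sketch, not part of the original answer: the helper name shift_up is hypothetical, and it assumes that only leading NAs should be dropped, with interior and trailing NAs left where they are.

```r
# Example data from the question
var1 <- c(1,2,3,4,5,6,7,8,9,10)
var2 <- c(6,2,4,7,3,NA,NA,NA,NA,NA)
var3 <- c(NA,NA,8,6,3,7,NA,NA,NA,NA)
var4 <- c(NA,NA,NA,NA,5,NA,2,6,2,9)
df <- data.frame(var1, var2, var3, var4)

# Hypothetical helper: move the values of x up past its leading NAs
# and append the same number of NAs at the end, so the length is unchanged.
shift_up <- function(x) {
  keep <- which(!is.na(x))
  if (length(keep) == 0L) return(x)  # all-NA column: nothing to shift
  first <- keep[1L]
  # Indexing with NA yields NA, so this pads without changing the column type
  x[c(seq(first, length(x)), rep(NA_integer_, first - 1L))]
}

aligned <- df
aligned[] <- lapply(df, shift_up)
aligned
```

Because each column keeps its original length, no separate padding step is needed, and the interior NA in var4 stays in place, as it does in the zoo-based result.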