Collapse multiple columns into one column and generate an index variable
Question
I have three date columns, as shown below:
Id Date1 Date2 Date3
12 2005-12-22 NA NA
11 2009-10-11 NA NA
29 NA 2005-04-11 NA
45 NA NA 2008-11-06
39 NA NA 2006-01-02
44 NA 2005-04-16 NA
I am trying to collapse the three Date columns into one Date column and create an index variable: 1 if there is a date value in the Date1 column, 2 if there is a date in the Date2 column, and 3 if there is a date in the Date3 column.
Id Date Index
12 2005-12-22 1
11 2009-10-11 1
29 2005-04-11 2
45 2008-11-06 3
39 2006-01-02 3
44 2005-04-16 2
I can do this using a massive ifelse statement, but I was wondering whether anybody knows a more efficient method of doing this.
Recommended answer
This is a classic use of reshape() to go from "wide" to "long" format. If d is your data.frame:
d2 <- reshape(d, idvar = "Id", v.names = "Date", timevar = "Index",
varying = c("Date1", "Date2", "Date3"), direction = "long")
Result:
> d2
Id Index Date
12.1 12 1 2005-12-22
11.1 11 1 2009-10-11
29.1 29 1 <NA>
45.1 45 1 <NA>
39.1 39 1 <NA>
44.1 44 1 <NA>
12.2 12 2 <NA>
11.2 11 2 <NA>
29.2 29 2 2005-04-11
45.2 45 2 <NA>
39.2 39 2 <NA>
44.2 44 2 2005-04-16
12.3 12 3 <NA>
11.3 11 3 <NA>
29.3 29 3 <NA>
45.3 45 3 2008-11-06
39.3 39 3 2006-01-02
44.3 44 3 <NA>
If you don't want all the NA values (above), you can subset:
> d2[!is.na(d2$Date),]
Id Index Date
12.1 12 1 2005-12-22
11.1 11 1 2009-10-11
29.2 29 2 2005-04-11
44.2 44 2 2005-04-16
45.3 45 3 2008-11-06
39.3 39 3 2006-01-02
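Note that the subsetted reshape() result is ordered by Index, not by the original row order shown in the question. If the original order matters, one alternative is to fill the Date and Index columns directly, one source column at a time. The following is only a sketch: the data frame d is rebuilt here from the question's table with assumed column types (Date), and the names date_cols, hit, and res are illustrative. It assumes each row has at most one non-missing date; if a row had several, the first column would win.

```r
# Example data rebuilt from the question; column types are an assumption.
d <- data.frame(
  Id    = c(12, 11, 29, 45, 39, 44),
  Date1 = as.Date(c("2005-12-22", "2009-10-11", NA, NA, NA, NA)),
  Date2 = as.Date(c(NA, NA, "2005-04-11", NA, NA, "2005-04-16")),
  Date3 = as.Date(c(NA, NA, NA, "2008-11-06", "2006-01-02", NA))
)

date_cols <- c("Date1", "Date2", "Date3")

# Start with all-missing result columns.
date  <- rep(as.Date(NA), nrow(d))
index <- rep(NA_integer_, nrow(d))

# Walk over the date columns in order; fill rows that are still empty.
for (j in seq_along(date_cols)) {
  col <- d[[date_cols[j]]]
  hit <- is.na(date) & !is.na(col)  # rows not yet filled that have a date here
  date[hit]  <- col[hit]
  index[hit] <- j
}

res <- data.frame(Id = d$Id, Date = date, Index = index)
print(res)
```

Rows with no date in any of the three columns keep NA in both Date and Index, so they can be dropped afterwards with res[!is.na(res$Date), ] if needed.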