Alternatives to nested ifelse statements in R
Question
Suppose we have the following data. Each row represents a country, and the columns (in05:in09) indicate whether that country was present in a database of interest in the given year (2005:2009).
id <- c("a", "b", "c", "d")
in05 <- c(1, 0, 0, 1)
in06 <- c(0, 0, 0, 1)
in07 <- c(1, 1, 0, 1)
in08 <- c(0, 1, 1, 1)
in09 <- c(0, 0, 0, 1)
df <- data.frame(id, in05, in06, in07, in08, in09)
I want to create a variable firstyear that indicates the first year in which the country was present in the database. Right now I do the following:
df$firstyear <- ifelse(df$in05 == 1, 2005,
                ifelse(df$in06 == 1, 2006,
                ifelse(df$in07 == 1, 2007,
                ifelse(df$in08 == 1, 2008,
                ifelse(df$in09 == 1, 2009,
                       0)))))
The above code is already unwieldy, and my real dataset covers many more years. Is there an alternative, using *apply functions, loops, or something else, to create this firstyear variable?
Recommended answer
You can vectorize this using max.col:
indx <- names(df)[max.col(df[-1], ties.method = "first") + 1L]
df$firstyear <- as.numeric(sub("in", "20", indx))
df
# id in05 in06 in07 in08 in09 firstyear
# 1 a 1 0 1 0 0 2005
# 2 b 0 0 1 1 0 2007
# 3 c 0 0 0 1 0 2008
# 4 d 1 1 1 1 1 2005
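One case the example data does not cover is a country that never appears in the database. With ties.method = "first", max.col returns column 1 for an all-zero row, so such a row would be labelled 2005 rather than the 0 that the nested ifelse version produces. The following is a minimal sketch of one way to guard against that, assuming the year columns are all named "in" plus a two-digit year; the extra row "e" and the helper names year_cols, years, m and first_col are hypothetical and not part of the original question or answer.

```r
# Example data from the question, plus a hypothetical row "e" that is never present
df <- data.frame(
  id   = c("a", "b", "c", "d", "e"),
  in05 = c(1, 0, 0, 1, 0),
  in06 = c(0, 0, 0, 1, 0),
  in07 = c(1, 1, 0, 1, 0),
  in08 = c(0, 1, 1, 1, 0),
  in09 = c(0, 0, 0, 1, 0)
)

# Pick out the indicator columns by name and derive the matching years
year_cols <- grep("^in[0-9]{2}$", names(df), value = TRUE)
years     <- as.numeric(sub("in", "20", year_cols))

# Position of the first 1 in each row (all-zero rows also yield 1 here)
m         <- as.matrix(df[year_cols])
first_col <- max.col(m, ties.method = "first")

# Keep the year only where the row has at least one 1; otherwise fall back to 0
df$firstyear <- ifelse(rowSums(m) > 0, years[first_col], 0)

df$firstyear
# 2005 2007 2008 2005 0
```

Selecting the columns by name rather than with df[-1] also keeps the sketch working if the data frame later gains other non-indicator columns.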