How to create a new variable with values from different variables if another variable equals a set value in R?

Problem description

I have a complicated question that I will try to make easier by simplifying my dataset. Say I have 5 variables:

df$Id <- 1:12
df$date <- c(NA, NA, 'a', 'a', 'b', NA, NA, 'b', 'c', 'c', 'b', 'a')
df$va <- c(1.1, 1.4, 2.5, ...)     # 12 random values
df$vb <- c(5.9, 2.3, 4.7, ...)     # 12 other random values
df$vc <- c(3.0, 3.3, 3.7, ...)     # 12 more random values

Then I want to create a new variable that takes its value from va, vb, or vc when the date equals 'a', 'b', or 'c', respectively. I tried a nested if-else, which did not work. I also tried:

df$new[df$date=='a' & !is.na(df$date)] <- df$va
df$new[df$date=='b' & !is.na(df$date)] <- df$vb
df$new[df$date=='c' & !is.na(df$date)] <- df$vc

This correctly left NAs in the new variable where the date is NA; however, the values it filled in were not the ones I expected from va, vb, or vc, but some other values altogether. How can I get df$new to equal va if the date is 'a', vb if the date is 'b', and vc if the date is 'c'?

Recommended answer

I was told that the problem with my code is that the indexing needs to appear on both sides of the assignment. Without the index on the right-hand side, R does not know which rows to take the values from and simply fills the selected rows starting from the top of the column. So the correct code in this case would be:

df$new[df$date=='a' & !is.na(df$date)] <- df$va[df$date=='a' & !is.na(df$date)]
df$new[df$date=='b' & !is.na(df$date)] <- df$vb[df$date=='b' & !is.na(df$date)]
df$new[df$date=='c' & !is.na(df$date)] <- df$vc[df$date=='c' & !is.na(df$date)]
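
To make the difference concrete, here is a minimal sketch comparing the two forms for the 'a' rows only. The data frame below is hypothetical: the date column follows the example in the question, but the va values are made up for illustration, and the results are stored in plain vectors (wrong, right) rather than in df$new.

```r
# Hypothetical data frame: the numeric values are made up for illustration.
df <- data.frame(
  date = c(NA, NA, "a", "a", "b", NA, NA, "b", "c", "c", "b", "a"),
  va   = 1:12 + 0.1,
  stringsAsFactors = FALSE
)

# Rows where date is 'a' (rows 3, 4 and 12); NA dates are excluded.
is_a <- df$date == "a" & !is.na(df$date)

# Unindexed right-hand side: the three selected rows are filled with the
# FIRST three elements of df$va, and R warns about the length mismatch.
wrong <- rep(NA_real_, nrow(df))
suppressWarnings(wrong[is_a] <- df$va)

# Indexed right-hand side: each selected row gets the value from the same row.
right <- rep(NA_real_, nrow(df))
right[is_a] <- df$va[is_a]

wrong[is_a]  # values taken from rows 1, 2, 3 of va
right[is_a]  # values taken from rows 3, 4, 12 of va
```

Without suppressWarnings(), the unindexed assignment reports that the number of items to replace is not a multiple of the replacement length, which is a useful hint that the two sides do not line up.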

Alternatively, another user noted that there is a way to do this with ifelse, which can be viewed as the correct answer here: https://stats.stackexchange.com/questions/151345/how-to-create-a-new-variable-with-values-from-different-variables-if-another-var
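
For readers who do not follow the link, a nested ifelse for this problem might look roughly like the sketch below. This is my own illustration of the idea, not necessarily the exact code from the linked answer, and the four-row data frame uses made-up values.

```r
# Hypothetical four-row example; the values are illustrative only.
df <- data.frame(
  date = c(NA, "a", "b", "c"),
  va   = c(1.1, 1.4, 2.5, 0.9),
  vb   = c(5.9, 2.3, 4.7, 6.2),
  vc   = c(3.0, 3.3, 3.7, 4.1),
  stringsAsFactors = FALSE
)

# ifelse() is vectorised: each row picks its value from the column that
# matches its date. A row whose date is NA yields NA, because the test
# itself is NA for that row.
df$new <- ifelse(df$date == "a", df$va,
          ifelse(df$date == "b", df$vb,
          ifelse(df$date == "c", df$vc, NA)))

df$new
```

Because ifelse works row by row across whole columns, no explicit indexing is needed on either side.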

As I added to his answer at that link, I found that it worked better to replace == with %in%, so that it created a numeric variable instead of a list with a row for each of the 36121 observations in my dataset (12 in the example I provided). That would look like:

df$new[df$date %in% 'a' & !is.na(df$date)] <- df$va[df$date %in% 'a' & !is.na(df$date)]
df$new[df$date %in% 'b' & !is.na(df$date)] <- df$vb[df$date %in% 'b' & !is.na(df$date)]
df$new[df$date %in% 'c' & !is.na(df$date)] <- df$vc[df$date %in% 'c' & !is.na(df$date)]
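
One further difference between the two operators, shown in the small sketch below with a made-up date vector: == returns NA when the date is NA, whereas %in% returns FALSE. With %in% the extra !is.na() check is therefore harmless but no longer strictly needed.

```r
# Made-up date vector containing missing values.
date <- c(NA, "a", "b", NA, "c")

eq  <- date == "a"     # NA where date is NA
inn <- date %in% "a"   # FALSE where date is NA, never NA

# %in% alone gives the same selection as == combined with !is.na().
same <- identical(inn, eq & !is.na(date))

# Hypothetical values for va; new is initialised to NA at full length first.
va  <- c(1.1, 1.4, 2.5, 3.6, 4.2)
new <- rep(NA_real_, length(date))
new[inn] <- va[inn]
new
```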
