Combine mutate with conditional values
In a large data frame ("myfile") with four columns, I have to add a fifth column whose values depend conditionally on the first four columns. I have recently become a huge fan of dplyr, mainly because of its speed on large datasets, so I was wondering whether I could solve my problem with the mutate function.
My dataframe (actually a shorter version of it) looks a bit like this:
V1 V2 V3 V4
1 1 2 3 5
2 2 4 4 1
3 1 4 1 1
4 4 5 1 3
5 5 5 5 4
The values of the fifth column (V5) are based on some conditional rules:
if (V1==1 & V2!=4){
V5 <- 1
}
else if (V2==4 & V3!=1){
V5 <- 2
}
else {
V5 <- 0
}
Now I want to use the mutate function to apply these rules to all rows (so I don't have to use a slow loop). Something like this (and yes, I know it doesn't work this way!):
myfile <- mutate(myfile, if (V1==1 & V2!=4){V5 = 1}
else if (V2==4 & V3!=1){V5 = 2}
else {V5 = 0})
This should be the result:
V1 V2 V3 V4 V5
1 1 2 3 5 1
2 2 4 4 1 2
3 1 4 1 1 0
4 4 5 1 3 0
5 5 5 5 4 0
How can I do this in dplyr?
Try this:
> myfile %>% mutate(V5 = (V1 == 1 & V2 != 4) + 2 * (V2 == 4 & V3 != 1))
V1 V2 V3 V4 V5
1 1 2 3 5 1
2 2 4 4 1 2
3 1 4 1 1 0
4 4 5 1 3 0
5 5 5 5 4 0
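The arithmetic version relies on two things: logical values coerce to 1 and 0 in arithmetic, and the two rules can never both be TRUE for the same row (one needs V2 != 4, the other V2 == 4), so the sum is always 0, 1, or 2. A minimal base R sketch of that reasoning, using the example input; the names df, rule1, and rule2 are hypothetical and only used for illustration:

```r
# Hypothetical data frame `df`; the values match the example input.
df <- data.frame(
  V1 = c(1L, 2L, 1L, 4L, 5L),
  V2 = c(2L, 4L, 4L, 5L, 5L),
  V3 = c(3L, 4L, 1L, 1L, 5L),
  V4 = c(5L, 1L, 1L, 3L, 4L)
)

rule1 <- df$V1 == 1 & df$V2 != 4  # rows that should get 1
rule2 <- df$V2 == 4 & df$V3 != 1  # rows that should get 2

# The rules are mutually exclusive, so no row can score 1 + 2 = 3.
stopifnot(!any(rule1 & rule2))

# TRUE/FALSE are coerced to 1/0 by the arithmetic.
df$V5 <- rule1 + 2 * rule2
df$V5
```

If the rules could overlap, this shortcut would no longer reproduce the if / else if order, and a form that checks the conditions in sequence would be the safer choice.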
Or, using nested ifelse calls:
> myfile %>% mutate(V5 = ifelse(V1 == 1 & V2 != 4, 1, ifelse(V2 == 4 & V3 != 1, 2, 0)))
V1 V2 V3 V4 V5
1 1 2 3 5 1
2 2 4 4 1 2
3 1 4 1 1 0
4 4 5 1 3 0
5 5 5 5 4 0
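Nested ifelse calls get hard to read as rules are added. As a possible alternative that is not part of the original answer, later dplyr versions provide case_when, which checks the conditions in order much like if / else if. A minimal sketch, assuming a dplyr version that includes case_when; the names df and result are hypothetical:

```r
library(dplyr)

# Hypothetical data frame `df`; the values match the example input.
df <- data.frame(
  V1 = c(1L, 2L, 1L, 4L, 5L),
  V2 = c(2L, 4L, 4L, 5L, 5L),
  V3 = c(3L, 4L, 1L, 1L, 5L),
  V4 = c(5L, 1L, 1L, 3L, 4L)
)

result <- df %>%
  mutate(V5 = case_when(
    V1 == 1 & V2 != 4 ~ 1,  # first rule
    V2 == 4 & V3 != 1 ~ 2,  # second rule, only reached if the first fails
    TRUE              ~ 0   # fallback, like the final else
  ))

result$V5
```

All right-hand sides are plain numbers here, so the new column has a single consistent type.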
I suggest choosing a better name for your data frame: myfile makes it seem as if it holds a file name.
The examples above used this input:
myfile <-
structure(list(V1 = c(1L, 2L, 1L, 4L, 5L), V2 = c(2L, 4L, 4L,
5L, 5L), V3 = c(3L, 4L, 1L, 1L, 5L), V4 = c(5L, 1L, 1L, 3L, 4L
)), .Names = c("V1", "V2", "V3", "V4"), class = "data.frame", row.names = c("1",
"2", "3", "4", "5"))
Update: since this answer was originally posted, dplyr has changed %.% to %>%, so I have modified the answer accordingly.