Ordering problems when using mutate with ifelse condition to date
Problem description
I'm trying to use mutate to create a column that takes the value of one column up to a point and then uses cumprod to fill the rest of the observations based on the values of another column.
I tried combining mutate with ifelse, but the resulting values come out in the wrong order and I can't figure out why.
Below I reproduce a more basic example that replicates my problem:
library(dplyr)  # provides mutate() and %>%
foo1 <- data.frame(date=seq(2005,2018,1))
foo1 %>% mutate(h=ifelse(date>2008, seq(1,11,1), 99))
The output is:
date h
1 2005 99
2 2006 99
3 2007 99
4 2008 99
5 2009 5
6 2010 6
7 2011 7
8 2012 8
9 2013 9
10 2014 10
11 2015 11
12 2016 1
13 2017 2
14 2018 3
I would like it to be:
date h
1 2005 99
2 2006 99
3 2007 99
4 2008 99
5 2009 1
6 2010 2
7 2011 3
8 2012 4
9 2013 5
10 2014 6
11 2015 7
12 2016 8
13 2017 9
14 2018 10
Below I reproduce another example (closer to what I'm actually trying to do).
foo2 <- data.frame(date=seq(2005,2013,1), a=seq(1, by=1, length.out = 9), b=rep(1.01, length.out = 9))
foo2 %>% mutate(h=ifelse(date>2008, cumprod(c(a[5],b[5:9])), a))
My output is:
date a b h
1 2005 1 1.01 1.00000
2 2006 2 1.01 2.00000
3 2007 3 1.01 3.00000
4 2008 4 1.01 4.00000
5 2009 5 1.01 5.20302
6 2010 6 1.01 5.25505
7 2011 7 1.01 5.00000
8 2012 8 1.01 5.05000
9 2013 9 1.01 5.10050
I would like it to be:
date a b h
1 2005 1 1.01 1.00000
2 2006 2 1.01 2.00000
3 2007 3 1.01 3.00000
4 2008 4 1.01 4.00000
5 2009 5 1.01 5.00000
6 2010 6 1.01 5.05000
7 2011 7 1.01 5.10050
8 2012 8 1.01 5.20302
9 2013 9 1.01 5.25505
If I use if_else instead of ifelse, I receive the following error:
Error in mutate_impl(.data, dots) :
Evaluation error: `true` must be length 9 (length of `condition`) or one, not 6
Recommended answer
The ifelse function takes three arguments:
- test: a logical vector. Say that it has a length of N.
- yes: a vector. It can be of any length. If its length is not N, the vector is recycled (if shorter) or truncated (if longer) to length N.
- no: same as yes.
At the end of this preprocessing stage, you have three vectors of the same length. ifelse then builds the return value element by element: at each position it takes the value from the second vector (yes) or from the third vector (no), depending on the value of test at that position.
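The two stages can be mimicked in a few lines of base R. This is only a simplified sketch of the behaviour described above, not the actual source of ifelse: it ignores NA handling and attributes, and the name ifelse_sketch is made up for illustration.

```r
# Hypothetical helper: a simplified model of ifelse (no NA or attribute handling).
ifelse_sketch <- function(test, yes, no) {
  n <- length(test)
  yes <- rep_len(yes, n)   # stage 1: recycle or truncate yes to length N
  no  <- rep_len(no, n)    # stage 1: same for no
  out <- no                # stage 2: start from the no values ...
  out[test] <- yes[test]   # ... and overwrite with yes wherever test is TRUE
  out
}

test <- c(TRUE, FALSE, TRUE, TRUE, FALSE)

ifelse_sketch(test, c(10, 20), 0)
# [1] 10  0 10 20  0

# The built-in function gives the same values for this input.
ifelse(test, c(10, 20), 0)
```

Note that the TRUE positions receive yes[1], yes[3] and yes[4] of the recycled vector, not the first three elements of yes.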
In your case we have:
test <- foo1$date>2008 #length: 14
yes <- seq(1,11,1) #length: 11
no <- 99 #length: 1
So ifelse needs to recycle both yes and no to length 14. You end up with something like:
test yes no
FALSE 1 99
FALSE 2 99
FALSE 3 99
FALSE 4 99
TRUE 5 99
TRUE 6 99
TRUE 7 99
TRUE 8 99
TRUE 9 99
TRUE 10 99
TRUE 11 99
TRUE 1 99
TRUE 2 99
TRUE 3 99
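The table above can be rebuilt directly to check the explanation. The sketch below uses rep_len to do the recycling by hand; the names aligned and h are chosen here for illustration only.

```r
foo1 <- data.frame(date = seq(2005, 2018, 1))

test <- foo1$date > 2008          # length 14
n <- length(test)

# Recycle yes (length 11) and no (length 1) to the length of test.
aligned <- data.frame(
  test = test,
  yes  = rep_len(seq(1, 11, 1), n),
  no   = rep_len(99, n)
)
print(aligned)

# Pick row by row: yes where test is TRUE, no otherwise.
h <- aligned$no
h[aligned$test] <- aligned$yes[aligned$test]
h
# [1] 99 99 99 99  5  6  7  8  9 10 11  1  2  3

# Same values as the call from the question.
all.equal(h, ifelse(test, seq(1, 11, 1), 99))
```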
You can see how the recycling works. To build the return value, ifelse then goes through the rows in the order above and takes the yes element where test is TRUE and the no element otherwise. This explains the return value you got. It has nothing to do with dplyr, of course.
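One way to get the output the question asks for in the first example is to pass ifelse a yes vector that already has the same length as test, so that no recycling takes place. The sketch below is one possible approach rather than the only one; it uses a running count of the TRUE values as the counter, and the name after is introduced here for illustration.

```r
foo1 <- data.frame(date = seq(2005, 2018, 1))

after <- foo1$date > 2008   # logical, length 14

# cumsum(after) has length 14: 0 0 0 0 1 2 ... 10,
# so position i of yes lines up with position i of test.
foo1$h <- ifelse(after, cumsum(after), 99)
foo1$h
# [1] 99 99 99 99  1  2  3  4  5  6  7  8  9 10
```

The same expression can be used inside mutate, for example mutate(h = ifelse(date > 2008, cumsum(date > 2008), 99)).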