Ordering problems when using mutate with an ifelse condition on a date column


Question

I'm trying to use mutate to create a column that takes the values of one column up to a certain point and then uses cumprod to fill the remaining observations based on the values of another column.

I tried combining mutate with ifelse, but the resulting values come out in the wrong order and I can't figure out why.

Below is a more basic example that reproduces my problem:

library(dplyr)
foo1 <- data.frame(date = seq(2005, 2018, 1))
foo1 %>% mutate(h = ifelse(date > 2008, seq(1, 11, 1), 99))

The output is:

   date  h
1  2005 99
2  2006 99
3  2007 99
4  2008 99
5  2009  5
6  2010  6
7  2011  7
8  2012  8
9  2013  9
10 2014 10
11 2015  1
12 2016  2
13 2017  3
14 2018  4

I would like it to be:

   date  h
1  2005 99
2  2006 99
3  2007 99
4  2008 99
5  2009  1
6  2010  2
7  2011  3
8  2012  4
9  2013  5
10 2014  6
11 2015  7
12 2016  8
13 2017  9
14 2018 10

Below is another example, closer to what I'm actually trying to do:

foo2 <- data.frame(date=seq(2005,2013,1), a=seq(1, by=1, length.out = 9), b=rep(1.01, length.out = 9))
foo2 %>% mutate(h=ifelse(date>2008, cumprod(c(a[5],b[5:9])), a))

My output is:

  date a    b       h
1 2005 1 1.01 1.00000
2 2006 2 1.01 2.00000
3 2007 3 1.01 3.00000
4 2008 4 1.01 4.00000
5 2009 5 1.01 5.20302
6 2010 6 1.01 5.25505
7 2011 7 1.01 5.00000
8 2012 8 1.01 5.05000
9 2013 9 1.01 5.10050

I would like it to be:

  date a    b       h
1 2005 1 1.01 1.00000
2 2006 2 1.01 2.00000
3 2007 3 1.01 3.00000
4 2008 4 1.01 4.00000
5 2009 5 1.01 5.00000
6 2010 6 1.01 5.05000
7 2011 7 1.01 5.10050
8 2012 8 1.01 5.20302
9 2013 9 1.01 5.25505

If I use if_else instead of ifelse, I get the following error:

Error in mutate_impl(.data, dots) : 
  Evaluation error: `true` must be length 9 (length of `condition`) or one, not 6

Recommended answer

The ifelse function takes three arguments:

  1. test: a logical vector. Say that it has a length of N.
  2. yes: a vector. It can be of any length. If its length is not N, the vector is recycled or truncated to length N.
  3. no: same as yes.

At the end of this preprocessing stage, you have three vectors of the same length. ifelse then builds the return value position by position, taking the element from the second vector (yes) where test is TRUE and from the third vector (no) where it is FALSE.
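
The two stages described above can be sketched as a simplified model of ifelse. The function name my_ifelse is hypothetical and only for illustration; the real ifelse also handles NA values in test and keeps attributes, which this sketch ignores.

```r
# Simplified model of ifelse(), for illustration only.
# Assumes `test` is a logical vector without NA values.
my_ifelse <- function(test, yes, no) {
  n <- length(test)
  yes <- rep_len(yes, n)   # stage 1: recycle or truncate to length n
  no  <- rep_len(no, n)
  out <- no                # stage 2: start from `no` ...
  out[test] <- yes[test]   # ... and take `yes` at the same positions where test is TRUE
  out
}

test <- c(FALSE, FALSE, TRUE, TRUE, TRUE)
my_ifelse(test, 1:3, 99)   # 99 99 3 1 2
ifelse(test, 1:3, 99)      # same result
```

Note that the first TRUE position receives the third element of yes, not the first: elements are matched by position, not handed out in order.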

In your case we have:

test <- foo1$date>2008 #length: 14
yes <- seq(1,11,1) #length: 11
no <- 99 #length: 1

So it needs to recycle both yes and no. You end up with something like:

 test yes no
FALSE   1 99
FALSE   2 99
FALSE   3 99
FALSE   4 99
 TRUE   5 99
 TRUE   6 99
 TRUE   7 99
 TRUE   8 99
 TRUE   9 99
 TRUE  10 99
 TRUE  11 99
 TRUE   1 99
 TRUE   2 99
 TRUE   3 99

You can see how the recycling works. Then, to build the return value, ifelse goes through the rows above in order and picks the yes element where test is TRUE and the no element otherwise. This explains why you get that return value. It has nothing to do with dplyr, of course.
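
The answer stops at the diagnosis. As a hedged sketch of how the desired foo1 output could be obtained (these are two possible approaches, not something stated in the original answer), the idea is either to give ifelse a yes vector that already has one element per row, or to assign the new values only into the TRUE positions with replace. The names res1, res2 and yes2 are made up for this example.

```r
library(dplyr)

foo1 <- data.frame(date = seq(2005, 2018, 1))

# Option 1: build a `yes` vector as long as `test`.
# cumsum() of the condition counts 1, 2, 3, ... over the TRUE rows.
res1 <- foo1 %>%
  mutate(h = ifelse(date > 2008, cumsum(date > 2008), 99))

# Option 2: start from the default value and assign into the TRUE
# positions, so the replacement values are consumed in order.
res2 <- foo1 %>%
  mutate(h = replace(rep(99, n()), date > 2008, seq_len(sum(date > 2008))))

# The same length mismatch is what if_else reports for foo2:
foo2 <- data.frame(date = seq(2005, 2013, 1),
                   a = seq(1, by = 1, length.out = 9),
                   b = rep(1.01, length.out = 9))
yes2 <- cumprod(c(foo2$a[5], foo2$b[5:9]))
length(yes2)            # 6
sum(foo2$date > 2008)   # 5 rows actually need a value
```

Both options give 99 for 2005 to 2008 and then 1 to 10. For foo2, the same replace pattern should work once the cumprod vector has exactly one element per TRUE row (five here, not six).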
