Longest run of changes for each dataframe in a list

Problem description

I have a list of multiple dataframes, each of which comprises a string of dates and, for each date, either +1 to indicate an increase or -1 for a decrease.

Here is an example:

security1 <- data.frame(
    date = seq(from = as.Date('2019-01-01'), to = as.Date('2019-01-10'), by = 'day'),
    direction = c(1, 1, 1, -1, -1, 1, 1, 1, 1, -1))
security2 <- data.frame(
    date = seq(from = as.Date('2019-01-01'), to = as.Date('2019-01-10'), by = 'day'),
    direction = c(1, -1, 1, -1, -1, 1, 1, -1, 1, -1))
clcn <- list(Sec1 = security1, Sec2 = security2)

For each dataframe, I am trying to find the length of the most recent streak of changes and the last time a streak was longer than that. The current streak may be just 1 day if the previous day’s move was in the other direction.

I’ve searched for several days for an answer to this and found the following using sequence and rle for a single dataframe at Compute counting variable in dataframe

sequence(rle(as.character(data$list))$lengths)

But I’m struggling to feed that into lapply or map to get it to iterate over the list.

I don’t mind the exact output, but ideally it would include: Dataframe name, current streak, previous streak that’s longer, and date that streak ended. But at the most basic, just getting the sequence number added as a new column on the dataframe would be a huge help, and I can (try to) take it from there.

Recommended answer

@akrun has the right idea, but since you said added to the data.frame, perhaps:

library(tidyverse)

clcn %>%
  map(~ mutate(., streak = sequence(rle(direction)$lengths)))

$Sec1
         date direction streak
1  2019-01-01         1      1
2  2019-01-02         1      2
3  2019-01-03         1      3
4  2019-01-04        -1      1
5  2019-01-05        -1      2
6  2019-01-06         1      1
7  2019-01-07         1      2
8  2019-01-08         1      3
9  2019-01-09         1      4
10 2019-01-10        -1      1

$Sec2
         date direction streak
1  2019-01-01         1      1
2  2019-01-02        -1      1
3  2019-01-03         1      1
4  2019-01-04        -1      1
5  2019-01-05        -1      2
6  2019-01-06         1      1
7  2019-01-07         1      2
8  2019-01-08        -1      1
9  2019-01-09         1      1
10 2019-01-10        -1      1
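
Since the question asks about lapply specifically, the same column can probably be added without the tidyverse. The following is a minimal base R sketch, not part of the original answer; it rebuilds the example list from the question so that it runs on its own, and the name clcn_streaks is only illustrative.

```r
# Example data copied from the question so this block is self-contained.
dates <- seq(from = as.Date("2019-01-01"), to = as.Date("2019-01-10"), by = "day")
clcn <- list(
  Sec1 = data.frame(date = dates, direction = c(1, 1, 1, -1, -1, 1, 1, 1, 1, -1)),
  Sec2 = data.frame(date = dates, direction = c(1, -1, 1, -1, -1, 1, 1, -1, 1, -1))
)

# lapply() visits each data frame and keeps the list names (Sec1, Sec2).
# rle() finds runs of identical values; sequence() counts 1..n within each run.
clcn_streaks <- lapply(clcn, function(df) {
  df$streak <- sequence(rle(df$direction)$lengths)
  df
})

clcn_streaks$Sec1$streak
# 1 2 3 1 2 1 2 3 4 1
```

The anonymous function has to return df explicitly; otherwise lapply would return only the value of the assignment, which is the streak vector rather than the data frame.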

From there, you could do more mutate calls / additions, such as:

clcn %>%
  map(
    ~ mutate(
      ., 
      streak = sequence(rle(direction)$lengths), 
      max_streak = streak == max(streak)
    )
  )
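
The question's ideal output (dataframe name, current streak, a previous longer streak, and the date that streak ended) could be assembled from the run lengths directly. The sketch below is one possible reading, not the answerer's method: streak_summary is a hypothetical helper, "previous streak that's longer" is assumed to mean the most recent earlier run strictly longer than the current one regardless of direction, and each data frame is assumed to have at least one row.

```r
# Example data copied from the question so this block is self-contained.
dates <- seq(from = as.Date("2019-01-01"), to = as.Date("2019-01-10"), by = "day")
clcn <- list(
  Sec1 = data.frame(date = dates, direction = c(1, 1, 1, -1, -1, 1, 1, 1, 1, -1)),
  Sec2 = data.frame(date = dates, direction = c(1, -1, 1, -1, -1, 1, 1, -1, 1, -1))
)

# Hypothetical helper: summarise the current run and the most recent longer run.
streak_summary <- function(df) {
  runs    <- rle(df$direction)
  n_runs  <- length(runs$lengths)
  current <- runs$lengths[n_runs]        # length of the run ending on the last row
  run_end <- cumsum(runs$lengths)        # row index on which each run ends
  earlier <- seq_len(n_runs - 1)         # indices of all runs before the current one
  longer  <- earlier[runs$lengths[earlier] > current]

  if (length(longer) == 0) {
    # No earlier run beats the current one.
    prev_len <- NA_integer_
    prev_end <- as.Date(NA)
  } else {
    last_longer <- max(longer)           # most recent qualifying run
    prev_len <- runs$lengths[last_longer]
    prev_end <- df$date[run_end[last_longer]]
  }

  data.frame(
    current_streak = current,
    previous_longer_streak = prev_len,
    previous_streak_ended = prev_end
  )
}

summaries <- lapply(clcn, streak_summary)
result <- data.frame(
  security = names(summaries),
  do.call(rbind, summaries),
  row.names = NULL
)
result
# Sec1: current streak 1, previous longer streak 4, ended 2019-01-09
# Sec2: current streak 1, previous longer streak 2, ended 2019-01-07
```

If "longer" should only compare runs in the same direction as the current one, the filter on longer would also need to check runs$values; that variation is left out here.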
