R - Obtaining the highest/lowest value in a set of columns defined by the value in a different dataframe

Problem description

I have two dataframes: one (A) containing the start and end dates (Julian date, so a continuous count of days) of an event, and the other (B) containing values at dates from start to beyond the end dates in the first dataframe. The start date in A is stable, the end date varies.

I want to be able to, for each row, identify the value with the greatest magnitude of change (highest and/or lowest values) between the start and end date in the series in B, then write to a new dataframe.

Example dataframes

dfA <- data.frame(ID = c(1,2,3,4,5), 
                  startDate = rep(1001,5),
                  endDate = c(1007, 1003, 1004, 1005, 1006))

dfB <- data.frame(ID = c(1,2,3,4,5),
                  "1001" = c(0.5,0.3,1,2,1.1),
                  "1002" = c(0.9,0.3,0.5,1.0,1.2), 
                  "1003" = c(0.8,0.3,0.1,1,2), 
                  "1004" = c(1,0.7,0.8,0.9,1.1), 
                  "1005" = c(2,1,3,1,4), 
                  "1006" = c(1,0.5,0.1,0.3,2), 
                  "1007" = c(1,2,3,4,5),
                  "1008" = c(0.5,1,2,1,0.3))

So, for ID = 1, I want to find the lowest value in B between 1001 and 1007, the start and end dates. This would then be repeated as ID = 1,2,3...n
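To make the request concrete, here is a minimal base R sketch of the window selection for a single row. It assumes that `dfA` and `dfB` list the IDs in the same order (true for the example data, but an assumption in general); the helper names `dates`, `in_window` and `vals` are introduced only for illustration.

```r
dfA <- data.frame(ID = c(1,2,3,4,5),
                  startDate = rep(1001,5),
                  endDate = c(1007, 1003, 1004, 1005, 1006))

dfB <- data.frame(ID = c(1,2,3,4,5),
                  "1001" = c(0.5,0.3,1,2,1.1),
                  "1002" = c(0.9,0.3,0.5,1.0,1.2),
                  "1003" = c(0.8,0.3,0.1,1,2),
                  "1004" = c(1,0.7,0.8,0.9,1.1),
                  "1005" = c(2,1,3,1,4),
                  "1006" = c(1,0.5,0.1,0.3,2),
                  "1007" = c(1,2,3,4,5),
                  "1008" = c(0.5,1,2,1,0.3))

# data.frame() renames "1001" to "X1001"; recover the numeric dates
dates <- as.numeric(sub("^X", "", names(dfB)[-1]))

# Assumption: row 1 of dfA and row 1 of dfB both describe ID = 1
row <- 1
in_window <- dates >= dfA$startDate[row] & dates <= dfA$endDate[row]

# Values of B for this ID that fall between startDate and endDate
vals <- unlist(dfB[row, -1])[in_window]

lowest_id1  <- min(vals)   # 0.5 (at date 1001)
highest_id1 <- max(vals)   # 2   (at date 1005)
```

The value at 1008 is excluded because it lies after the end date for ID = 1.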

Is there a solution in the tidyverse package for this?

Thanks.

Recommended answer

Inspired by Matt's answer, but taking highest and lowest values inside the time interval (as I understand the question):

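library(dplyr)
library(tidyr)
library(stringr)

test2 <- left_join(dfA, dfB, by = "ID") %>% 
  pivot_longer(-c(ID, startDate, endDate)) %>% 
  # data.frame() prefixes the numeric column names with "X"; strip it and
  # convert to numeric so the date comparison below is not a string comparison
  mutate(name = as.numeric(str_remove(name, "X"))) %>% 
  filter(name >= startDate & name <= endDate) %>% # keep only the rows with name between startDate and endDate
  group_by(ID) %>%
  mutate(highest = max(value), 
         lowest = min(value)) %>% 
  select(ID, highest, lowest) %>% 
  distinct()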
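
As a cross-check on the pipeline above, the same per-ID highest and lowest values can be computed in base R without any packages. This is only a sketch under the assumptions of the example data (date columns named `X1001`, `X1002`, ... after `data.frame()` renames them); `dates` and `result` are illustrative names, not part of the original answer.

```r
dfA <- data.frame(ID = c(1,2,3,4,5),
                  startDate = rep(1001,5),
                  endDate = c(1007, 1003, 1004, 1005, 1006))

dfB <- data.frame(ID = c(1,2,3,4,5),
                  "1001" = c(0.5,0.3,1,2,1.1),
                  "1002" = c(0.9,0.3,0.5,1.0,1.2),
                  "1003" = c(0.8,0.3,0.1,1,2),
                  "1004" = c(1,0.7,0.8,0.9,1.1),
                  "1005" = c(2,1,3,1,4),
                  "1006" = c(1,0.5,0.1,0.3,2),
                  "1007" = c(1,2,3,4,5),
                  "1008" = c(0.5,1,2,1,0.3))

# Numeric dates recovered from the "X"-prefixed column names
dates <- as.numeric(sub("^X", "", names(dfB)[-1]))

result <- do.call(rbind, lapply(seq_len(nrow(dfA)), function(i) {
  j    <- match(dfA$ID[i], dfB$ID)                        # matching row in dfB
  keep <- dates >= dfA$startDate[i] & dates <= dfA$endDate[i]
  vals <- unlist(dfB[j, -1])[keep]                        # values inside the window
  data.frame(ID = dfA$ID[i], highest = max(vals), lowest = min(vals))
}))

print(result)
```

For the example data this gives highest values of 2, 0.3, 1, 2, 4 and lowest values of 0.5, 0.3, 0.1, 0.9, 1.1 for IDs 1 to 5, which is what the tidyverse pipeline should also return.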
