Summing up values within an interval from another data.frame in R


Problem description

I have thousands of entries in hkdata.2, and I want to create a loop that sums up the total exposed mxtemp from another data frame (weather.data) for each member of each houseID in hkdata.2.

Could any expert give me a hand on this?

weather.data
date   mpressure mxtemp     
1   2008-01-01  1025.3  15.7        
2   2008-01-02  1025.6  16.0        
3   2008-01-03  1023.6  18.1        
4   2008-01-04  1021.8  18.4        
5   2008-01-05  1020.1  20.9        
6   2008-01-06  1019.7  20.7        
7   2008-01-07  1018.4  24.0        
8   2008-01-08  1016.7  23.7

hkdata.2
row.names   houseID member  male       date.end date.begin 
1             1       1      1      2008-01-07  2008-01-02      
2             1       2      0      2008-01-06  2008-01-04

I want to get the sum of mxtemp over the interval from date.begin to date.end that each member experienced, and show it like this.

hkdata.2
row.names   houseID member          date.end    date.begin  Total.exposed.mxtemp
1             1       1           2008-01-07    2008-01-02     118.1
2             1       2           2008-01-06    2008-01-04     60

Total.exposed.mxtemp is the sum of mxtemp within the corresponding interval (from date.begin to date.end, inclusive), i.e. for row.names 1, 118.1 = 16+18.1+18.4+20.9+20.7+24

My code looks like this:

> cbind(hkdata.2, t(sapply(apply(hkdata.2, 1, function(x)
+   weather.data[weather.data$date >= x[6] &
+                  weather.data$date <= x[5], c("mxtemp")]), colSums)))

Then I got this error.....:

Error in FUN(X[[1L]], ...) : 
  'x' must be an array of at least two dimensions

Could any expert please help!!

Solution

Here's one possibility that results in what you described as desired result. I'm not sure whether this is 100% dplyr idiomatic because I'm working on two different data.frames, but anyway, it seems to work.

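library(dplyr)

# Note: the comparison below is only correct when each (houseID, member)
# group holds exactly one row, so date.begin and date.end are single values.
# The date columns should be Date objects (or ISO-formatted strings).
hkdata.2 <- hkdata.2 %>%
  group_by(houseID, member) %>%
  mutate(Totalmxtemp = sum(weather.data$mxtemp[weather.data$date >= date.begin &
                                             weather.data$date <= date.end]))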

hkdata.2
#Source: local data frame [2 x 7]
#Groups: houseID, member
#
#  row.names houseID member male   date.end date.begin Totalmxtemp
#1         1       1      1    1 2008-01-07 2008-01-02       118.1
#2         2       1      2    0 2008-01-06 2008-01-04        60.0
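
For comparison, here is a minimal base R sketch of the same per-row interval sum without dplyr. It rebuilds the example data from the question and assumes the date columns are stored as Date objects; the helper variable `in_range` is just an illustrative name. The error in the question appears to come from `colSums`, which expects a matrix or data frame, while selecting the single column "mxtemp" returns a plain numeric vector. Calling `sum` on that vector avoids the problem.

```r
# Rebuild the example data from the question (dates assumed to be Date objects).
weather.data <- data.frame(
  date      = as.Date("2008-01-01") + 0:7,
  mpressure = c(1025.3, 1025.6, 1023.6, 1021.8, 1020.1, 1019.7, 1018.4, 1016.7),
  mxtemp    = c(15.7, 16.0, 18.1, 18.4, 20.9, 20.7, 24.0, 23.7)
)

hkdata.2 <- data.frame(
  houseID    = c(1, 1),
  member     = c(1, 2),
  male       = c(1, 0),
  date.end   = as.Date(c("2008-01-07", "2008-01-06")),
  date.begin = as.Date(c("2008-01-02", "2008-01-04"))
)

# For each row of hkdata.2, sum mxtemp over the inclusive
# interval [date.begin, date.end].
hkdata.2$Total.exposed.mxtemp <- vapply(
  seq_len(nrow(hkdata.2)),
  function(i) {
    in_range <- weather.data$date >= hkdata.2$date.begin[i] &
                weather.data$date <= hkdata.2$date.end[i]
    sum(weather.data$mxtemp[in_range])
  },
  numeric(1)
)

hkdata.2
```

Because this works row by row, it does not depend on each (houseID, member) pair appearing only once. With thousands of rows and a long weather table it scans weather.data once per row, which is usually acceptable but not the fastest possible approach.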
