Nested loops in R to calculate a mean day


Problem description


I'm working on a problem: I am trying to reproduce a formula in R. I have already written this code in Mathematica, but now I want to reproduce it in R for my students. It is a smart method for calculating a "mean day" of the year, called the representative day. This method is described here.

Part of my data is:

date    temp    Hour    DayCount
01/01/17    -2  0   1
01/01/17    -2  1   1
01/01/17    -2  2   1
01/01/17    -3  3   1
01/01/17    -4  4   1
01/01/17    -4  5   1
01/01/17    -5  6   1
01/01/17    -6  7   1
01/01/17    -4  8   1
01/01/17    -2  9   1
01/01/17    -1  10  1
01/01/17    0   11  1
01/01/17    1   12  1
01/01/17    2   13  1
01/01/17    1   14  1
01/01/17    -1  15  1
01/01/17    -2  16  1
01/01/17    -1  17  1
01/01/17    -2  18  1
01/01/17    -3  19  1
01/01/17    -2  20  1
01/01/17    -3  21  1
01/01/17    -2  22  1
01/01/17    -1  23  1
02/01/17    -1  0   2
02/01/17    -1  1   2
02/01/17    -1  2   2
02/01/17    -1  3   2
02/01/17    -1  4   2
02/01/17    -1  5   2
02/01/17    -1  6   2
02/01/17    -1  7   2
02/01/17    -1  8   2
02/01/17    -1  9   2
02/01/17    0   10  2
02/01/17    0   11  2
02/01/17    1   12  2
02/01/17    1   13  2
02/01/17    1   14  2
02/01/17    1   15  2
02/01/17    1   16  2
02/01/17    1   17  2
02/01/17    -1  18  2
02/01/17    -3  19  2
02/01/17    -2  20  2
02/01/17    -2  21  2
02/01/17    -2  22  2
02/01/17    -1  23  2

So I want to reproduce this formula:

where N is the number of days in the time period (2 for now) and cki and ckj are the temperatures at the kth hour of the ith and jth day, respectively. The result is a symmetric matrix with zeros on the diagonal. Then I have to sum each row.

Here is my code:

data$DayCount <- as.factor(data$DayCount)
datasplit <- split(data, data$DayCount) # split my data for each day
distance = matrix() # create an empty matrix

for (k in 1:24) {
  for (i in 1:2) {
    for (j in 1:2) {
      distance[i, j] = ((datasplit[[i]][k, 2] - datasplit[[j]][k, 2])^2)
      sum = sum(distance)
    }
  }
}

Any suggestions? I know that you are able to do it. Help me, please!

Solution

First let's create a dataframe object so we can manipulate our data easily:

df <- read.csv(stringsAsFactors = TRUE, text = 'date, temp, Hour, DayCount
01/01/17, -2, 0 , 1
01/01/17, -2, 1 , 1
01/01/17, -2, 2 , 1
01/01/17, -3, 3 , 1
01/01/17, -4, 4 , 1
01/01/17, -4, 5 , 1
01/01/17, -5, 6 , 1
01/01/17, -6, 7 , 1
01/01/17, -4, 8 , 1
01/01/17, -2, 9 , 1
01/01/17, -1, 10, 1
01/01/17, 0 , 11, 1
01/01/17, 1 , 12, 1
01/01/17, 2 , 13, 1
01/01/17, 1 , 14, 1
01/01/17, -1, 15, 1
01/01/17, -2, 16, 1
01/01/17, -1, 17, 1
01/01/17, -2, 18, 1
01/01/17, -3, 19, 1
01/01/17, -2, 20, 1
01/01/17, -3, 21, 1
01/01/17, -2, 22, 1
01/01/17, -1, 23, 1
02/01/17, -1, 0 , 2
02/01/17, -1, 1 , 2
02/01/17, -1, 2 , 2
02/01/17, -1, 3 , 2
02/01/17, -1, 4 , 2
02/01/17, -1, 5 , 2
02/01/17, -1, 6 , 2
02/01/17, -1, 7 , 2
02/01/17, -1, 8 , 2
02/01/17, -1, 9 , 2
02/01/17, 0 , 10, 2
02/01/17, 0 , 11, 2
02/01/17, 1 , 12, 2
02/01/17, 1 , 13, 2
02/01/17, 1 , 14, 2
02/01/17, 1 , 15, 2
02/01/17, 1 , 16, 2
02/01/17, 1 , 17, 2
02/01/17, -1, 18, 2
02/01/17, -3, 19, 2
02/01/17, -2, 20, 2
02/01/17, -2, 21, 2
02/01/17, -2, 22, 2
02/01/17, -1, 23, 2')

Now let's follow your description. I'm not trying to do this in the most efficient way, but rather to stick to your original idea as much as possible, so I'll use a couple of nested loops:

# get the different days
days <- levels(df$date)
# create the A matrix, empty
A <- matrix(nrow = length(days), ncol = length(days))
# iterate
for(i in 1:length(days)) {
  for(j in 1:length(days)) {
    # get all the temperatures available for each day
    ci <- df[df$date == days[i],]$temp
    cj <- df[df$date == days[j],]$temp
    # update the A matrix
    A[i, j] <- sum((ci - cj)^2)
  }
}
# finally the last sum
Aj <- unlist(lapply(1:length(days), function(i) sum(A[i, ])))
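Written as formulas, the loops above compute the following. The symbols are chosen to match the code and the description in the question: c_{ki} is the temperature at the k-th measurement of day i, N is the number of days, and K is the number of measurements per day (24 here). The exact notation of the original formula is an assumption, since the formula itself is not shown in the text.

```latex
A_{ij} = \sum_{k=1}^{K} \left( c_{ki} - c_{kj} \right)^2,
\qquad
A_{j} = \sum_{i=1}^{N} A_{ij}
```

Because A is symmetric, summing over rows or over columns gives the same values, which is why the code can sum each row A[i, ].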

The results are:

> A
     [,1] [,2]
[1,]    0   97
[2,]   97    0

> Aj
[1] 97 97
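The same numbers can probably be reached with the three explicit loops from the question, once two things are changed. First, `matrix()` without arguments creates a 1 x 1 matrix holding `NA`, so assigning to any other cell fails with a "subscript out of bounds" error; the matrix has to be pre-sized. Second, the assignment has to accumulate over `k` instead of overwriting. The following is only a sketch under those assumptions; the names `temps`, `n_days`, `n_hours` and `row_totals` are made up for illustration, and the temperatures are copied from the data above.

```r
# Two days of hourly temperatures, copied from the data in the question
day1 <- c(-2, -2, -2, -3, -4, -4, -5, -6, -4, -2, -1, 0,
           1,  2,  1, -1, -2, -1, -2, -3, -2, -3, -2, -1)
day2 <- c(-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, 0, 0,
           1,  1,  1,  1,  1,  1, -1, -3, -2, -2, -2, -1)
temps <- list(day1, day2)  # hypothetical name: one numeric vector per day

n_days  <- length(temps)
n_hours <- length(day1)

# Pre-size the matrix and start from zero so values can be accumulated
distance <- matrix(0, nrow = n_days, ncol = n_days)

for (k in seq_len(n_hours)) {
  for (i in seq_len(n_days)) {
    for (j in seq_len(n_days)) {
      # add the squared difference for hour k instead of overwriting the cell
      distance[i, j] <- distance[i, j] + (temps[[i]][k] - temps[[j]][k])^2
    }
  }
}

# Sum each row of the finished matrix
row_totals <- rowSums(distance)
```

Here `distance` should match the matrix `A` shown above and `row_totals` should match `Aj`.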

This should work for any number of days and for as many temperature measurements per day as you want (not necessarily 24), as long as every day has the same number of measurements, listed in the same order.
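If the loops become slow with many days, one possible alternative is to arrange the temperatures as one row per day and let `dist()` do the pairwise work. This is a sketch rather than part of the approach above: it swaps the nested loops for base R's Euclidean distance, squared to get the sum of squared differences. The data below are made up (3 days, 4 measurements each), and the name `temp_by_day` is chosen only for illustration.

```r
# Small made-up example: 3 days with 4 measurements each (not the question's data)
df <- data.frame(
  date = rep(c("01/01/17", "02/01/17", "03/01/17"), each = 4),
  temp = c(-2, -3, 0, 1,
           -1, -1, 1, 0,
            0, -2, 2, 1)
)

# One row per day, one column per measurement. This assumes every day has the
# same number of measurements, listed in the same order.
temp_by_day <- do.call(rbind, split(df$temp, df$date))

# dist() returns Euclidean distances between rows; squaring them gives the
# sum of squared differences between each pair of days
A  <- as.matrix(dist(temp_by_day))^2
Aj <- rowSums(A)
```

If the representative day is taken to be the one with the smallest row sum (an assumption, since the selection rule is not spelled out above), `names(which.min(Aj))` would return its date.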
