for loop in irregular time series


Question

I'm looking for advice on how to loop through the following, which is a subset of a much larger data set. I hope the following representation works.

mydf <- structure(list(site_id = c("39ADA00070", "39ADA00070", "39ADA00070", 
"39ADA00070", "39ADA00070", "39ADA00070", "39ADA00070", "39ADA00070", 
"39ADA00070", "39ADA00070", "39ADA00070", "39ADA00070", "39ADA00070", 
"39ADA00070", "39ADA00070", "39ADA00070", "39ADA00070", "39ADA00070", 
"39ADA00070", "39ADA00070", "39ADA00070", "39ADA00070", "39ADA00070", 
"39ADA00070", "39ADA00070", "39ADA00070", "39ADA00070", "39ADA00070", 
"39ADA00070", "39ADA00070", "39ALL00184", "39ALL00184", "39ALL00184", 
"39ALL00184", "39ALL00184", "39ALL00184", "39ALL00184", "39ALL00184", 
"39ALL00184", "39ALL00184", "39ALL00184", "39ALL00184", "39ALL00184", 
"39ALL00184", "39ALL00184", "39ALL00184", "39ALL00184", "39ALL00184", 
"39ALL00184", "39ALL00184", "39ALL00184", "39ALL00184", "39ALL00184"
), date = structure(c(6339, 8594, 9293, 9441, 10014, 10604, 11080, 
11821, 12717, 12907, 13081, 13277, 13459, 13635, 13822, 14012, 
14207, 14207, 14355, 14564, 14704, 14917, 15105, 15271, 15478, 
15644, 15833, 15834, 16009, 16203, 7783, 8406, 8554, 8686, 9034, 
9260, 9632, 9777, 10002, 10491, 10491, 11060, 11585, 12145, 12145, 
12696, 13242, 13242, 13775, 14363, 14881, 15428, 15974), class = "Date"), 
    var1 = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, 159L, 148L, 
    149L, 134L, 179L, 205L, 193L, 109L, 109L, 177L, 75L, 272L, 
    150L, 115L, 232L, 230L, 183L, 159L, 159L, 304L, 220L, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    -98L, -98L, -38L, -74L, -74L, -80L, -48L), var2 = c(NA, NA, 
    NA, NA, NA, NA, NA, NA, 16.8, 16.8, 14.5, 14.2, 15.1, 14.5, 
    15, 15.2, 13.2, 13.2, 15, 15.2, 15.1, 14.4, 14.8, 15.2, 16.3, 
    NA, 14.3, 14.3, 15.6, 14.8, NA, 12, 14.7, NA, 14.6, NA, 13.7, 
    12.3, 12.5, 13.5, 13.5, 12.5, 13.1, 14.2, 14.2, 14.1, 12.5, 
    12.5, 13.5, 12.7, 12.6, 12.5, 12.6), var3 = c(NA, NA, NA, 
    NA, NA, NA, NA, NA, 7.35, 7.85, 7.5, 7.47, 7.62, 7.08, 7.08, 
    7.2, 7.4, 7.4, 7.26, 7.05, 6.56, 7.2, 7.42, 6.5, 7.81, 8.43, 
    7.57, 7.57, 7.42, 7.72, NA, 6.58, 6.8, NA, 7.75, NA, 7.06, 
    6.77, 6.41, 6.84, 6.84, 7.85, 7.13, 7.26, 7.26, 7.06, 7.14, 
    7.14, 7.11, 6.9, 7.11, 7.2, 7.1), var4 = c(NA, 283L, 216L, 
    223L, 256L, 165L, 192L, 216L, 173L, 216L, 179L, 282L, 146L, 
    227L, 141L, 210L, 160L, 162L, 157L, 140L, 235L, 166L, 216L, 
    NA, 162L, 193L, 286L, 274L, 163L, 209L, NA, 304L, 321L, 293L, 
    398L, 302L, 301L, 282L, 288L, 292L, 292L, 302L, 515L, 309L, 
    309L, 323L, 338L, 295L, 280L, 279L, 325L, 328L, 322L), var5 = c(NA, 
    NA, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 
    2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 
    2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2), var6 = c(NA, NA, 
    29L, 32L, 36L, 24L, 25L, 29L, 27L, 27L, 24L, 32L, 21L, 27L, 
    21L, 26L, 23L, 24L, 25L, 20L, 24L, 22L, 28L, 24L, 20L, 23L, 
    30L, 29L, 21L, 24L, 15L, 15L, 18L, 15L, 15L, 15L, 15L, 15L, 
    15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 
    15L, 15L, 15L), var7 = c(NA, NA, 77, 83, 87, 66, 73, 73, 
    65, 76, 69, 93, 60, 76, 56, 77, 67, 68, 68, 60, 67, 63, 82, 
    69, 56, 68, 85, 83, 59, 68.2, 157, 159, 164, 169, 155, 176, 
    156, 156, 162, 162, 162, 160, 180, 163, 163, 158, 168, 171, 
    162, 167, 177, 167, 168), var8 = c(NA, NA, 25, 26, 29, 21, 
    22, 23, 20, 23, 21, 30, 17, 24, 16, 23, 20, 20, 21, 17, 23, 
    18, 25, 20, 17, 21, 27, 27, 17, 20.9, 91, 89, 96, 92, 86, 
    100, 89, 91, 92, 94, 94, 91, 97, 91, 91, 92, 98, 99, 94, 
    100, 106, 98, 100), var9 = c(1.02, 1, 0.37, 0.48, 0.88, 0.16, 
    0.17, 0.24, 0.25, 5.98, 0.26, 0.54, 0, 0.19, 0, 0.18, 0.14, 
    0.13, 0.16, 0.11, 0.19, 0.16, 0.26, NA, 0.11, 0.27, 0.19, 
    0.19, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, NA, 0.1, 0.1, 0.1, 
    0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0, 0, 0.1, 0.1, 
    0.1), var10 = c(50, 48, 64, 55, 52, 64, 69, 63.3, 56.1, 40.6, 
    58.6, 43.9, 62.2, 51.9, 55.6, 53.4, 61.3, 61, 61.1, 61.9, 
    51.5, 60.7, 52.2, NA, 66, 52.8, 46.8, 47.5, 59.2, 53.4, NA, 
    560, 650, 540, 548, 655, 565, 531, 540, 501, 501, 531, 535, 
    547, 547, 492, 537, 542, 512, 542, 548, 581, 540)), class = "data.frame", row.names = c(NA, 
-53L), .Names = c("site_id", "date", "var1", "var2", "var3", 
"var4", "var5", "var6", "var7", "var8", "var9", "var10"))

This data.frame is a set of irregular time series using site_id as the main ID factor, date as the date, and then 10 variables. The actual data.frame has hundreds of IDs and dozens of factors.

I know I can access each time series by site_id using, for example

mydf[mydf$site_id == '39ADA00070', ][,3]

to get var1 for the first site_id.

What I am looking for is a robust for loop to run through the data.frame:

for (i in 1:length(site_id)){
  # perform something on var1 through var10
  # output matrix of that something
}

That something would be any number of tests or plots, e.g.

getOutliers() (from the extremevalues package)
various plots from ggplot2
etc.

But first, I just need help getting the assignment of the for loop indices correct.

I am not against using apply (ddply) tools for this work, but I wanted to start with a basic for loop. Then I can clean up by addressing NAs, censored values, etc.

Thanks so much!

Recommended answer

Try the following:

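for (ss in unique(mydf$site_id)) {   # one pass per site
    for (cc in 3:12) {               # columns 3 to 12 hold var1 to var10
        # replace max() with whatever function you need
        print(max(mydf[mydf$site_id == ss, cc], na.rm = TRUE))
    }
}

For the sample data this prints the maximum of each variable, site by site:
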
[1] 304
[1] 16.8
[1] 8.43
[1] 286
[1] 2
[1] 36
[1] 93
[1] 30
[1] 5.98
[1] 69
[1] -38
[1] 14.7
[1] 7.85
[1] 515
[1] 2
[1] 18
[1] 180
[1] 106
[1] 0.1
[1] 655
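
The question also asks for the results as a matrix rather than printed values. A minimal sketch of one way to do that is shown here. It is not part of the original answer, and it assumes a small stand-in data frame named toy with invented values so that it runs on its own; the names toy, res, res2, and safe_max are hypothetical. The loop fills a preallocated sites-by-variables matrix, and the last lines build the same table with split() and sapply() for comparison.

```r
# Toy stand-in for mydf (hypothetical values): two sites, irregular dates, three variables
toy <- data.frame(
  site_id = c("A", "A", "A", "B", "B"),
  date    = as.Date(c("2001-01-05", "2001-03-17", "2002-07-30",
                      "2001-02-11", "2003-09-02")),
  var1    = c(NA, 159, 148, -98, -38),
  var2    = c(16.8, NA, 14.5, 12.5, 13.5),
  var3    = c(7.35, 7.85, NA, NA, NA),
  stringsAsFactors = FALSE
)

sites <- unique(toy$site_id)
# Select variable columns by name instead of hard-coding column positions
vars  <- setdiff(names(toy), c("site_id", "date"))

# Preallocate a sites x variables matrix, indexed by name
res <- matrix(NA_real_, nrow = length(sites), ncol = length(vars),
              dimnames = list(sites, vars))

for (ss in sites) {
  for (vv in vars) {
    x <- toy[toy$site_id == ss, vv]
    x <- x[!is.na(x)]
    # Leave NA when a site has no observations for this variable;
    # max() of an empty vector would return -Inf with a warning
    if (length(x) > 0) res[ss, vv] <- max(x)
  }
}

print(res)

# The same table without an explicit loop, using split() and sapply()
safe_max <- function(x) if (all(is.na(x))) NA_real_ else max(x, na.rm = TRUE)
res2 <- t(sapply(split(toy[vars], toy$site_id),
                 function(d) sapply(d, safe_max)))
```

To use this on the real data, replace toy with mydf and max() with the function of interest. Selecting columns by name keeps the loop valid if the column order changes, and the empty-vector check matters because some site and variable combinations may be entirely NA.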
