How to extract dates after using the rle() function in R


Problem description

I have the following data file. It contains daily rainfall data in four columns (YY, MM, DD, RR). Apologies, this is the smallest dataset I could generate.

dat<-structure(list(YY = c(1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 
1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 
1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 
1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 1980L, 1980L, 
1980L, 1980L, 1980L, 1980L, 1980L, 1980L, 1980L, 1980L, 1980L, 
1980L, 1980L, 1980L, 1980L, 1980L, 1980L, 1980L, 1980L, 1980L, 
1980L, 1980L, 1980L, 1980L, 1980L, 1980L, 1980L, 1980L, 1980L, 
1980L, 1980L, 1981L, 1981L, 1981L, 1981L, 1981L, 1981L, 1981L, 
1981L, 1981L, 1981L, 1981L, 1981L, 1981L, 1981L, 1981L, 1981L, 
1981L, 1981L, 1981L, 1981L, 1981L, 1981L, 1981L, 1981L, 1981L, 
1981L, 1981L, 1981L, 1981L, 1981L, 1981L, 1982L, 1982L, 1982L, 
1982L, 1982L, 1982L, 1982L, 1982L, 1982L, 1982L, 1982L, 1982L, 
1982L, 1982L, 1982L, 1982L, 1982L, 1982L, 1982L, 1982L, 1982L, 
1982L, 1982L, 1982L, 1982L, 1982L, 1982L, 1982L, 1982L, 1982L, 
1982L, 1983L, 1983L, 1983L, 1983L, 1983L, 1983L, 1983L, 1983L, 
1983L, 1983L, 1983L, 1983L, 1983L, 1983L, 1983L, 1983L, 1983L, 
1983L, 1983L, 1983L, 1983L, 1983L, 1983L, 1983L, 1983L, 1983L, 
1983L, 1983L, 1983L, 1983L, 1983L, 1984L, 1984L, 1984L, 1984L, 
1984L, 1984L, 1984L, 1984L, 1984L, 1984L, 1984L, 1984L, 1984L, 
1984L, 1984L, 1984L, 1984L, 1984L, 1984L, 1984L, 1984L, 1984L, 
1984L, 1984L, 1984L, 1984L, 1984L, 1984L, 1984L, 1984L, 1984L, 
1985L, 1985L, 1985L, 1985L, 1985L, 1985L, 1985L, 1985L, 1985L, 
1985L, 1985L, 1985L, 1985L, 1985L, 1985L, 1985L, 1985L, 1985L, 
1985L, 1985L, 1985L, 1985L, 1985L, 1985L, 1985L, 1985L, 1985L, 
1985L, 1985L, 1985L, 1985L, 1986L, 1986L, 1986L, 1986L, 1986L, 
1986L, 1986L, 1986L, 1986L, 1986L, 1986L, 1986L, 1986L, 1986L, 
1986L, 1986L, 1986L, 1986L, 1986L, 1986L, 1986L, 1986L, 1986L, 
1986L, 1986L, 1986L, 1986L, 1986L, 1986L, 1986L, 1986L, 1987L, 
1987L, 1987L, 1987L, 1987L, 1987L, 1987L, 1987L, 1987L, 1987L, 
1987L, 1987L, 1987L, 1987L, 1987L, 1987L, 1987L, 1987L, 1987L, 
1987L, 1987L, 1987L, 1987L, 1987L, 1987L, 1987L, 1987L, 1987L, 
1987L, 1987L, 1987L, 1988L, 1988L, 1988L, 1988L, 1988L, 1988L, 
1988L, 1988L, 1988L, 1988L, 1988L, 1988L, 1988L, 1988L, 1988L, 
1988L, 1988L, 1988L, 1988L, 1988L, 1988L, 1988L, 1988L, 1988L, 
1988L, 1988L, 1988L, 1988L, 1988L, 1988L, 1988L, 1989L, 1989L, 
1989L, 1989L, 1989L, 1989L, 1989L, 1989L, 1989L, 1989L, 1989L, 
1989L, 1989L, 1989L, 1989L, 1989L, 1989L, 1989L, 1989L, 1989L, 
1989L, 1989L, 1989L, 1989L, 1989L, 1989L, 1989L, 1989L, 1989L, 
1989L, 1989L, 1990L, 1990L, 1990L, 1990L, 1990L, 1990L, 1990L, 
1990L, 1990L, 1990L, 1990L, 1990L, 1990L, 1990L, 1990L, 1990L, 
1990L, 1990L, 1990L, 1990L, 1990L, 1990L, 1990L, 1990L, 1990L, 
1990L, 1990L, 1990L, 1990L, 1990L, 1990L), MM = c(10L, 10L, 10L, 
10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 
10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 
10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 
10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 
10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 
10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 
10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 
10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 
10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 
10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 
10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 
10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 
10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 
10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 
10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 
10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 
10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 
10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 
10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 
10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 
10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 
10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 
10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 
10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 
10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 
10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 
10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 
10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 
10L, 10L, 10L, 10L, 10L), DD = c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 
8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L, 19L, 20L, 
21L, 22L, 23L, 24L, 25L, 26L, 27L, 28L, 29L, 30L, 31L, 1L, 2L, 
3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 
17L, 18L, 19L, 20L, 21L, 22L, 23L, 24L, 25L, 26L, 27L, 28L, 29L, 
30L, 31L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 
13L, 14L, 15L, 16L, 17L, 18L, 19L, 20L, 21L, 22L, 23L, 24L, 25L, 
26L, 27L, 28L, 29L, 30L, 31L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 
9L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L, 19L, 20L, 21L, 
22L, 23L, 24L, 25L, 26L, 27L, 28L, 29L, 30L, 31L, 1L, 2L, 3L, 
4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L, 
18L, 19L, 20L, 21L, 22L, 23L, 24L, 25L, 26L, 27L, 28L, 29L, 30L, 
31L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 
14L, 15L, 16L, 17L, 18L, 19L, 20L, 21L, 22L, 23L, 24L, 25L, 26L, 
27L, 28L, 29L, 30L, 31L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 
10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L, 19L, 20L, 21L, 22L, 
23L, 24L, 25L, 26L, 27L, 28L, 29L, 30L, 31L, 1L, 2L, 3L, 4L, 
5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L, 
19L, 20L, 21L, 22L, 23L, 24L, 25L, 26L, 27L, 28L, 29L, 30L, 31L, 
1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 
15L, 16L, 17L, 18L, 19L, 20L, 21L, 22L, 23L, 24L, 25L, 26L, 27L, 
28L, 29L, 30L, 31L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 
11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L, 19L, 20L, 21L, 22L, 23L, 
24L, 25L, 26L, 27L, 28L, 29L, 30L, 31L, 1L, 2L, 3L, 4L, 5L, 6L, 
7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L, 19L, 
20L, 21L, 22L, 23L, 24L, 25L, 26L, 27L, 28L, 29L, 30L, 31L, 1L, 
2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L, 
16L, 17L, 18L, 19L, 20L, 21L, 22L, 23L, 24L, 25L, 26L, 27L, 28L, 
29L, 30L, 31L), RR = c(65.3, 2.6, 3.8, 93.5, 0, 0, 1, 0, 0, 0, 
20.6, 3.4, 0, 0, 0, 0, 0, 0, 29.2, 6.6, 0.5, 1, 18.5, 3.3, 0, 
2, 2.8, 0, 0, 12.8, 15.8, 0, 1.6, 0.3, 25, 3.6, 0, 19.6, 1.8, 
0, 0, 0, 0, 25.9, 0, 6.6, 15.7, 1.3, 30, 0, 0, 0, 0.5, 17, 14.7, 
6.1, 0.5, 67.6, 133.8, 114.6, 0, 0, 0, 25.1, 1.8, 1.4, 0, 0, 
0, 0, 0, 2, 55.7, 36.9, 65, 3.1, 0, 0, 0, 0, 0, 0, 0, 41.4, 13.2, 
29.8, 1, 170.7, 115, 72.9, 25.9, 18, 15.7, 1.3, 4, 0, 0, 0, 8.9, 
0, 1, 0, 0, 6.1, 9.4, 21.6, 11, 13.1, 2.3, 0, 11.3, 4.9, 0.8, 
0, 0.3, 0, 0, 0, 0.3, 6.4, 0, 0, 2.3, 6.9, 21.4, 57.7, 0.3, 35.5, 
2.3, 0, 8.1, 0, 0, 108.2, 7.1, 8.2, 9.4, 14.2, 42.7, 0, 0, 1.3, 
0, 45.2, 203.2, 14.5, 9.2, 22.2, 2.9, 0, 59.5, 19.9, 162.9, 0, 
1, 0, 0, 0, 1.8, 9.9, 54, 8.1, 9.4, 0, 0.8, 0, 0, 0, 0, 0.3, 
0, 0, 7.4, 28.5, 101.2, 87.9, 26.9, 128.9, 78.1, 3.6, 0, 0, 77.5, 
130.1, 262.5, 105.5, 14.9, 0, 0, 0, 2.5, 2.3, 15.5, 28.8, 4.6, 
34.8, 22.1, 60.5, 0.5, 24.1, 0, 0, 37, 128.3, 26, 27.8, 2.3, 
2.3, 195.9, 227.7, 0.8, 2.8, 0, 0, 5.1, 0.3, 0, 0, 0, 0, 9.7, 
125.9, 64.9, 78.3, 3.3, 0, 0, 46.2, 52.6, 3, 24.9, 22.1, 14.2, 
0, 0, 170.6, 64.5, 30.3, 35.8, 204.5, 5.3, 0.5, 3.1, 0, 0, 17.2, 
136.6, 37.9, 0.8, 0, 0.3, 5.1, 2.4, 0, 9.8, 0, 0, 0, 0.2, 0, 
0.3, 0, 0.3, 1.5, 1, 0, 0, 0, 0, 0, 0.8, 1.3, 7.8, 0, 12, 25.2, 
74.3, 26.5, 1.6, 11.2, 0, 0, 0, 0, 5.4, 186.1, 99.7, 46.3, 2.8, 
7.6, 5.6, 22.9, 81, 2, 0, 7.1, 24, 68, 121.8, 10.4, 0, 24.4, 
77.1, 18, 8.9, 0.8, 0, 18.6, 0, 2, 1.3, 0, 18.8, 0, 0, 8.6, 5.6, 
0, 0.5, 61.8, 146.6, 16.5, 0, 18.6, 0, 0, 0, 8.1, 59.4, 8.5, 
1, 54.8, 0, 21.6, 0, 0, 0, 0, 0, 0, 0, 44.2, 0.5, 0, 1.3, 0, 
1.8, 1, 0, 9.7, 93.5, 48.5, 158.3, 78.5, 2.8, 4.1, 13, 98.8, 
55.2, 76.3, 56.3, 0, 6.1, 0, 0, 0, 0, 5.3, 0, 14.5, 0, 0)), row.names = c(274L, 
275L, 276L, 277L, 278L, 279L, 280L, 281L, 282L, 283L, 284L, 285L, 
286L, 287L, 288L, 289L, 290L, 291L, 292L, 293L, 294L, 295L, 296L, 
297L, 298L, 299L, 300L, 301L, 302L, 303L, 304L, 640L, 641L, 642L, 
643L, 644L, 645L, 646L, 647L, 648L, 649L, 650L, 651L, 652L, 653L, 
654L, 655L, 656L, 657L, 658L, 659L, 660L, 661L, 662L, 663L, 664L, 
665L, 666L, 667L, 668L, 669L, 670L, 1005L, 1006L, 1007L, 1008L, 
1009L, 1010L, 1011L, 1012L, 1013L, 1014L, 1015L, 1016L, 1017L, 
1018L, 1019L, 1020L, 1021L, 1022L, 1023L, 1024L, 1025L, 1026L, 
1027L, 1028L, 1029L, 1030L, 1031L, 1032L, 1033L, 1034L, 1035L, 
1370L, 1371L, 1372L, 1373L, 1374L, 1375L, 1376L, 1377L, 1378L, 
1379L, 1380L, 1381L, 1382L, 1383L, 1384L, 1385L, 1386L, 1387L, 
1388L, 1389L, 1390L, 1391L, 1392L, 1393L, 1394L, 1395L, 1396L, 
1397L, 1398L, 1399L, 1400L, 1735L, 1736L, 1737L, 1738L, 1739L, 
1740L, 1741L, 1742L, 1743L, 1744L, 1745L, 1746L, 1747L, 1748L, 
1749L, 1750L, 1751L, 1752L, 1753L, 1754L, 1755L, 1756L, 1757L, 
1758L, 1759L, 1760L, 1761L, 1762L, 1763L, 1764L, 1765L, 2101L, 
2102L, 2103L, 2104L, 2105L, 2106L, 2107L, 2108L, 2109L, 2110L, 
2111L, 2112L, 2113L, 2114L, 2115L, 2116L, 2117L, 2118L, 2119L, 
2120L, 2121L, 2122L, 2123L, 2124L, 2125L, 2126L, 2127L, 2128L, 
2129L, 2130L, 2131L, 2466L, 2467L, 2468L, 2469L, 2470L, 2471L, 
2472L, 2473L, 2474L, 2475L, 2476L, 2477L, 2478L, 2479L, 2480L, 
2481L, 2482L, 2483L, 2484L, 2485L, 2486L, 2487L, 2488L, 2489L, 
2490L, 2491L, 2492L, 2493L, 2494L, 2495L, 2496L, 2831L, 2832L, 
2833L, 2834L, 2835L, 2836L, 2837L, 2838L, 2839L, 2840L, 2841L, 
2842L, 2843L, 2844L, 2845L, 2846L, 2847L, 2848L, 2849L, 2850L, 
2851L, 2852L, 2853L, 2854L, 2855L, 2856L, 2857L, 2858L, 2859L, 
2860L, 2861L, 3196L, 3197L, 3198L, 3199L, 3200L, 3201L, 3202L, 
3203L, 3204L, 3205L, 3206L, 3207L, 3208L, 3209L, 3210L, 3211L, 
3212L, 3213L, 3214L, 3215L, 3216L, 3217L, 3218L, 3219L, 3220L, 
3221L, 3222L, 3223L, 3224L, 3225L, 3226L, 3562L, 3563L, 3564L, 
3565L, 3566L, 3567L, 3568L, 3569L, 3570L, 3571L, 3572L, 3573L, 
3574L, 3575L, 3576L, 3577L, 3578L, 3579L, 3580L, 3581L, 3582L, 
3583L, 3584L, 3585L, 3586L, 3587L, 3588L, 3589L, 3590L, 3591L, 
3592L, 3927L, 3928L, 3929L, 3930L, 3931L, 3932L, 3933L, 3934L, 
3935L, 3936L, 3937L, 3938L, 3939L, 3940L, 3941L, 3942L, 3943L, 
3944L, 3945L, 3946L, 3947L, 3948L, 3949L, 3950L, 3951L, 3952L, 
3953L, 3954L, 3955L, 3956L, 3957L, 4292L, 4293L, 4294L, 4295L, 
4296L, 4297L, 4298L, 4299L, 4300L, 4301L, 4302L, 4303L, 4304L, 
4305L, 4306L, 4307L, 4308L, 4309L, 4310L, 4311L, 4312L, 4313L, 
4314L, 4315L, 4316L, 4317L, 4318L, 4319L, 4320L, 4321L, 4322L
), class = "data.frame")

I am counting how many extreme events last only 1 day, how many last 2 consecutive days, how many last 3 consecutive days, and so on. I then plot a histogram of these durations.

I am able to plot the histogram:

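library(dplyr)

# Flag days whose rainfall exceeds the 95th percentile of their year-month
dat2 <- dat %>%
  group_by(YY, MM) %>%
  mutate(extreme = RR > quantile(RR, 0.95, na.rm = TRUE)) %>%
  ungroup()

# Run-length encode the flags and plot the lengths of the TRUE runs
result <- rle(dat2$extreme)
hist(result$lengths[result$values], breaks = 0:5,
     xlab = "Length of extreme events", main = "")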

What I want:

[1] I want to extract the dates for each run length (1 day, 2 consecutive days, etc.) and save them into separate files. I am not sure how to filter the dates after applying the rle() function. I will be applying this to multiple files with different lengths.

Any help would be appreciated.

Recommended answer

You can use rleid from data.table to identify runs of consecutive extreme days, count the number of days in each run, and split the data into a list of data frames, one per run length.

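library(dplyr)   # data.table must also be installed for data.table::rleid()

data <- dat %>%
  group_by(YY, MM) %>%
  # flag extreme days and give each run of equal flags an id within the month
  mutate(extreme = RR > quantile(RR, 0.95, na.rm = TRUE),
         grp = data.table::rleid(extreme)) %>%
  filter(extreme) %>%
  add_count(grp) %>%   # n = length of the run each row belongs to
  ungroup() %>%
  select(-extreme, -grp) %>%
  group_split(n)       # one tibble per run length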
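For intuition about what the run ids in grp represent, the same information can be recovered from base rle() alone. The following is a minimal sketch on a made-up logical vector (the vector and the names ends, starts, runs, and run_id are illustrative assumptions, not part of the original answer). It shows how cumulative sums of the run lengths give the start and end positions of each run, which is one way to index back into the original rows after rle().

```r
# Toy logical vector standing in for the extreme flags (values are made up).
extreme <- c(TRUE, FALSE, FALSE, TRUE, TRUE, FALSE, TRUE)

r <- rle(extreme)

# Last and first position of every run in the original vector.
ends   <- cumsum(r$lengths)
starts <- ends - r$lengths + 1

# Keep only the runs of TRUE values, with their positions and lengths.
runs <- data.frame(start = starts, end = ends, length = r$lengths)[r$values, ]

# Base-R equivalent of a run id: one integer per element, constant within a run.
run_id <- rep(seq_along(r$lengths), r$lengths)
```

Here runs has one row per extreme event, so rows start:end of the original data give the dates of that event, and run_id plays the same role as the grp column above.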

Events with one extreme day:

data[[1]]
# A tibble: 12 x 5
#      YY    MM    DD    RR     n
#   <int> <int> <int> <dbl> <int>
# 1  1979    10     1  65.3     1
# 2  1979    10     4  93.5     1
# 3  1982    10    13  21.6     1
# 4  1982    10    15  13.1     1
# 5  1983    10    21 203.      1
# 6  1983    10    29 163.      1
# 7  1986    10    19 171.      1
# 8  1986    10    23 204.      1
# 9  1988    10     7 186.      1
#10  1988    10    20 122.      1
#11  1990    10    12 158.      1
#12  1990    10    17  98.8     1

Events with two consecutive extreme days:

data[[2]]
# A tibble: 12 x 5
#      YY    MM    DD    RR     n
#   <int> <int> <int> <dbl> <int>
# 1  1980    10    28 134.      2
# 2  1980    10    29 115.      2
# 3  1981    10    26 171.      2
# 4  1981    10    27 115       2
# 5  1984    10    29 130.      2
# 6  1984    10    30 262.      2
# 7  1985    10    23 196.      2
# 8  1985    10    24 228.      2
# 9  1987    10    29  74.3     2
#10  1987    10    30  26.5     2
#11  1989    10    10  61.8     2
#12  1989    10    11 147.      2
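The question also asks for the dates of each run length to be saved into separate files, which the output above does not cover. One possible way to do that is sketched below in base R on a small made-up data frame. The fixed threshold of 40, the toy values, the output directory, and the file name pattern extreme_len_<n>.csv are all assumptions chosen for illustration.

```r
# Toy data with the same columns as dat (values are made up).
toy <- data.frame(
  YY = 2001L, MM = 10L, DD = 1:8,
  RR = c(0, 50, 0, 60, 70, 0, 0, 80)
)

# Build a proper Date column from the year, month, and day columns.
toy$date <- as.Date(sprintf("%d-%02d-%02d", toy$YY, toy$MM, toy$DD))

# Fixed threshold for illustration only; the answer uses a 95th percentile.
toy$extreme <- toy$RR > 40

# Length of the run each row belongs to.
r <- rle(toy$extreme)
toy$run_len <- rep(r$lengths, r$lengths)

# Keep extreme days only and split them by run length.
ext    <- toy[toy$extreme, c("date", "RR", "run_len")]
by_len <- split(ext, ext$run_len)   # list names are the run lengths

# Write one CSV per run length (hypothetical location and file names).
out_dir <- tempdir()
files   <- character(0)
for (len in names(by_len)) {
  path <- file.path(out_dir, paste0("extreme_len_", len, ".csv"))
  write.csv(by_len[[len]], path, row.names = FALSE)
  files[len] <- path
}
```

Because split() names each list element by its run length, the file names stay correct even when some run length is absent from a file. This sketch computes runs over the whole column; for data spanning several years or months, the run lengths would need to be computed within each group, as the group_by(YY, MM) step in the answer does.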
