Convert year-month string to three month bins with gaps - how to assign contiguous ascending values?
Question
I have used the code below to "bin" a year.month string into three-month bins. The problem is that I want each bin to have a number that corresponds to where the bin occurs chronologically (i.e. first bin = 1, second bin = 2, etc.). Right now, the first bin is assigned the number 4, and I am not sure why. Any help would be highly appreciated!
> head(Master.feed.parts.gn$yr.mo, n=20)
[1] "2007.10" "2007.10" "2007.10" "2007.11" "2007.11" "2007.11" "2007.11" "2007.12" "2008.01"
[10] "2008.01" "2008.01" "2008.01" "2008.01" "2008.02" "2008.03" "2008.03" "2008.03" "2008.04"
[19] "2008.04" "2008.04"
>
> yearmonth_to_integer <- function(xx) {
+ yy_mm <- as.integer(unlist(strsplit(xx, '.', fixed=T)))
+ return( (yy_mm[1] - 2006) + (yy_mm[2] %/% 3) )
+ }
>
> Cluster.GN <- sapply(Master.feed.parts.gn$yr.mo, yearmonth_to_integer)
> Cluster.GN
2007.10 2007.10 2007.10 2007.11 2007.11 2007.11 2007.11 2007.12 2008.01 2008.01 2008.01
4 4 4 4 4 4 4 5 2 2 2
2008.01 2008.01 2008.02 2008.03 2008.03 2008.03 2008.04 2008.04 2008.04 2008.04 2008.05
2 2 2 3 3 3 3 3 3 3 3
2008.05 2008.05 2008.06 2008.10 2008.11 2008.11 2008.12 <NA> 2009.05 2009.05 2009.05
3 3 4 5 5 5 6 NA 4 4 4
2009.06 2009.07 2009.07 2009.07 2009.09 2009.10 2009.11 2010.01 2010.02 2010.02 2010.02
5 5 5 5 6 6 6 4 4 4 4
Update:
I was asked to provide sample input (yr.mo) and the desired output (Cluster.GN). I have a year-month string with a varying number of observations for each month, and some months have no observations at all. What I want to do is bin each run of three consecutive months that have data, assigning each three-month "bin" a number as shown below.
yr.mo Cluster.GN
1 2007.10 1
2 2007.10 1
3 2007.10 1
4 2007.10 1
5 2007.10 1
6 2007.11 1
7 2007.11 1
8 2007.11 1
9 2007.11 1
10 2007.12 1
11 2007.12 1
12 2007.12 1
13 2007.12 1
14 2008.10 2
15 2008.10 2
16 2008.10 2
17 2008.10 2
18 2008.12 2
19 2008.12 2
20 2008.12 2
21 2008.12 2
22 2008.12 2
Recommended answer
1) Convert the strings to zoo's "yearqtr" class and then to integers:
s <- c("2007.10", "2007.10", "2007.10", "2007.11", "2007.11", "2007.11",
"2007.11", "2007.12", "2008.01", "2008.01", "2008.01", "2008.01",
"2008.01", "2008.02", "2008.03", "2008.03", "2008.03", "2008.04",
"2008.04", "2008.04")
library(zoo)
yq <- as.yearqtr(s, "%Y.%m")
as.numeric(factor(yq))
## [1] 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 3 3 3
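The last line could alternatively be: 4 * (yq - yq[1]) + 1. Unlike the factor version, this counts calendar quarters elapsed since the first one, so quarters with no data leave gaps in the numbering.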
Note that in the question's output 2007.12 is placed in a different bin than 2007.10 and 2007.11; however, all three fall in the same calendar quarter, so we assume you did not intend this.
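As for why the question's function returns 4 for "2007.10": it computes (2007 - 2006) + (10 %/% 3) = 1 + 3 = 4. The year offset is not scaled by the four quarters in a year, and month %/% 3 groups months as {1, 2}, {3, 4, 5}, ... rather than {1, 2, 3}, {4, 5, 6}, .... A minimal base-R sketch of a corrected version follows, without zoo; the helper name quarter_bins is hypothetical and the code assumes zero-padded "YYYY.MM" strings.

```r
# What the original formula yields for "2007.10"
(2007L - 2006L) + (10L %/% 3L)   # 1 + 3 = 4

# Hypothetical helper: contiguous bin numbers based on calendar quarters
quarter_bins <- function(s) {
  parts <- strsplit(s, ".", fixed = TRUE)
  year  <- as.integer(vapply(parts, `[`, character(1), 1))
  month <- as.integer(vapply(parts, `[`, character(1), 2))
  # Absolute quarter index: months 1-3 -> 0, 4-6 -> 1, 7-9 -> 2, 10-12 -> 3
  q <- year * 4L + (month - 1L) %/% 3L
  # Rank the distinct quarters so the numbering has no gaps; NA stays NA
  match(q, sort(unique(q)))
}

quarter_bins(c("2007.10", "2007.12", "2008.01", "2008.03", "2008.04", "2008.10"))
## [1] 1 1 2 2 3 4
```

Here 2008.10 gets bin 4 even though two calendar quarters are empty before it, because match() ranks only the quarters that actually occur.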
2) Alternatively, group the distinct year-month values in threes, in sorted order:
f <- factor(s)
nlev <- nlevels(f)
levels(f) <- gl(nlev, 3, nlev)
f
## [1] 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 3 3 3
## Levels: 1 2 3
If there are missing months then this will give a different answer than (1), so it all depends on what you are looking for.
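To make the difference concrete, here is a small sketch on made-up input where 2008.02 is missing. The vector s2 and the names by_observed and by_quarter are illustrative only, and the calendar-quarter variant is written in base R rather than with zoo.

```r
# Illustrative input: 2008.02 has no observations
s2 <- c("2008.01", "2008.03", "2008.04", "2008.10")

# Approach (2): every three distinct observed months share a bin.
# Relies on zero-padded strings so that sorted levels are chronological.
f <- factor(s2)
nlev <- nlevels(f)
levels(f) <- gl(nlev, 3, nlev)
by_observed <- as.integer(as.character(f))

# Approach (1) in base R: bins follow calendar quarters
year  <- as.integer(substr(s2, 1, 4))
month <- as.integer(substr(s2, 6, 7))
q <- year * 4L + (month - 1L) %/% 3L
by_quarter <- match(q, sort(unique(q)))

by_observed
## [1] 1 1 1 2
by_quarter
## [1] 1 1 2 3
```

With approach (2), 2008.04 joins the first bin because it is the third month that has data; with approach (1) it starts a new bin because it opens the second calendar quarter.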