Adding a repeated index for factors in a data frame
Question
I have a data frame in which I want to add an index e.g. 1...n for each factor in my data frame. Here is an example with some dummy data.
factor
a
a
a
a
a
b
b
b
b
b
c
c
c
c
I would like to add an additional column that holds an index from 1 to n for each factor level separately. The resulting data frame would look like:
factor index
a 1
a 2
a 3
a 4
a 5
b 1
b 2
b 3
b 4
b 5
c 1
c 2
c 3
c 4
Can anyone explain how to do so? Thanks in advance.
Recommended answer
One way is:
unlist(lapply(split(x, x), seq_along))
where x is your factor as a vector. For example:
R> x <- factor(rep(letters[1:3], times = c(5,5,4))) ## your data
R> data.frame(factor = x, index = unlist(lapply(split(x, x), seq_along),
+ use.names = FALSE))
factor index
1 a 1
2 a 2
3 a 3
4 a 4
5 a 5
6 b 1
7 b 2
8 b 3
9 b 4
10 b 5
11 c 1
12 c 2
13 c 3
14 c 4
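Note that the split() approach returns the indices grouped by level, so they line up with the data only when the rows are already sorted by the factor. Below is a minimal sketch of an order-preserving alternative using base R's ave(). It is not part of the original answer, and the unsorted vector y is made-up illustration data.

```r
# Hypothetical unsorted data (not from the original question)
y <- factor(c("b", "a", "b", "c", "a", "b"))

# ave() applies FUN within each level of y and writes the results
# back into the original row positions, so no sorting is needed
idx <- ave(seq_along(y), y, FUN = seq_along)

df <- data.frame(factor = y, index = idx)
print(df)
```

Here each row receives its running count within its own level, regardless of where that row sits in the data frame.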
Another way, on a similar theme, is to use table() and seq_len():
unlist(sapply(table(x), seq_len), use.names = FALSE)
Yet another way is to use run-length encoding via rle():
R> rle(as.character(x))$lengths
[1] 5 5 4
which we can plug into the sapply() code instead of the table() call:
R> unlist(sapply(rle(as.character(x))$lengths, seq_len), use.names = FALSE)
[1] 1 2 3 4 5 1 2 3 4 5 1 2 3 4
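One caveat with sapply() is that it may simplify its result to a matrix when every run has the same length. A possible way around this, sketched here on the assumption that base R's sequence() is acceptable (it is not used in the original answer), is to pass the run lengths to sequence(), which concatenates seq_len() of each length directly:

```r
# Same dummy data as in the answer above
x <- factor(rep(letters[1:3], times = c(5, 5, 4)))

# Lengths of consecutive runs of identical values
run_lengths <- rle(as.character(x))$lengths

# sequence() builds 1..n for each run length and concatenates them
idx <- sequence(run_lengths)
print(idx)
```

Because rle() counts consecutive runs rather than levels, this variant restarts the index whenever the value changes, so it matches the per-level index only when the rows are grouped by the factor.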