Count of records within levels of a factor

Question

I am trying to populate a field in a table (or create a separate vector altogether, whichever is easier) with consecutive numbers from 1 to n, where n is the total number of records that share the same factor level, and then back to 1 for the next level, etc. That is, for a table like this

data<-matrix(c(rep('A',4),rep('B',3),rep('C',4),rep('D',2)),ncol=1)

the result should be a new column (e.g. "sample") as follows:

sample<-c(1,2,3,4,1,2,3,1,2,3,4,1,2)

Solution

You can use the rle function together with lapply:

sample <- unlist(lapply(rle(data[,1])$lengths,FUN=function(x){1:x}))

data <- cbind(data,sample)
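
To see why this works, it may help to look at the intermediate result of rle on the example data. The sketch below rebuilds the matrix from the question; the names `runs` and `counter` are only illustrative and do not appear in the original answer.

```r
# Example data from the question: a one-column character matrix
data <- matrix(c(rep('A', 4), rep('B', 3), rep('C', 4), rep('D', 2)), ncol = 1)

# rle() encodes consecutive runs of equal values
runs <- rle(data[, 1])
runs$values   # "A" "B" "C" "D"
runs$lengths  # 4 3 4 2

# Expanding each run length n into 1..n and flattening gives the counter
counter <- unlist(lapply(runs$lengths, function(n) seq_len(n)))
counter       # 1 2 3 4 1 2 3 1 2 3 4 1 2
```

Because rle only looks at consecutive equal values, this relies on the rows already being grouped by level, as they are in the example.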

Or even better, you can combine rle and sequence in the following one-liner (thanks to @Arun's suggestion):

data <- cbind(data,sequence(rle(data[,1])$lengths))

> data
      [,1] [,2]
 [1,] "A"  "1" 
 [2,] "A"  "2" 
 [3,] "A"  "3" 
 [4,] "A"  "4" 
 [5,] "B"  "1" 
 [6,] "B"  "2" 
 [7,] "B"  "3" 
 [8,] "C"  "1" 
 [9,] "C"  "2" 
[10,] "C"  "3" 
[11,] "C"  "4" 
[12,] "D"  "1" 
[13,] "D"  "2" 
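
Two things are visible in the output above: the counter is stored as character (cbind on a character matrix coerces everything to character), and the approach counts runs, so it assumes rows of the same level are adjacent. If either matters, one possible alternative is base R's ave with a data frame. This is a sketch under those assumptions, not part of the original answer; the vector `group` is hypothetical unsorted input.

```r
# Hypothetical unsorted input: levels are interleaved, so rle() would see many short runs
group <- c('A', 'B', 'A', 'C', 'B', 'A')

# ave() applies FUN within each group and returns the results in the original row order
counter <- ave(seq_along(group), group, FUN = seq_along)
counter  # 1 1 2 1 2 3

# A data frame keeps the counter numeric instead of coercing it to character
df <- data.frame(group = group, sample = counter)
```

For data that is already sorted by level, as in the question, the sequence(rle(...)$lengths) one-liner gives the same numbering.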
