Grouping rows on the basis of row differences in R

Views: 86 | Published: 2020/11/21 1:09:25 | Tags: r, grouping

Problem description

I have a set of animal locations with different sampling intervals. What I want to do is group sequences of rows where the sampling interval matches a certain criterion (e.g. is below a certain value). Let me illustrate with some dummy data:

start <- Sys.time()
timediff <- c(rep(5,3),20,rep(5,2))
timediff <- cumsum(timediff)

# Set up a dataframe with a couple of time values
df <- data.frame(TimeDate = start + timediff)

# Calculate the time differences between the rows
df$TimeDiff <- c(as.integer(tail(df$TimeDate,-1) - head(df$TimeDate,-1)),NA)

# Define a criterion for forming groups
df$TimeDiffSmall <- df$TimeDiff <= 5

             TimeDate TimeDiff TimeDiffSmall
1 2016-03-15 23:11:49        5          TRUE
2 2016-03-15 23:11:54        5          TRUE
3 2016-03-15 23:11:59       20         FALSE
4 2016-03-15 23:12:19        5          TRUE
5 2016-03-15 23:12:24        5          TRUE
6 2016-03-15 23:12:29       NA            NA

In this dummy data, rows 1:3 belong to one group, since the time difference between them is <= 5 seconds. Rows 4:6 belong to the second group, but hypothetically there could be a number of rows in between the two groups that don't belong to any group (rows where TimeDiffSmall equals FALSE).
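To make the hypothetical case concrete, here is a small sketch of a variation on the dummy data in which two rows sit between the groups without belonging to either. The fixed start time and the name `df2` are assumptions introduced only for this illustration.

```r
# Hypothetical variation of the dummy data (fixed start time for reproducibility)
start <- as.POSIXct("2016-03-15 23:11:44", tz = "UTC")
timediff <- cumsum(c(rep(5, 3), rep(20, 3), rep(5, 2)))

df2 <- data.frame(TimeDate = start + timediff)

# Seconds to the next row; the last row has no successor, hence NA
df2$TimeDiff <- c(as.numeric(diff(df2$TimeDate), units = "secs"), NA)
df2$TimeDiffSmall <- df2$TimeDiff <= 5

df2$TimeDiffSmall
# expected: TRUE TRUE FALSE FALSE FALSE TRUE TRUE NA
```

Here rows 1:3 would form the first group and rows 6:8 the second, while rows 4 and 5 are more than 5 seconds away from both neighbours and would belong to no group.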

Combining information from multiple SO answers (e.g. for part 1), I've created a function that solves this problem.

number.groups <- function(input){
  # Treat NA (the last row has no following difference) as FALSE
  input[is.na(input)] <- FALSE
  # part 1: numbering successive TRUE values
  x.gr <- ifelse(x <- input == TRUE, cumsum(c(head(x, 1), tail(x, -1) - head(x, -1) == 1)), NA)
  # part 2: including last value into group
  # (the row right after each run of TRUE values takes that run's number)
  items <- which(!is.na(x.gr))
  items.plus <- c(1,items+1)
  sel <- !(items.plus %in% items)
  sel.idx <- items.plus[sel]
  x.gr[sel.idx] <- x.gr[sel.idx-1]
  return(x.gr)
}

# Apply the function to create groups
df$Group <- number.groups(df$TimeDiffSmall)

             TimeDate TimeDiff TimeDiffSmall Group
1 2016-03-15 23:11:49        5          TRUE     1
2 2016-03-15 23:11:54        5          TRUE     1
3 2016-03-15 23:11:59       20         FALSE     1
4 2016-03-15 23:12:19        5          TRUE     2
5 2016-03-15 23:12:24        5          TRUE     2
6 2016-03-15 23:12:29       NA            NA     2

This function does solve my problem. The thing is, it seems like a crazy, rookie way to go about it. Is there a function that could solve my problem more professionally?
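One possible direction, offered only as a sketch and not as a recommended or accepted answer, is to build the group numbers from two shifted logical vectors and `cumsum()`. The function name `group_runs` is hypothetical, and the sketch assumes the input is a logical vector like TimeDiffSmall, where element i says whether the gap from row i to row i + 1 is small.

```r
# Hypothetical helper: number runs of rows linked by small gaps.
# `small[i]` is TRUE when the gap from row i to row i + 1 is small.
group_runs <- function(small) {
  # The last row has no following gap; treat NA as "not small"
  small[is.na(small)] <- FALSE
  # Was the gap from the previous row small? (FALSE for the first row)
  prev_small <- c(FALSE, small)[seq_along(small)]
  # A row is in a run if it is linked to its next or its previous row
  in_run <- small | prev_small
  # A run starts at a row that is in a run but not linked to the previous row
  starts <- in_run & !prev_small
  # Count run starts; rows outside any run get NA
  ifelse(in_run, cumsum(starts), NA_integer_)
}

group_runs(c(TRUE, TRUE, FALSE, TRUE, TRUE, NA))
# expected: 1 1 1 2 2 2

group_runs(c(TRUE, TRUE, FALSE, FALSE, FALSE, TRUE, TRUE, NA))
# expected: 1 1 1 NA NA 2 2 2
```

The first call reproduces the Group column shown above; the second shows that isolated rows between two groups are left without a group number.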
