Select rows around a marker in R

Problem description

I'm trying to select the 100 rows before and after each marker in a relatively large data frame. The markers are sparse, and I haven't been able to figure it out or find an existing solution. It doesn't seem like it should be that hard, so I'm probably missing something obvious.

Here's a very small, simple example of what the data looks like:

timestamp talking_yn transition_yn
0.01      n          n
0.02      n          n
0.03      n          n
0.04      n          n
0.05      n          n
0.06      n          n
0.07      n          n
0.08      n          n
0.09      n          n
0.10      n          n
0.11      y          y
0.12      y          n
0.13      y          n
0.14      y          n
0.15      y          n
0.16      y          n
0.17      y          n
0.18      y          n

I've tried different methods from a variety of answers (lag from zoo or dplyr), but they all focus on selecting a single row or on subsetting only the rows that carry the marker. For the dummy example data, how would I select the 5 rows before and after the transition_yn == 'y' row?

Recommended answer

I have a quick function for that:

#' Lead/Lag a logical
#'
#' @param lgl logical vector
#' @param bef integer, number of elements to lead by
#' @param aft integer, number of elements to lag by
#' @return logical, same length as 'lgl'
#' @export
leadlag <- function(lgl, bef = 1, aft = 1) {
  n <- length(lgl)
  bef <- min(n, max(0, bef))
  aft <- min(n, max(0, aft))
  befx <- if (bef > 0) sapply(seq_len(bef), function(b) c(tail(lgl, n = -b), rep(FALSE, b)))
  aftx <- if (aft > 0) sapply(seq_len(aft), function(a) c(rep(FALSE, a), head(lgl, n = -a)))
  rowSums(cbind(befx, lgl, aftx), na.rm = TRUE) > 0
}
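To make the shifting idea concrete, here is a small sketch on a toy vector (the names lgl, lead1, lag1 and combined are only for illustration). Leading by one drops the first element and pads the end with FALSE, so each TRUE moves one position earlier. Lagging does the opposite. OR-ing the shifted copies with the original marks the neighbours of every TRUE.

```r
# Toy marker vector: a single TRUE at position 3
lgl <- c(FALSE, FALSE, TRUE, FALSE, FALSE)

# Lead by 1: drop the first element, pad the end, so TRUE moves to position 2
lead1 <- c(tail(lgl, n = -1), FALSE)

# Lag by 1: pad the front, drop the last element, so TRUE moves to position 4
lag1 <- c(FALSE, head(lgl, n = -1))

# A position is kept if it is TRUE in the original or in any shifted copy
combined <- lead1 | lgl | lag1
which(combined)
# [1] 2 3 4
```

leadlag does the same thing for every shift from 1 to bef and from 1 to aft, then collapses the columns with rowSums.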

dat[leadlag(dat$transition_yn == 'y', 2, 4),]
#    timestamp talking_yn transition_yn
# 9       0.09          n             n
# 10      0.10          n             n
# 11      0.11          y             y
# 12      0.12          y             n
# 13      0.13          y             n
# 14      0.14          y             n
# 15      0.15          y             n
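For the 5-before/5-after case in the question, or the 100-row window on the real data, an index-based variant may be lighter, because it only touches positions near the markers instead of building one shifted copy of the vector per offset. The following is a minimal sketch of that alternative, not the function above. rows_around is a hypothetical helper name, it assumes bef and aft are non-negative, and the data frame is rebuilt inline so the block runs on its own.

```r
# Hypothetical helper: logical vector marking every position within
# 'bef' elements before or 'aft' elements after a TRUE in 'lgl'.
# Assumes bef >= 0 and aft >= 0.
rows_around <- function(lgl, bef = 1, aft = 1) {
  n <- length(lgl)
  hits <- which(lgl)                    # marker positions; NA is not a marker
  # add every offset from -bef to +aft to every marker position
  idx <- as.vector(outer(hits, seq(-bef, aft), "+"))
  idx <- idx[idx >= 1 & idx <= n]       # drop positions past either end
  keep <- logical(n)
  keep[idx] <- TRUE
  keep
}

# Same shape as the example data: the marker sits in row 11 of 18
dat <- data.frame(
  timestamp     = (1:18) / 100,
  talking_yn    = rep(c("n", "y"), times = c(10, 8)),
  transition_yn = c(rep("n", 10), "y", rep("n", 7)),
  stringsAsFactors = FALSE
)

# 5 rows before and after the marker: rows 6 to 16
sel <- dat[rows_around(dat$transition_yn == "y", bef = 5, aft = 5), ]
nrow(sel)
# [1] 11
```

Overlapping windows from nearby markers are merged automatically, since each row is flagged at most once.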


Data

dat <- read.table(header=TRUE, stringsAsFactors=FALSE, text="
timestamp talking_yn transition_yn
0.01      n          n
0.02      n          n
0.03      n          n
0.04      n          n
0.05      n          n
0.06      n          n
0.07      n          n
0.08      n          n
0.09      n          n
0.10      n          n
0.11      y          y
0.12      y          n
0.13      y          n
0.14      y          n
0.15      y          n
0.16      y          n
0.17      y          n
0.18      y          n")
