Select specific rows based on previous row value (in the same column)


Question

I've been trying to figure out a way to script this in R, but I just can't get it. I have a dataset like this:

Trial  Type Correct Latency     
1       55  0       0
3       30  1       766
4       10  1       344
6       40  1       716
7       10  1       326
9       30  1       550
10      10  1       350
11      64  0       0
13      30  1       683
14      10  1       270
16      30  1       666
17      10  1       297
19      40  1       616
20      10  1       315
21      64  0       0
23      40  1       850
24      10  1       322
26      30  1       566
27      20  0       766
28      40  1       500
29      20  1       230

The full dataset goes on for much longer (around 1000 rows).

From this one dataset, I would like to create 4 separate data.frames/tables that I can export as well as use for my own calculations.

I would like one data.frame (4 in total) for each of these bullet points:

  • Type 10 rows preceded by a type 30 row
  • Type 10 rows preceded by a type 40 row
  • Type 20 rows preceded by a type 30 row
  • Type 20 rows preceded by a type 40 row

I would like all the columns of the relevant rows to be placed into these new tables, but only for the type 10 or type 20 rows themselves (not the rows that precede them).

For example, based on the sample data, the first table (type 10 preceded by type 30) would look like this:

Trial  Type Correct Latency     
  4       10     1       344
  10      10     1       350
  14      10     1       270
  17      10     1       297

Second table (type 10 preceded by type 40):

Trial    Type  Correct  Latency     
  7       10     1       326
  20      10     1       315
  24      10     1       322

Third table (type 20 preceded by type 30):

Trial    Type  Correct  Latency     
  27      20     0       766

Fourth table (type 20 preceded by type 40):

Trial    Type  Correct   Latency        
 29      20      1        230

I can subset just fine to get one table of only type 10 rows and another of only type 20 rows, but I can't figure out how to create different tables for type 10 and type 20 rows based on the previous row's type value. Another issue is that the "Trial" column is not consecutive (it skips numbers).

Any help would be greatly appreciated. Thank you.

Also, is there a way to include the previous row as well, so that the output for the fourth table would look something like this:

Fourth table (type 20 preceded by type 40), with the previous row included:

Trial    Type  Correct   Latency        
 28      40      1        500
 29      20      1        230

Recommended answer

For the fourth example, you could use which() in combination with lag() from dplyr to obtain the indices of the rows that meet your criteria. You can then use these indices to subset the data.frame. Because lag() works on row position rather than on the Trial value, the skipped trial numbers do not matter.

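# Indices of type 20 rows whose previous row is type 40
# (row 1 has no previous row: lag() yields NA there and which() drops it)
ind2 <- which(df$Type == 20 & dplyr::lag(df$Type) == 40)
# Indices of the rows immediately before those matches
ind1 <- ind2 - 1

# Subset rows; the trailing comma selects rows for a data.frame as well as a data.table
# (output below is shown as printed for a data.table)
> df[sort(c(ind1, ind2)), ]
   Trial Type Correct Latency
1:    28   40       1     500
2:    29   20       1     230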
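
The same idea can be extended to all four tables. The following is only a minimal sketch, not part of the original answer: it assumes the data is a plain data.frame named df with the columns shown in the question, it swaps dplyr::lag() for an equivalent base R shift so that no package is needed, and the helper name rows_after and its keep_prev argument are hypothetical names chosen for illustration.

```r
# Sample data copied from the question (the 21 rows shown there)
df <- data.frame(
  Trial   = c(1, 3, 4, 6, 7, 9, 10, 11, 13, 14, 16, 17, 19, 20, 21, 23, 24, 26, 27, 28, 29),
  Type    = c(55, 30, 10, 40, 10, 30, 10, 64, 30, 10, 30, 10, 40, 10, 64, 40, 10, 30, 20, 40, 20),
  Correct = c(0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1),
  Latency = c(0, 766, 344, 716, 326, 550, 350, 0, 683, 270, 666, 297, 616, 315, 0, 850, 322, 566, 766, 500, 230)
)

# Hypothetical helper: rows of type `cur` whose preceding row has type `prev`.
# "Preceding" means the previous row by position, so gaps in Trial are irrelevant.
rows_after <- function(data, cur, prev, keep_prev = FALSE) {
  # Shift Type down by one position; the first row has no predecessor (NA)
  prev_type <- c(NA, head(data$Type, -1))
  # which() silently drops the NA comparison for the first row
  hit <- which(data$Type == cur & prev_type == prev)
  # Optionally keep the preceding rows too, in their original order
  if (keep_prev) hit <- unique(sort(c(hit - 1, hit)))
  data[hit, ]
}

t10_30 <- rows_after(df, cur = 10, prev = 30)
t10_40 <- rows_after(df, cur = 10, prev = 40)
t20_30 <- rows_after(df, cur = 20, prev = 30)
t20_40 <- rows_after(df, cur = 20, prev = 40)

# Fourth table with the preceding type 40 row included
t20_40_with_prev <- rows_after(df, cur = 20, prev = 40, keep_prev = TRUE)
```

Each result is an ordinary data.frame, so it can be exported with write.csv() or used directly in further calculations. On the sample data, t10_30 should contain trials 4, 10, 14 and 17, matching the first table in the question.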
