Assigning values in a sequence depending on previous row in R


Question

I have a data table like this.

  ID1 ID2 member
1   a   x parent
2   a   y  child
3   a   z parent
4   a   p  child
5   a   q  child
6   b   x parent
7   b   z parent
8   b   q  child

And I want to assign a sequence like below.

  ID1 ID2 member sequence
1   a   x parent        1
2   a   y  child        2
3   a   z parent        1
4   a   p  child        2
5   a   q  child        3
6   b   x parent        1
7   b   z parent        1
8   b   q  child        2



i.e.

> dt$sequence = 1, wherever dt$member == "parent"

> dt$sequence = previous_row_value + 1, wherever dt$member=="child"

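I tried using a loop, as shown below.

# Define the helper first so that it exists when it is called below.
sequencing <- function(dt){
  for(i in seq_len(nrow(dt))){
    if(i == 1){
      # the first row of each ID1 group always starts a new sequence
      dt$sequence[i] = 1
      next
    }
    if(dt[i,member] %in% "child"){
      # a child continues the count from the previous row
      dt$sequence[i] = as.numeric(dt$sequence[i-1]) + 1
    }
    else
      # a parent resets the count
      dt$sequence[i] = 1
  }
  return(dt)
}

# Apply the helper to each ID1 group.
dt_sequence <- dt[ ,sequencing(.SD), by="ID1"]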

I ran this code on a data table of 400,000 rows and it took a long time to complete (around 15 minutes). Can anyone suggest a faster way to do it?

Recommended answer

Here's one way with seq:

dt[ , sequence := seq(.N), by = cumsum(member == "parent")]

#    ID1 ID2 member sequence
# 1:   a   x parent        1
# 2:   a   y  child        2
# 3:   a   z parent        1
# 4:   a   p  child        2
# 5:   a   q  child        3
# 6:   b   x parent        1
# 7:   b   z parent        1
# 8:   b   q  child        2

How does it work?

The expression member == "parent" creates a logical vector. The function cumsum calculates its cumulative sum, which increases by one at every parent row. The result is a vector in which a parent and the children that follow it share the same number. This vector is used for grouping. Finally, seq(.N) creates a sequence from 1 up to the number of rows in the group.
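To make the intermediate steps visible, here is a minimal sketch in base R using the member column from the question. The variable names is_parent, grp and seq_in_group are illustrative only, and ave() is used here as a base-R stand-in for the grouped seq(.N) step. It is not the method from the answer itself.

```r
# member column from the example table in the question
member <- c("parent", "child", "parent", "child",
            "child", "parent", "parent", "child")

# Step 1: logical vector, TRUE wherever a new parent starts
is_parent <- member == "parent"

# Step 2: the running count of parents gives a parent and the
# children after it the same group number
grp <- cumsum(is_parent)
print(grp)           # 1 1 2 2 2 3 4 4

# Step 3: number the rows 1..n within each group
# (base-R stand-in for seq(.N) with by = in data.table)
seq_in_group <- ave(seq_along(grp), grp, FUN = seq_along)
print(seq_in_group)  # 1 2 1 2 3 1 1 2
```

Note that grouping on the cumulative sum alone relies on every ID1 block starting with a parent row, as it does in the example data. If a block could start with a child, the grouping would presumably also need to include ID1.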
