Count the number of previous items within a by-group in R

Views: 103  Published: 2020/10/10 21:06:11  Tags: r, counting


Problem description

I would like to create a new variable that counts the number of previous items within a by-group. Here is what I mean, taking the esoph dataset as an example.

First I sort the dataset by my by-group variables, esoph$agegp and esoph$alcgp, and then by an additional value column in descending order, -esoph$ncontrols.

This gives me the following dataset:

x <- esoph[order(esoph$agegp, esoph$alcgp, -esoph$ncontrols), ]
x

   agegp     alcgp    tobgp ncases ncontrols
1  25-34 0-39g/day 0-9g/day      0        40
2  25-34 0-39g/day    10-19      0        10
3  25-34 0-39g/day    20-29      0         6
4  25-34 0-39g/day      30+      0         5
5  25-34     40-79 0-9g/day      0        27
6  25-34     40-79    10-19      0         7
8  25-34     40-79      30+      0         7
7  25-34     40-79    20-29      0         4
9  25-34    80-119 0-9g/day      0         2
11 25-34    80-119      30+      0         2
...

Now I would like to create a new variable holding an index that increases by one on every row. Whenever the next by-group starts, the index goes back to 1.

The resulting table would be the following (with the additional index column):

   agegp     alcgp    tobgp ncases ncontrols index
1  25-34 0-39g/day 0-9g/day      0        40     1
2  25-34 0-39g/day    10-19      0        10     2
3  25-34 0-39g/day    20-29      0         6     3
4  25-34 0-39g/day      30+      0         5     4
5  25-34     40-79 0-9g/day      0        27     1
6  25-34     40-79    10-19      0         7     2
8  25-34     40-79      30+      0         7     3
7  25-34     40-79    20-29      0         4     4
9  25-34    80-119 0-9g/day      0         2     1
11 25-34    80-119      30+      0         2     2
...

How can I compute this column?

Thanks!

Recommended answer

One option is a specialized package such as dplyr, which provides row_number(). Group by the variable ('alcgp') and create the new column with mutate(). Here df1 stands for the data frame to be indexed, for example the sorted x from the question.

library(dplyr)
df1 %>%
  group_by(alcgp) %>%
  mutate(indx = row_number())
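
The question sorts by both agegp and alcgp, so the index there appears to restart for each combination of the two, while the snippet above groups by alcgp only. Below is a minimal self-contained sketch on the built-in esoph data, assuming the index should restart per agegp/alcgp pair. The names sorted and index are illustrative and do not come from the original answer.

```r
library(dplyr)

# Sort as in the question, then number the rows within each agegp/alcgp pair.
# Grouping by both columns is an assumption based on the question's sort order.
sorted <- esoph %>%
  arrange(agegp, alcgp, desc(ncontrols)) %>%
  group_by(agegp, alcgp) %>%
  mutate(index = row_number()) %>%
  ungroup()

head(sorted, 10)
```

Calling ungroup() at the end returns a plain tibble, so later operations are not silently applied per group.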

Or use ave from base R. We group by 'alcgp' and specify seq_along as FUN. I passed seq_along(alcgp) as the first argument rather than alcgp itself, because the call may not work if that variable is of class factor.
df1$indx <- with(df1, ave(seq_along(alcgp), alcgp, FUN = seq_along))
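
The ave approach can be tried end to end on the built-in esoph data. This is a sketch rather than the answerer's exact code: it assumes the index should restart for each agegp/alcgp combination, relying on the fact that ave() accepts several grouping variables. The names x and index follow the question.

```r
# Sort as in the question.
x <- esoph[order(esoph$agegp, esoph$alcgp, -esoph$ncontrols), ]

# ave() applies FUN within each combination of the grouping variables;
# seq_along() then numbers the rows of each group 1, 2, 3, ...
# The first argument is an integer vector, which avoids the factor issue.
x$index <- ave(seq_len(nrow(x)), x$agegp, x$alcgp, FUN = seq_along)

head(x, 10)
```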

Another convenient function is getanID from splitstackshape.
library(splitstackshape)
getanID(df1, 'alcgp')

