How to cumulatively add values in one vector in R
Question
I have a data set that looks like this:
id name year job job2
1 Jane 1980 Worker 0
1 Jane 1981 Manager 1
1 Jane 1982 Manager 1
1 Jane 1983 Manager 1
1 Jane 1984 Manager 1
1 Jane 1985 Manager 1
1 Jane 1986 Boss 0
1 Jane 1987 Boss 0
2 Bob 1985 Worker 0
2 Bob 1986 Worker 0
2 Bob 1987 Manager 1
2 Bob 1988 Boss 0
2 Bob 1989 Boss 0
2 Bob 1990 Boss 0
2 Bob 1991 Boss 0
2 Bob 1992 Boss 0
Here, job2 is a dummy variable indicating whether a person was a Manager during that year. I want to do two things to this data set. First, among the Boss rows, I only want to keep the row for the year the person became Boss for the first time. Second, I would like to compute the cumulative number of years a person has worked as a Manager and store this in the variable cumu_job2. Thus I would like to have:
id name year job job2 cumu_job2
1 Jane 1980 Worker 0 0
1 Jane 1981 Manager 1 1
1 Jane 1982 Manager 1 2
1 Jane 1983 Manager 1 3
1 Jane 1984 Manager 1 4
1 Jane 1985 Manager 1 5
1 Jane 1986 Boss 0 0
2 Bob 1985 Worker 0 0
2 Bob 1986 Worker 0 0
2 Bob 1987 Manager 1 1
2 Bob 1988 Boss 0 0
I have changed my example and included the Worker position because this better reflects what I want to do with the original data set. The answers in this thread only work when there are only Managers and Bosses in the data set, so any suggestions for making this work would be great. I would be very grateful!
Recommended answer
Here is a succinct dplyr solution for the same problem.
NOTE: Make sure that stringsAsFactors = FALSE when reading in the data.
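As a minimal sketch of that note, the sample data from the question could be loaded into a data frame named dat (the name the pipeline below relies on) as follows. The original post does not show how the data were read, so the use of read.table with a text argument is an assumption made only for illustration. Since R 4.0.0 the default is already stringsAsFactors = FALSE, so the explicit argument mainly matters on older versions.

```r
# Sample data copied from the question. Loading it via read.table(text = ...)
# is an illustrative assumption, not part of the original answer.
dat <- read.table(text = "
id name year job job2
1 Jane 1980 Worker 0
1 Jane 1981 Manager 1
1 Jane 1982 Manager 1
1 Jane 1983 Manager 1
1 Jane 1984 Manager 1
1 Jane 1985 Manager 1
1 Jane 1986 Boss 0
1 Jane 1987 Boss 0
2 Bob 1985 Worker 0
2 Bob 1986 Worker 0
2 Bob 1987 Manager 1
2 Bob 1988 Boss 0
2 Bob 1989 Boss 0
2 Bob 1990 Boss 0
2 Bob 1991 Boss 0
2 Bob 1992 Boss 0
", header = TRUE, stringsAsFactors = FALSE)

# name and job should show up as character columns, not factors
str(dat)
```

With job stored as a character column, the comparison job != "Boss" in the filter step works on plain strings.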
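library(dplyr)

dat %>%
  group_by(name, job) %>%                          # one group per person and job
  filter(job != "Boss" | year == min(year)) %>%    # among Boss rows, keep only the first year
  mutate(cumu_job2 = cumsum(job2))                 # running count of Manager years within each group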
Output:
id name year job job2 cumu_job2
1 1 Jane 1980 Worker 0 0
2 1 Jane 1981 Manager 1 1
3 1 Jane 1982 Manager 1 2
4 1 Jane 1983 Manager 1 3
5 1 Jane 1984 Manager 1 4
6 1 Jane 1985 Manager 1 5
7 1 Jane 1986 Boss 0 0
8 2 Bob 1985 Worker 0 0
9 2 Bob 1986 Worker 0 0
10 2 Bob 1987 Manager 1 1
11 2 Bob 1988 Boss 0 0
Explanation
- Take the dataset
- Group by name and job
- Filter each group based on the condition
- Add the cumu_job2 column.
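For readers without dplyr, the same four steps can be sketched in base R. This swaps in ave() for the grouped operations and is not the method from the answer. The compact data.frame() construction and the names first_year and res are hypothetical helpers introduced here to rebuild the question's sample data and hold intermediate results.

```r
# Compact reconstruction of the question's sample data (illustrative only)
dat <- data.frame(
  id   = rep(c(1, 2), each = 8),
  name = rep(c("Jane", "Bob"), each = 8),
  year = c(1980:1987, 1985:1992),
  job  = c("Worker", rep("Manager", 5), rep("Boss", 2),
           rep("Worker", 2), "Manager", rep("Boss", 5)),
  job2 = c(0, rep(1, 5), 0, 0, 0, 0, 1, rep(0, 5)),
  stringsAsFactors = FALSE
)

# Group by name and job: earliest year in each group, aligned to dat's rows
first_year <- ave(dat$year, dat$name, dat$job, FUN = min)

# Filter: keep every non-Boss row, plus the first Boss row per person
res <- dat[dat$job != "Boss" | dat$year == first_year, ]

# Add cumu_job2: running sum of job2 within each (name, job) group
res$cumu_job2 <- ave(res$job2, res$name, res$job, FUN = cumsum)

print(res)
```

Because the grouping includes job, the running sum restarts for each job, which is why the Boss rows end up with cumu_job2 equal to 0, as in the desired output.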