Iterating over the rows of a data frame within groups of columns in R
Problem description
I have a data frame df with six fields: A, B, C, D, E and F. I need to create a new column G equal to the previous value of C + the previous value of D + the previous value of G - the current F. This must be computed per group, grouping by columns A and B. For the first row within a group, G should equal E.
Sample df:
A B C D E F
1 2 100 200 300 0
1 2 110 210 310 10
1 2 120 130 300 10
1 1 140 150 80 0
1 1 50 60 80 20
1 1 50 60 80 20
Expected output:
A B C D E F G
1 2 100 200 300 0 300
1 2 110 210 310 10 590
1 2 120 130 300 10 900
1 1 140 150 80 0 80
1 1 50 60 80 20 350
1 1 50 60 80 20 440
Please provide a suitable solution.
Recommended answer
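Here is one option with dplyr: group by 'A' and 'B', take the lag of 'C', 'D' and 'E', add them, subtract 'F', and coalesce with the 'E' column so that the first row of each group (where the lags are NA) falls back to 'E'. Note that this uses the previous 'E' in place of the previous 'G'. The two coincide only on the second row of each group, so rows 3 and 6 of the output below come out as 620 and 170 rather than the requested 900 and 440.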
library(dplyr)
df1 %>%
  group_by(A, B) %>%
  # lag() is NA on the first row of each group, so coalesce() falls back to E there
  mutate(G = coalesce(lag(C) + lag(D) + lag(E) - F, E))
Output:
# A tibble: 6 x 7
# Groups:   A, B [2]
#      A     B     C     D     E     F     G
#  <int> <int> <int> <int> <int> <int> <int>
#1     1     2   100   200   300     0   300
#2     1     2   110   210   310    10   590
#3     1     2   120   130   300    10   620
#4     1     1   140   150    80     0    80
#5     1     1    50    60    80    20   350
#6     1     1    50    60    80    20   170
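The requested G depends on the previous G, which lag() alone cannot express because G does not exist yet when the row is computed. One possible sketch, assuming the recurrence from the question (G[1] = E[1], then G[i] = C[i-1] + D[i-1] + G[i-1] - F[i]), unrolls it into cumulative sums: G[i] = E[1] + the sum of C + D over all earlier rows - the sum of F over rows 2 to i. The name res is illustrative, and the sample data is repeated so the block runs on its own.

```r
library(dplyr)

# Sample data from the question
df1 <- data.frame(
  A = c(1L, 1L, 1L, 1L, 1L, 1L),
  B = c(2L, 2L, 2L, 1L, 1L, 1L),
  C = c(100L, 110L, 120L, 140L, 50L, 50L),
  D = c(200L, 210L, 130L, 150L, 60L, 60L),
  E = c(300L, 310L, 300L, 80L, 80L, 80L),
  F = c(0L, 10L, 10L, 0L, 20L, 20L)
)

res <- df1 %>%
  group_by(A, B) %>%
  mutate(
    G = first(E) +                        # starting value: E of the group's first row
      cumsum(lag(C + D, default = 0L)) -  # running total of C + D from earlier rows
      (cumsum(F) - first(F))              # running total of F, skipping the first row
  ) %>%
  ungroup()

res
```

On the sample data this gives G = 300, 590, 900 for the (1, 2) group and 80, 350, 440 for the (1, 1) group, which matches the expected output in the question. The closed form works only because the recurrence is purely additive; a recurrence that multiplied or clipped the previous G would need an explicit accumulation instead.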
Data
df1 <- structure(list(
  A = c(1L, 1L, 1L, 1L, 1L, 1L),
  B = c(2L, 2L, 2L, 1L, 1L, 1L),
  C = c(100L, 110L, 120L, 140L, 50L, 50L),
  D = c(200L, 210L, 130L, 150L, 60L, 60L),
  E = c(300L, 310L, 300L, 80L, 80L, 80L),
  F = c(0L, 10L, 10L, 0L, 20L, 20L)),
  class = "data.frame", row.names = c(NA, -6L))
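As a cross-check that needs no packages, the recurrence from the question can also be written out literally in base R, one group at a time. This is only a sketch: roll_g is a hypothetical helper name, not part of any library, and the sample data is rebuilt with data.frame() so the block is self-contained.

```r
# Same sample data as above, rebuilt so the block runs on its own
df1 <- data.frame(
  A = c(1L, 1L, 1L, 1L, 1L, 1L),
  B = c(2L, 2L, 2L, 1L, 1L, 1L),
  C = c(100L, 110L, 120L, 140L, 50L, 50L),
  D = c(200L, 210L, 130L, 150L, 60L, 60L),
  E = c(300L, 310L, 300L, 80L, 80L, 80L),
  F = c(0L, 10L, 10L, 0L, 20L, 20L)
)

# Hypothetical helper: rolls the recurrence forward within one group
roll_g <- function(d) {
  g <- numeric(nrow(d))
  g[1] <- d$E[1]                        # first row of the group: G = E
  for (i in seq_len(nrow(d))[-1]) {     # rows 2..n; empty for one-row groups
    g[i] <- d$C[i - 1] + d$D[i - 1] + g[i - 1] - d$F[i]
  }
  g
}

# Split by the (A, B) combination, compute G per group, restore the original row order
grp <- interaction(df1$A, df1$B, drop = TRUE)
df1$G <- unsplit(lapply(split(df1, grp), roll_g), grp)

df1
```

The explicit loop is slower than vectorised code on large data, but it mirrors the wording of the requirement directly and stays valid if the update rule later becomes non-additive.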