Using mutate rowwise over a subset of columns
Question
I am trying to create a new column that will contain a result of calculations done rowwise over a subset of columns of a tibble, and add this new column to the existing tibble. Like so:
df <- tibble(
  ID = c("one", "two", "three"),
  A1 = c(1, 1, 1),
  A2 = c(2, 2, 2),
  A3 = c(3, 3, 3)
)
I effectively want to do a dplyr equivalent of this code from base R:
df$SumA <- rowSums(df[,grepl("^A", colnames(df))])
My problem is that this doesn't work:
df %>%
  select(starts_with("A")) %>%
  mutate(SumA = rowSums(.))
  # some code here
...because I got rid of the "ID" column in order to let mutate run the rowSums over the other (numerical) columns. I have tried to cbind or bind_cols in the pipe after the mutate, but it doesn't work. None of the variants of mutate work, because they work in-place (within each cell of the tibble, and not across the columns, even with rowwise).
This does work, but doesn't strike me as an elegant solution:
df %>%
  mutate(SumA = rowSums(.[, grepl("^A", colnames(df))]))
Is there any tidyverse-based solution that does not require grepl or square brackets but only more standard dplyr verbs and parameters?
My expected output is:
df_out <- tibble(
  ID = c("one", "two", "three"),
  A1 = c(1, 1, 1),
  A2 = c(2, 2, 2),
  A3 = c(3, 3, 3),
  SumA = c(6, 6, 6)
)
Best kJ
Recommended answer
Here's one way to approach row-wise computation in the tidyverse using purrr::pmap. This is best used with functions that actually need to be run row by row; simple addition could probably be done a faster way. Basically we use select to provide the input list to pmap, which lets us use the select helpers such as starts_with or matches if you need regex.
library(tidyverse)

df <- tibble(
  ID = c("one", "two", "three"),
  A1 = c(1, 1, 1),
  A2 = c(2, 2, 2),
  A3 = c(3, 3, 3)
)

df %>%
  mutate(
    SumA = pmap_dbl(
      .l = select(., starts_with("A")),
      .f = function(...) sum(...)
    )
  )
#> # A tibble: 3 x 5
#>   ID       A1    A2    A3  SumA
#>   <chr> <dbl> <dbl> <dbl> <dbl>
#> 1 one       1     2     3     6
#> 2 two       1     2     3     6
#> 3 three     1     2     3     6
Created on 2019-01-30 by the reprex package (v0.2.1)
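For what it's worth, here is a rough sketch of some variations on the same idea. It is not part of the reprex above and only illustrates the two remarks in the answer: that matches can be used when a regex is needed, and that simple addition can probably be done without going row by row. The regex "^A[0-9]+$" and the object names by_pmap, by_rowsums and by_rowwise are just illustrative choices, and the last variant assumes dplyr 1.0.0 or later, where rowwise() and c_across() are available.

```r
library(dplyr)
library(purrr)

df <- tibble(
  ID = c("one", "two", "three"),
  A1 = c(1, 1, 1),
  A2 = c(2, 2, 2),
  A3 = c(3, 3, 3)
)

# Variant 1: same pmap_dbl approach, but picking columns with a regex.
# The pattern "^A[0-9]+$" is an illustrative assumption about the names.
by_pmap <- df %>%
  mutate(
    SumA = pmap_dbl(
      .l = select(., matches("^A[0-9]+$")),
      .f = function(...) sum(...)
    )
  )

# Variant 2: vectorised sum. rowSums() handles all rows in one call,
# and select() inside mutate() keeps the ID column in the result.
by_rowsums <- df %>%
  mutate(SumA = rowSums(select(., starts_with("A"))))

# Variant 3: rowwise() with c_across() (assumes dplyr >= 1.0.0).
# ungroup() drops the row-wise grouping afterwards.
by_rowwise <- df %>%
  rowwise() %>%
  mutate(SumA = sum(c_across(starts_with("A")))) %>%
  ungroup()
```

All three keep the ID column and give SumA = 6 for every row of this example. For a plain sum, the rowSums version is likely the quickest, since it does not call a function once per row; pmap_dbl or rowwise are more useful when the function really has to see one row at a time.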