Using mutate rowwise over a subset of columns

Views: 73 | Posted: 2020/10/26 3:13:35 | Tags: r, dplyr

Question

I am trying to create a new column that will contain a result of calculations done rowwise over a subset of columns of a tibble, and add this new column to the existing tibble. Like so:

library(dplyr)  # provides tibble(), mutate(), select() and the %>% pipe

df <- tibble(
  ID = c("one", "two", "three"),
  A1 = c(1, 1, 1),
  A2 = c(2, 2, 2),
  A3 = c(3, 3, 3)
)

I effectively want to do a dplyr equivalent of this code from base R:

df$SumA <- rowSums(df[,grepl("^A", colnames(df))])

My problem is that this doesn't work:

df %>%
  select(starts_with("A")) %>%
  mutate(SumA = rowSums(.))
  # some code here

...because I got rid of the "ID" column in order to let mutate run the rowSums over the other (numerical) columns. I have tried to cbind or bind_cols in the pipe after the mutate, but it doesn't work. None of the variants of mutate work, because they work in-place (within each cell of the tibble, and not across the columns, even with rowwise).

This does work, but doesn't strike me as an elegant solution:

df %>%
  mutate(SumA = rowSums(.[, grepl("^A", colnames(df))]))

Is there any tidyverse-based solution that does not require grepl or square brackets but only more standard dplyr verbs and parameters?

My expected output is:

df_out <- tibble(
ID = c("one", "two", "three"),
A1 = c(1, 1, 1),
A2 = c(2, 2, 2),
A3 = c(3, 3, 3),
SumA = c(6, 6, 6)
)

Best,
kJ

Recommended answer

Here's one way to approach row-wise computation in the tidyverse using purrr::pmap. This is best used with functions that actually need to be run row by row; simple addition could probably be done a faster way. Basically we use select to provide the input list to pmap, which lets us use the select helpers such as starts_with or matches if you need regex.

library(tidyverse)
df <- tibble(
  ID = c("one", "two", "three"),
  A1 = c(1, 1, 1),
  A2 = c(2, 2, 2),
  A3 = c(3, 3, 3)
)

df %>%
  mutate(
    SumA = pmap_dbl(
      .l = select(., starts_with("A")),
      .f = function(...) sum(...)
    )
  )
#> # A tibble: 3 x 5
#>   ID       A1    A2    A3  SumA
#>   <chr> <dbl> <dbl> <dbl> <dbl>
#> 1 one       1     2     3     6
#> 2 two       1     2     3     6
#> 3 three     1     2     3     6

Created on 2019-01-30 by the reprex package (v0.2.1)
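
For the simple sum asked about here, a vectorized variant is probably the faster way hinted at above. The sketch below is not part of the original answer. It reuses the same select(., starts_with("A")) trick, but hands the selected columns to rowSums() instead of pmap_dbl(). The second variant assumes dplyr 1.0.0 or later, where rowwise() can be combined with c_across(); the names out_vec and out_row are only illustrative.

```r
library(dplyr)

df <- tibble(
  ID = c("one", "two", "three"),
  A1 = c(1, 1, 1),
  A2 = c(2, 2, 2),
  A3 = c(3, 3, 3)
)

# Vectorized: select() is applied only to the data passed to rowSums(),
# so the ID column stays in the result.
out_vec <- df %>%
  mutate(SumA = rowSums(select(., starts_with("A"))))

# Row by row (assumes dplyr >= 1.0.0): useful when the function is not
# vectorized across columns; ungroup() drops the rowwise grouping again.
out_row <- df %>%
  rowwise() %>%
  mutate(SumA = sum(c_across(starts_with("A")))) %>%
  ungroup()
```

Both results keep ID, A1, A2 and A3 and add SumA equal to 6 in every row, which matches the expected output in the question. The rowSums() form avoids a per-row function call, so it should scale better on wide or long tables, although that has not been benchmarked here.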
