How do I sweep specific columns with dplyr?

Views: 120 · Posted: 2017/7/13 20:32:02 · Tags: r, dplyr


Question

An incredibly common operation for my type of data is applying a normalisation factor to all columns. This can be done efficiently using sweep or scale:

normalized = scale(data, center = FALSE, scale = factors)
# or
normalized = sweep(data, 2, factors, `/`)

where

data = structure(list(A = c(3L, 174L, 6L, 1377L, 537L, 173L),
    B = c(1L, 128L, 2L, 1019L, 424L, 139L),
    C = c(3L, 66L, 2L, 250L, 129L, 40L),
    D = c(4L, 57L, 4L, 251L, 124L, 38L)),
    .Names = c("A", "B", "C", "D"),
    class = c("tbl_df", "data.frame"), row.names = c(NA, -6L))

factors = c(A = 1, B = 1.2, C = 0.8, D = 0.75)
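
As a quick check that the two calls above agree, here is a minimal base R sketch. It rebuilds the same data as a plain data.frame (an assumption made only to keep the example self-contained) and compares the results; the names by_sweep and by_scale are illustrative.

```r
# Same values as above, rebuilt as a plain data.frame for a self-contained run.
data <- data.frame(
  A = c(3L, 174L, 6L, 1377L, 537L, 173L),
  B = c(1L, 128L, 2L, 1019L, 424L, 139L),
  C = c(3L, 66L, 2L, 250L, 129L, 40L),
  D = c(4L, 57L, 4L, 251L, 124L, 38L)
)
factors <- c(A = 1, B = 1.2, C = 0.8, D = 0.75)

# MARGIN = 2 pairs each element of factors with one column.
by_sweep <- sweep(data, 2, factors, `/`)

# scale() returns a matrix and divides each column by its entry in `scale`.
by_scale <- scale(data, center = FALSE, scale = factors)

# Compare values only; the two results carry different attributes.
same <- isTRUE(all.equal(as.matrix(by_sweep), by_scale, check.attributes = FALSE))
print(same)
```

Note that sweep keeps the data frame while scale returns a matrix with extra attributes, which may matter for whatever comes next in a pipeline.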

However, how do I do this with dplyr when my data has additional columns in front? I can do it in separate statements, but I'd like to do it in one pipeline. This is my data:

data = structure(list(ID = c(1, 2, 3, 4, 5, 6),
    Type = c("X", "X", "X", "Y", "Y", "Y"),
    A = c(3L, 174L, 6L, 1377L, 537L, 173L),
    B = c(1L, 128L, 2L, 1019L, 424L, 139L),
    C = c(3L, 66L, 2L, 250L, 129L, 40L),
    D = c(4L, 57L, 4L, 251L, 124L, 38L)),
    .Names = c("ID", "Type", "A", "B", "C", "D"),
    class = c("tbl_df", "data.frame"), row.names = c(NA, -6L))

And I'd like to mutate the data columns without touching the first two columns. Normally I can do this with mutate_each; however, I cannot pass my normalisation factors to that function:

data %>% mutate_each(funs(. / factors), A:D)

This, unsurprisingly, assumes that I want to divide each column by the whole factors vector, rather than each column by its matching factor.
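
To make the mismatch concrete, here is a small base R sketch of the arithmetic involved, using column B on its own. It illustrates vector recycling and is not a trace of what dplyr does internally; the names col_B, recycled and intended are illustrative.

```r
factors <- c(A = 1, B = 1.2, C = 0.8, D = 0.75)
col_B <- c(1L, 128L, 2L, 1019L, 424L, 139L)

# Dividing by the whole vector recycles the factors down the rows:
# row 1 gets A's factor, row 2 gets B's, and row 5 wraps around to A's again.
# R would also warn that length 6 is not a multiple of length 4.
recycled <- suppressWarnings(col_B / factors)

# What was intended: every row of B divided by B's factor only.
intended <- col_B / factors[["B"]]

print(rbind(recycled, intended))
```

Only rows 2 and 6 agree between the two results, because those are the rows where the recycled factor happens to be B's.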

Recommended answer

Given akrun's encouragement, let me post what I did as an answer here. My intuition was that you could ask R to look up the column with the same name when running mutate_each. For instance, if . stands for the column A, I thought a column named A in another data frame might be something dplyr could work with. So I created a data frame for the factors and then used mutate_each. The result seems to be right. Since I have no technical background, I am afraid I cannot really provide an explanation. I hope you do not mind.

factors <- data.frame(A = 1, B = 1.2, C = 0.8, D = 0.75)

mutate_each(data, funs(. / factors$.), A:D)

#  ID Type    A           B      C          D
#1  1    X    3   0.8333333   3.75   5.333333
#2  2    X  174 106.6666667  82.50  76.000000
#3  3    X    6   1.6666667   2.50   5.333333
#4  4    Y 1377 849.1666667 312.50 334.666667
#5  5    Y  537 353.3333333 161.25 165.333333
#6  6    Y  173 115.8333333  50.00  50.666667

Edit

This also works. Given that a data frame is a special case of a list, this is perhaps not surprising.

# Experiment
foo <- list(A = 1, B = 1.2, C = 0.8, D = 0.75)

mutate_each(data, funs(. / foo$.), A:D)

#  ID Type    A           B      C          D
#1  1    X    3   0.8333333   3.75   5.333333
#2  2    X  174 106.6666667  82.50  76.000000
#3  3    X    6   1.6666667   2.50   5.333333
#4  4    Y 1377 849.1666667 312.50 334.666667
#5  5    Y  537 353.3333333 161.25 165.333333
#6  6    Y  173 115.8333333  50.00  50.666667
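
The trick above appears to rely on funs() substituting the current column's name for every . in the expression, including the one after $, so that factors$. becomes factors$A, factors$B and so on. Both mutate_each and funs have since been deprecated in dplyr. The following is a minimal sketch of the same idea for dplyr 1.0.0 or later, assuming across() and cur_column() are available; the name normalized is illustrative.

```r
library(dplyr)

data <- tibble(
  ID = c(1, 2, 3, 4, 5, 6),
  Type = c("X", "X", "X", "Y", "Y", "Y"),
  A = c(3L, 174L, 6L, 1377L, 537L, 173L),
  B = c(1L, 128L, 2L, 1019L, 424L, 139L),
  C = c(3L, 66L, 2L, 250L, 129L, 40L),
  D = c(4L, 57L, 4L, 251L, 124L, 38L)
)
factors <- c(A = 1, B = 1.2, C = 0.8, D = 0.75)

# cur_column() returns the name of the column across() is currently
# processing, so each column is divided by the factor with the same name.
normalized <- data %>%
  mutate(across(all_of(names(factors)), ~ .x / factors[[cur_column()]]))

print(normalized)
```

Selecting with all_of(names(factors)) instead of A:D ties the selection to the factor names, so a column without a matching factor is left untouched rather than failing on lookup.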
