How can I manipulate dataframe columns with different values from an external vector (with dplyr)

Views: 122 Published: 2017/7/13 22:29:50 Tags: r, dplyr


Problem description

 
In R, I would like to manipulate (say multiply) data.frame columns with appropriately named values stored in a vector (or data.frame, if that's easier).

Let's say I first want to summarise the variables disp, hp, and wt from the mtcars dataset:
library(dplyr)
vars <- c("disp", "hp", "wt")
mtcars %>% 
  summarise_at(vars, funs(sum(.)))
(throw a group_by(cyl) into the mix, or use mutate_at if you'd like to have more rows) 

Now I'd like to multiply each of the resulting columns by a particular value, given by:
multiplier <- c("disp" = 2, "hp" = 3, "wt" = 4)
Is it possible to refer to these within the summarise_at function? 

The result should look like this (and I don't want to have to refer to the variable names directly while getting there):
disp    hp    wt
14766.2 14082 411.808
UPDATE: 

Maybe my MWE was too minimal. Let's say I want to do the same operation with a data.frame grouped by cyl
mtcars %>% 
  group_by(cyl) %>% 
  summarise_at(vars, sum) 
The result should thus be:
    cyl   disp   hp      wt
1     4 2313.0 2727 100.572
2     6 2566.4 2568  87.280
3     8 9886.8 8787 223.956
UPDATE 2:

Maybe I was not explicit enough here either, but the columns in the data.frame should be multiplied by the respective values in the vector (and only those columns mentioned in the vector), so e.g. disp should be multiplied by 2, hp by 3 and wt by 4, all other variables (e.g. cyl) should remain untouched by the multiplication.
Solution
We can do this with the map2_df function from purrr, which pairs each summarised column with the multiplier in the same position:
library(dplyr)
library(purrr)
mtcars %>%
    summarise_at(vars, sum) %>%
    map2_df(multiplier, `*`)
#      disp    hp      wt
#     <dbl> <dbl>   <dbl>
# 1 14766.2 14082 411.808
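
Because map2_df pairs columns and multipliers by position, the call above depends on multiplier being in the same order as vars. A minimal sketch of a name-aligned variant follows, assuming the same vars as in the question; the shuffled order of multiplier and the name res are only there for illustration.

```r
library(dplyr)
library(purrr)

vars <- c("disp", "hp", "wt")
# Assumption for illustration: multipliers listed in a different order than `vars`
multiplier <- c(wt = 4, disp = 2, hp = 3)

res <- mtcars %>%
  summarise_at(vars, sum) %>%
  # Reorder the multipliers by the summarised column names before pairing
  map2_df(multiplier[names(.)], `*`)

res
```

Indexing multiplier by names(.) keeps each column matched to its own factor even if the two vectors were built in different orders.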




For the updated question
d1 <- mtcars %>% 
         group_by(cyl) %>% 
         summarise_at(vars, sum) 
d1 %>% 
   select(one_of(vars)) %>% 
   map2_df(multiplier[vars], ~ .x * .y) %>%
   bind_cols(d1 %>% select(-one_of(vars)), .) 
#    cyl   disp    hp      wt
#  <dbl>  <dbl> <dbl>   <dbl>
#1     4 2313.0  2727 100.572
#2     6 2566.4  2568  87.280
#3     8 9886.8  8787 223.956
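
The question asked whether the multipliers can be referenced inside the summarising call itself. One possible sketch, assuming dplyr 1.0.0 or later (which postdates this thread), uses across() together with cur_column() to look up the factor by the name of the column being processed. The name res is only for illustration.

```r
library(dplyr)

vars <- c("disp", "hp", "wt")
multiplier <- c(disp = 2, hp = 3, wt = 4)

res <- mtcars %>%
  group_by(cyl) %>%
  # cur_column() returns the name of the column currently handled by across(),
  # so each sum is scaled by the multiplier with the matching name
  summarise(across(all_of(vars), ~ sum(.x) * multiplier[[cur_column()]]))

res
```

Only the columns listed in vars are touched, so the grouping column cyl passes through unchanged without a separate select/bind_cols step.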




Or we can use gather/spread
library(tidyr)
mtcars %>% 
    group_by(cyl) %>% 
    summarise_at(vars, sum) %>% 
    gather(var, val, -cyl) %>% 
    mutate(val = val*multiplier[match(var, names(multiplier))]) %>% 
    spread(var, val)
#     cyl   disp    hp      wt
#   <dbl>  <dbl> <dbl>   <dbl>
#1     4 2313.0  2727 100.572
#2     6 2566.4  2568  87.280
#3     8 9886.8  8787 223.956
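
In tidyr 1.0.0 and later, gather() and spread() are superseded by pivot_longer() and pivot_wider(). A rough equivalent of the reshaping approach above, assuming such a tidyr version is installed, might look like this; the column names var and val follow the original answer and res is only for illustration.

```r
library(dplyr)
library(tidyr)

vars <- c("disp", "hp", "wt")
multiplier <- c(disp = 2, hp = 3, wt = 4)

res <- mtcars %>%
  group_by(cyl) %>%
  summarise_at(vars, sum) %>%
  # Long format: one row per (cyl, variable) pair
  pivot_longer(all_of(vars), names_to = "var", values_to = "val") %>%
  # Look up each multiplier by variable name
  mutate(val = val * unname(multiplier[var])) %>%
  # Back to one column per variable
  pivot_wider(names_from = var, values_from = val)

res
```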


                        