How can I manipulate dataframe columns with different values from an external vector (with dplyr)
Problem description
In R, I would like to manipulate (say multiply) data.frame columns with appropriately named values stored in a vector (or data.frame, if that's easier).
Let's say I want to first summarise the variables disp, hp, and wt from the mtcars dataset.
library(dplyr)
vars <- c("disp", "hp", "wt")
# funs() is deprecated in dplyr >= 0.8; summarise_at(vars, sum) is equivalent
mtcars %>%
  summarise_at(vars, funs(sum(.)))
(throw a group_by(cyl) into the mix, or use mutate_at if you'd like to have more rows)
Now I'd like to multiply each of the resulting columns with a particular value, given by
multiplier <- c("disp" = 2, "hp" = 3, "wt" = 4)
Is it possible to refer to these within the summarise_at function?
The result should look like this (and I don't want to have to refer to the variable names directly while getting there):
     disp    hp      wt
  14766.2 14082 411.808
UPDATE:
Maybe my MWE was too minimal. Let's say I want to do the same operation with a data.frame grouped by cyl:
mtcars %>%
  group_by(cyl) %>%
  summarise_at(vars, sum)
The result should thus be:
  cyl   disp   hp      wt
1   4 2313.0 2727 100.572
2   6 2566.4 2568  87.280
3   8 9886.8 8787 223.956
UPDATE 2:
Maybe I was not explicit enough here either, but the columns in the data.frame should be multiplied by the respective values in the vector (and only those columns mentioned in the vector), so e.g. disp should be multiplied by 2, hp by 3 and wt by 4; all other variables (e.g. cyl) should remain untouched by the multiplication.
We could also do this with the map2_df function from purrr:
library(purrr)
# map2_df pairs columns and multipliers by position, so both must be in the same order
mtcars %>%
  summarise_at(vars, sum) %>%
  map2_df(multiplier, `*`)
# disp hp wt
# <dbl> <dbl> <dbl>
# 1 14766.2 14082 411.808
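As a possible alternative, the multiplication can be done inside the summary call itself, which is what the question originally asked for. The sketch below is not part of the original answer; it assumes dplyr 1.0.0 or later, where across() and cur_column() are available, and it looks up each multiplier by column name rather than by position.

```r
library(dplyr)

vars <- c("disp", "hp", "wt")
multiplier <- c(disp = 2, hp = 3, wt = 4)

# cur_column() returns the name of the column currently being processed,
# so each sum is scaled by the multiplier that carries the same name
res <- mtcars %>%
  summarise(across(all_of(vars), ~ sum(.x) * multiplier[[cur_column()]]))

res
```

Because the lookup is by name, the order of the elements in multiplier does not have to match the order of vars.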
For the updated question:
d1 <- mtcars %>%
  group_by(cyl) %>%
  summarise_at(vars, sum)
# multiply only the columns in vars, then reattach the untouched columns (here cyl)
d1 %>%
  select(one_of(vars)) %>%
  map2_df(multiplier[vars], ~ .x * .y) %>%
  bind_cols(d1 %>% select(-one_of(vars)), .)
# cyl disp hp wt
# <dbl> <dbl> <dbl> <dbl>
#1 4 2313.0 2727 100.572
#2 6 2566.4 2568 87.280
#3 8 9886.8 8787 223.956
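The select / multiply / bind_cols round trip above could also be written as a single mutate() that touches only the columns named in the vector. This is a hedged sketch assuming dplyr 1.0.0 or later; the multiplier vector is deliberately given in a different order than vars to show that matching happens by name.

```r
library(dplyr)

vars <- c("disp", "hp", "wt")
multiplier <- c(wt = 4, disp = 2, hp = 3)  # order intentionally differs from vars

scaled <- mtcars %>%
  group_by(cyl) %>%
  summarise(across(all_of(vars), sum)) %>%
  # only columns named in multiplier are modified; cyl is left as it is
  mutate(across(all_of(names(multiplier)), ~ .x * multiplier[[cur_column()]]))

scaled
```

Driving the column selection from names(multiplier) keeps the rule from the question in one place: a column is scaled only if the vector mentions it.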
Or we can use gather/spread:
library(tidyr)
mtcars %>%
  group_by(cyl) %>%
  summarise_at(vars, sum) %>%
  gather(var, val, -cyl) %>%
  mutate(val = val * multiplier[match(var, names(multiplier))]) %>%
  spread(var, val)
# cyl disp hp wt
# <dbl> <dbl> <dbl> <dbl>
#1 4 2313.0 2727 100.572
#2 6 2566.4 2568 87.280
#3 8 9886.8 8787 223.956
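In tidyr 1.0.0 and later, gather() and spread() are superseded by pivot_longer() and pivot_wider(). Assuming such a version is installed, the same reshape-multiply-reshape idea might look like the sketch below; it is an adaptation of the approach above, not code from the original answer.

```r
library(dplyr)
library(tidyr)

vars <- c("disp", "hp", "wt")
multiplier <- c(disp = 2, hp = 3, wt = 4)

wide <- mtcars %>%
  group_by(cyl) %>%
  summarise(across(all_of(vars), sum)) %>%
  # long format: one row per (cyl, variable) pair
  pivot_longer(cols = -cyl, names_to = "var", values_to = "val") %>%
  # look up the multiplier by variable name; unname() drops the vector names
  mutate(val = val * unname(multiplier[var])) %>%
  # back to one column per variable
  pivot_wider(names_from = var, values_from = val)

wide
```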