Mutating column in `dplyr` using `rowSums`

Problem description

Recently I stumbled upon some strange behaviour in dplyr, and I would be happy if somebody could provide some insight.

Assume I have a data frame in which some columns contain numerical values. In a simple scenario I would like to compute rowSums. Although there are many ways to do it, here are two examples:

df <- data.frame(matrix(rnorm(20), 10, 2),
                 ids = paste("i", 1:20, sep = ""),
                 stringsAsFactors = FALSE)

# works
dplyr::select(df, - ids) %>% {rowSums(.)}

# does not work
# Error: invalid argument to unary operator
df %>%
  dplyr::mutate(blubb = dplyr::select(df, - ids) %>% {rowSums(.)})

# does not work
# Error: invalid argument to unary operator
df %>%
  dplyr::mutate(blubb = dplyr::select(., - ids) %>% {rowSums(.)})

# workaround:
tmp <- dplyr::select(df, - ids) %>% {rowSums(.)}
df %>%
  dplyr::mutate(blubb = tmp)

# works
rowSums(dplyr::select(df, - ids))

# does not work
# Error: invalid argument to unary operator
df %>%
  dplyr::mutate(blubb = rowSums(dplyr::select(df, - ids)))

# workaround
tmp <- rowSums(dplyr::select(df, - ids))
df %>%
  dplyr::mutate(blubb = tmp)

First, I don't really understand what is causing the error. Second, I would like to know how to actually compute this over a selection of columns in a tidy way.

edit

The question "mutate and rowSums exclude columns", although related, focuses on using rowSums for computation. Here I'm eager to understand why the examples above do not work. It is not so much about how to solve the problem (see the workarounds) as about understanding what happens when the naive approach is applied.

Solution

The examples do not work because you are nesting select inside mutate and using bare variable names. Inside mutate, the bare name ids is treated as the column's values, so select ends up trying to do something like

> -df$ids
Error in -df$ids : invalid argument to unary operator

which fails because you can't negate a character string (i.e. -"i1" or -"i2" makes no sense). Either of the formulations below works:

df %>% mutate(blubb = rowSums(select_(., "X1", "X2")))
df %>% mutate(blubb = rowSums(select(., -3)))

or

df %>% mutate(blubb = rowSums(select_(., "-ids")))

as suggested by @Haboryme.
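
The root cause described above can be reproduced without dplyr at all. The following is a minimal base R sketch, not part of the original answer: the small data frame and the names `msg` and `num_cols` are made up for illustration. It shows that negating a character vector raises an error, and that selecting columns by name avoids the negation entirely.

```r
# Illustrative data only; not the data frame from the question.
ids <- c("i1", "i2", "i3")

# Negating a character vector is an error in R. tryCatch captures the
# error message instead of stopping the script.
msg <- tryCatch(
  {
    -ids
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)

# Selecting by name sidesteps the unary minus on a bare column name.
df <- data.frame(X1 = c(1, 2, 3), X2 = c(10, 20, 30), ids = ids,
                 stringsAsFactors = FALSE)
num_cols <- setdiff(names(df), "ids")  # every column except ids
df$blubb <- rowSums(df[, num_cols, drop = FALSE])
print(df)
```

Note that `select_()` and the other underscore verbs have since been deprecated in dplyr. With dplyr 1.0 or later, a commonly used alternative is `df %>% mutate(blubb = rowSums(across(where(is.numeric))))`, though that goes beyond what the original answer covers.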
