Use of column inside sum() function using dplyr's mutate() function

Views: 171 | Published: 2017/3/26 2:15:38 | Tags: r, sum, dataframe, probability, dplyr


Problem description

I have a data frame and want to create a new column, prob, using dplyr's mutate() function. For each row, prob should contain the probability P(column value > row value), i.e. the share of rows in the data frame whose value is greater than that row's value. Here is what I tried:

data = data.frame(value = c(1,2,3,3,4,4,4,5,5,6,7,8,8,8,8,8,9))

require(dplyr)

data %>% mutate(prob = sum(value < data$value) / nrow(data))

This gives the following results:

   value prob
1      1    0
2      2    0
3      3    0
4      3    0
...    ...  ...

Here prob only contains 0 for each row. If I replace value with 2 in the expression sum(value < data$value):

data %>% mutate(prob = sum(2 < data$value) / nrow(data))

I get the following results:

   value      prob
1      1 0.8823529
2      2 0.8823529
3      3 0.8823529
4      3 0.8823529
...    ...  ...

0.8823529 is the share of rows in the data frame whose value is greater than 2 (15 of 17). The problem seems to be that mutate() doesn't accept the value column as an argument inside sum().

Solution

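Inside mutate(), value refers to the whole column, so value < data$value compares the column with itself element by element. Every comparison is FALSE, and the sum is 0 for every row. The comparison has to be made once per row instead. Adapting agstudy's code to dplyr: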

data %>% mutate(prob = sapply(value, function(x) sum(x < value) / nrow(data)))
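
To make the diagnosis concrete, here is a minimal sketch that reproduces the zero result outside mutate() and then computes prob in two ways. The mean() shorthand and the rank()-based variant are not part of the original answer; they are alternatives offered here as an assumption about what a vectorised version could look like, and the names self_compare, res_sapply and res_rank are made up for illustration.

```r
library(dplyr)

data <- data.frame(value = c(1, 2, 3, 3, 4, 4, 4, 5, 5, 6, 7, 8, 8, 8, 8, 8, 9))

# Why the original attempt gives 0: inside mutate(), `value` is the whole
# column, so the comparison is the column against itself, element by element.
self_compare <- data$value < data$value   # every element is FALSE
sum(self_compare)                         # 0

# Row-wise version from the answer, using mean() in place of sum() / nrow():
# the mean of a logical vector is the share of TRUE entries.
res_sapply <- data %>%
  mutate(prob = sapply(value, function(x) mean(x < value)))

# Vectorised alternative (an assumption, not from the original answer):
# rank(..., ties.method = "max") counts the values <= each element,
# so n() minus that rank is the number of strictly greater values.
res_rank <- data %>%
  mutate(prob = (n() - rank(value, ties.method = "max")) / n())

# Both approaches should agree on every row.
all.equal(res_sapply$prob, res_rank$prob)  # TRUE
```

The sapply() version makes one pass over the column per row, which is fine for a data frame of this size; the rank() variant avoids that repeated work and may be preferable on large data.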
