summarize_all with "n()" function
Problem description
I'm summarizing a data frame in dplyr with the summarize_all() function. If I do the following:
summarize_all(mydf, list(mean="mean", median="median", sd="sd"))
I get a tibble with 3 variables for each of my original measures, all suffixed by the type (mean, median, sd). Great! But when I try to capture the within-vector n's to calculate the standard deviations myself and to make sure missing cells aren't counted...
summarize_all(mydf, list(mean="mean", median="median", sd="sd", n="n"))
...I get an error:
Error in (function () : unused argument (var_a)
This is not an issue with my var_a vector. If I remove it, I get the same error for var_b, etc. The summarize_all function produces odd results whenever I request n or n(), or if I use .funs() and list the descriptives I want to compute instead.
What is going on?
Recommended answer
The reason it's giving you problems is that n() doesn't take any arguments, unlike mean() and median(). summarize_all passes each column to every function in the list, so n() receives an argument it cannot accept. Use length() instead to get the desired effect:
summarize_all(mydf, list(mean="mean", median="median", sd="sd", n="length"))
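One caveat worth noting: length() counts every element of the column, including NA, so on its own it does not meet the question's goal of leaving missing cells out of the count. A minimal sketch of one way to get both counts follows. The data frame contents and the helper name n_valid are made up for illustration and are not from the original post; the functions are passed as function objects and formula lambdas rather than strings, which assumes dplyr 0.8.0 or later.

```r
library(dplyr)

# Hypothetical stand-in for mydf: each column has one missing value.
mydf <- tibble(
  var_a = c(1, 2, NA, 4),
  var_b = c(10, NA, 30, 40)
)

# Hypothetical helper (not part of dplyr): count the non-missing values.
n_valid <- function(x) sum(!is.na(x))

result <- summarize_all(
  mydf,
  list(
    mean    = ~ mean(., na.rm = TRUE),  # ignore NA when averaging
    sd      = ~ sd(., na.rm = TRUE),    # ignore NA for the standard deviation
    n       = length,                   # total length, NA included
    n_valid = n_valid                   # non-missing values only
  )
)

print(result)
```

With this made-up data, var_a_n is 4 while var_a_n_valid is 3, which shows the difference between the two counts. In dplyr 1.0.0 and later, summarize_all() is superseded by across(), so the same list of functions could instead be supplied as summarize(mydf, across(everything(), list(...))).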