Using dplyr within a function: grouping error with function arguments
Below I have a working example of what I would like the function to do, followed by the script for the function, noting where the error occurs.
The error message is:
Error: index out of bounds
I know this usually means R can't find the variable that's being called.
Interestingly, in my function example below, if I group only by subgroup_name (which is passed to the function and becomes a column in the newly created data frame), the function groups by that variable successfully. But I also want to group by a column newly created by the melt, called variable.
Similar code used to work for me using regroup(), but that has been deprecated. I am trying to use group_by_(), but to no avail.
I have read many other posts and answers and experimented for several hours today, but still without success.
# Initialize example dataset
library(dplyr)
library(reshape2) # melt() is assumed to come from reshape2
database <- ggplot2::diamonds
database$diamond <- row.names(database) # needed for melting
subgroup_name <- "cut" # can replace with "color" or "clarity"
subgroup_column <- 2 # can replace with 3 for color, 4 for clarity

# This works, although it would be preferable not to need separate variables for subgroup_name and subgroup_column number
df <- database %>%
  select(diamond, subgroup_column, x, y, z) %>%
  melt(id.vars = c("diamond", subgroup_name)) %>%
  group_by(cut, variable) %>%
  summarise(value = round(mean(value, na.rm = TRUE), 2))

# This does not work, I am expecting the same output as above
subgroup_analysis <- function(database, ...) {
  df <- database %>%
    select(diamond, subgroup_column, x, y, z) %>%
    melt(id.vars = c("diamond", subgroup_name)) %>%
    group_by_(subgroup_name, variable) %>% # problem appears to be with finding "variable"
    summarise(value = round(mean(value, na.rm = TRUE), 2))
  print(df)
}
subgroup_analysis(database, subgroup_column, subgroup_name)
From the NSE vignette:
If you also want output variables to vary, you need to pass a list of quoted objects to the .dots argument:
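To make "quoted objects" concrete, here is a minimal base R sketch that is not taken from the vignette. The toy list `melted` and the `dots` list are illustrative assumptions. The point is only that a quoted name is stored unevaluated and looked up later, inside the data.

```r
# Illustrative stand-in for the string the question passes around.
subgroup_name <- "cut"

# quote() stores the name itself instead of looking up its value.
quoted <- quote(variable)
is.name(quoted)                           # TRUE

# as.name() builds the same kind of object from a string.
identical(as.name("variable"), quoted)    # TRUE

# A list of quoted objects, the shape the vignette describes for .dots.
dots <- list(as.name(subgroup_name), quote(variable))

# A quoted name is resolved only when it is evaluated against some data.
melted <- list(cut = c("Fair", "Good"), variable = c("x", "y"))
eval(dots[[2]], melted)                   # "x" "y"
```

A bare `variable`, by contrast, is evaluated immediately in the calling environment, before the melted data frame is ever consulted.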
Here, variable should be quoted:
subgroup_analysis <- function(database, ...) {
  df <- database %>%
    select(diamond, subgroup_column, x, y, z) %>%
    melt(id.vars = c("diamond", subgroup_name)) %>%
    group_by_(subgroup_name, quote(variable)) %>%
    summarise(value = round(mean(value, na.rm = TRUE), 2))
  print(df)
}
subgroup_analysis(database, subgroup_column, subgroup_name)
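group_by_() and the other underscore verbs were later deprecated in dplyr. As a hedged sketch of the same idea with current tools, assuming dplyr 1.0 or later and tidyr 1.0 or later are installed, the grouping column can be passed as a string and resolved with all_of(). The function name subgroup_means and the toy data frame are hypothetical, and pivot_longer() stands in for melt().

```r
library(dplyr)
library(tidyr)

# Hypothetical rewrite: the subgroup is identified by a single string argument.
subgroup_means <- function(data, subgroup_name) {
  data %>%
    select(all_of(subgroup_name), x, y, z) %>%
    # Long format: one row per (subgroup, variable) measurement.
    pivot_longer(c(x, y, z), names_to = "variable", values_to = "value") %>%
    # all_of() turns the strings into column selections for grouping.
    group_by(across(all_of(c(subgroup_name, "variable")))) %>%
    summarise(value = round(mean(value, na.rm = TRUE), 2), .groups = "drop")
}

# Small made-up data set with the same column names as diamonds.
toy <- data.frame(
  cut   = c("Fair", "Fair", "Good", "Good"),
  color = c("D", "E", "D", "E"),
  x = c(1, 3, 5, 7),
  y = c(2, 4, 6, 8),
  z = c(1, 1, 2, 2)
)

result <- subgroup_means(toy, "cut")
```

For the question's data this would be called as subgroup_means(ggplot2::diamonds, "cut"). Only one argument identifies the subgroup, so the separate column number is no longer needed.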
As mentioned by @RichardScriven, if you plan to assign the result to a new variable, you may want to remove the print call at the end and just write df, or not even assign df at all in the function. Otherwise the result prints even when you do x <- subgroup_analysis(...).
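The difference is easy to see in a small base R sketch. The two function names are made up for illustration and do only trivial arithmetic.

```r
# Ends with print(): the value is printed on every call, even when assigned.
with_print <- function(x) {
  out <- x * 2
  print(out)
}

# Ends with the value itself: R auto-prints only when the call is not assigned.
without_print <- function(x) {
  x * 2
}

a <- with_print(1:3)      # prints [1] 2 4 6 despite the assignment
b <- without_print(1:3)   # prints nothing
identical(a, b)           # TRUE: both return the same value
```

Both versions return the same object, because print() returns its argument invisibly. Only the side effect of printing differs.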