ggplot using second data source for error bars fails

Question

This is a follow-on to a previous question about getting some custom error bars.

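  1. The look of the plot is what I need, so there is no need to comment on that alone (though I'm happy to hear opinions alongside other help).
  2. Because these plots are generated in a loop, and the error bars are only added when a condition is met, I can't simply merge all the data together up front. For the purposes of this exercise, assume the plot data and the error bar data come from different dfs.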

I have a ggplot to which I am trying to add some error bars using a different data frame. When I call the plot, it says that it cannot find the y values from the parent plot, even though I'm only trying to add error bars using new data. I know this has to be a syntax error, but I am stumped...

First, let's generate the data and the plot:

library(ggplot2)
library(scales)

# some data
data.2015 = data.frame(score = c(-50,20,15,-40,-10,60),
                       area = c("first","second","third","first","second","third"),
                       group = c("Findings","Findings","Findings","Benchmark","Benchmark","Benchmark"))

data.2014 = data.frame(score = c(-30,40,-15),
                       area = c("first","second","third"),
                       group = c("Findings","Findings","Findings"))

# breaks and limits
breaks.major = c(-60,-40,-22.5,-10, 0,10, 22.5, 40, 60)
breaks.minor = c(-50,-30,-15,-5,0, 5, 15,30,50) 
limits =c(-70,70)

# plot 2015 data
# store the plot as "c" so that layers can be added to it later
c <- ggplot(data.2015, aes(x = area, y = score, fill = group)) +
  geom_bar(stat = "identity", position = position_dodge(width = 0.9)) +
  coord_flip() +
  scale_y_continuous(limits = limits, oob = squish, minor_breaks = breaks.minor, 
                     breaks = breaks.major)

Calling the plot (c) produces a nice plot, as expected. Now let's set up the error bars and attempt to add them as a new layer in the plot "c":

# get the error bar values
alldat = merge(data.2015, data.2014, all = TRUE, by = c("area", "group"), 
               suffixes = c(".2015", ".2014"))
alldat$plotscore = with(alldat, ifelse(is.na(score.2014), NA, score.2015))
alldat$direction = with(alldat, ifelse(score.2015 < score.2014, "dec", "inc"))
alldat$direction[is.na(alldat$score.2014)] = "absent"

#add error bars to original plot
c <- c+
  geom_errorbar(data=alldat, aes(ymin = plotscore, ymax = score.2014, color = direction), 
                position = position_dodge(width = .9), lwd = 1.5, show.legend = FALSE)

When I call c now, I get

"Error in eval(expr, envir, enclos) : object 'score' not found"

Why does it look for data.2015$score when I just want it to overlay the geom_errorbar using the second alldat dataframe?

EDIT* I've tried specifying the ymin/ymax values for the error bars using alldat$plotscore and alldat$score.2014 (which I am sure is bad practice). It plots, but the bars are in the wrong positions or out of order with the plot (e.g. swapped around, on the benchmark bars instead, etc.).

Recommended answer

In my experience, this error about some variable not being found tells me that R went to look in a data.frame for a variable and it wasn't there. Sometimes the solution is as simple as fixing a typo, but in your case the score variable isn't in the dataset you used to make your error bars.

names(alldat)
[1] "area"       "group"      "score.2015" "score.2014" "plotscore"  "direction"
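
A quick way to confirm this kind of mismatch is to compare the variables mapped in the global aes() with the columns of the layer's data. The sketch below rebuilds alldat from the question's data using base R only; the `mapped` vector and the `missing_vars` name are written out by hand here purely for illustration.

```r
# Data from the question
data.2015 <- data.frame(score = c(-50, 20, 15, -40, -10, 60),
                        area = c("first", "second", "third", "first", "second", "third"),
                        group = c("Findings", "Findings", "Findings",
                                  "Benchmark", "Benchmark", "Benchmark"))
data.2014 <- data.frame(score = c(-30, 40, -15),
                        area = c("first", "second", "third"),
                        group = c("Findings", "Findings", "Findings"))

alldat <- merge(data.2015, data.2014, all = TRUE, by = c("area", "group"),
                suffixes = c(".2015", ".2014"))

# Variables used in the global aes(x = area, y = score, fill = group)
# (typed by hand for this illustration)
mapped <- c("area", "score", "group")

# Any mapped variable that the layer's data lacks leads to "object ... not found"
missing_vars <- setdiff(mapped, names(alldat))
print(missing_vars)
```

Anything reported here has to be remapped in the layer or added to the layer's data.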

Because you set the y variable globally within ggplot(), every other geom inherits that global y unless you explicitly map y to a different variable, so geom_errorbar tries to evaluate score inside alldat. (geom_errorbar draws only from ymin and ymax, but the inherited mapping is still evaluated.) With the current dataset, you'll need to map y to the 2015 score variable.

geom_errorbar(data=alldat, aes(y = score.2015, ymin = plotscore, 
                               ymax = score.2014, color = direction), 
              position = position_dodge(width = .9), lwd = 1.5, show.legend = FALSE)
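
Putting the pieces together, here is a minimal end-to-end sketch of this fix, assuming ggplot2 and scales are installed. It reuses the question's data, trims the styling (custom breaks, line width) to the essentials, and uses the names `p`, `p_err` and `err_layer`, which are my own and not from the question.

```r
library(ggplot2)
library(scales)

data.2015 <- data.frame(score = c(-50, 20, 15, -40, -10, 60),
                        area = c("first", "second", "third", "first", "second", "third"),
                        group = c("Findings", "Findings", "Findings",
                                  "Benchmark", "Benchmark", "Benchmark"))
data.2014 <- data.frame(score = c(-30, 40, -15),
                        area = c("first", "second", "third"),
                        group = c("Findings", "Findings", "Findings"))

# Error bar values, as in the question
alldat <- merge(data.2015, data.2014, all = TRUE, by = c("area", "group"),
                suffixes = c(".2015", ".2014"))
alldat$plotscore <- with(alldat, ifelse(is.na(score.2014), NA, score.2015))
alldat$direction <- with(alldat, ifelse(score.2015 < score.2014, "dec", "inc"))
alldat$direction[is.na(alldat$score.2014)] <- "absent"

# Base plot with y mapped globally to `score`
p <- ggplot(data.2015, aes(x = area, y = score, fill = group)) +
  geom_bar(stat = "identity", position = position_dodge(width = 0.9)) +
  coord_flip() +
  scale_y_continuous(limits = c(-70, 70), oob = squish)

# Override the inherited y so that nothing refers to the missing `score` column
p_err <- p +
  geom_errorbar(data = alldat,
                aes(y = score.2015, ymin = plotscore, ymax = score.2014,
                    color = direction),
                position = position_dodge(width = 0.9), show.legend = FALSE)

# Building the second layer's data succeeds once y is remapped
err_layer <- layer_data(p_err, 2)
```

Printing `p_err` draws the plot. Depending on the ggplot2 version, warnings about the extra y aesthetic or about rows with missing values may appear; the rows with NA in ymin/ymax are the Benchmark bars, which have no 2014 score.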

In your comment you indicated that you also had to add fill to geom_errorbar, but I didn't find that necessary when I ran the code (you can see above that group is a variable in the second dataset in the example you give).

The other option would be to make sure the 2015 score variable is still named score after merging. This can be done by changing the suffixes argument in merge. Then score will be in the second dataset and you won't have to set your y variable in geom_errorbar.

alldat2 = merge(data.2015, data.2014, all = TRUE, by = c("area", "group"), 
            suffixes = c("", ".2014"))
...
names(alldat2)
[1] "area"       "group"      "score"      "score.2014" "plotscore"  "direction" 
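
For completeness, here is a base R sketch of the data preparation for this second option. It assumes the elided steps mirror the question's plotscore/direction code, with score.2015 replaced by score; that substitution is my assumption, not something shown in the answer.

```r
data.2015 <- data.frame(score = c(-50, 20, 15, -40, -10, 60),
                        area = c("first", "second", "third", "first", "second", "third"),
                        group = c("Findings", "Findings", "Findings",
                                  "Benchmark", "Benchmark", "Benchmark"))
data.2014 <- data.frame(score = c(-30, 40, -15),
                        area = c("first", "second", "third"),
                        group = c("Findings", "Findings", "Findings"))

# An empty suffix for the first data frame keeps its column named `score`
alldat2 <- merge(data.2015, data.2014, all = TRUE, by = c("area", "group"),
                 suffixes = c("", ".2014"))

# Assumed equivalent of the elided steps, using `score` instead of `score.2015`
alldat2$plotscore <- with(alldat2, ifelse(is.na(score.2014), NA, score))
alldat2$direction <- with(alldat2, ifelse(score < score.2014, "dec", "inc"))
alldat2$direction[is.na(alldat2$score.2014)] <- "absent"

print(names(alldat2))
```

With alldat2 as the layer's data, the geom_errorbar call from the question can be used as written, because the inherited y = score now resolves.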
