ggplot2: how to get 2 histograms with the y value = count of one / sum of the count of both


Problem description

I guess my question is simple (even if the title is not...) but I was not able to find any clear answer yet. I want to plot histograms of Reaction Times in a psychophysics task. I need to plot two of them on the same figure: one for correct responses, the other for incorrect responses.

I don't want to plot the absolute counts, but rather the relative proportion corresponding to:

For correct responses: count(correct==1) / sum(count(correct==1) + count(correct==0))

For incorrect responses: count(correct==0) / sum(count(correct==1) + count(correct==0))

For now I have that:

ggplot(data, aes(x=RT, color=correct)) +
    geom_histogram(aes(y = ..count../sum(..count..))) +
    stat_bin(breaks = seq(5,800,by=10))

But I'm not sure it is doing what I want (is the sum corresponding to the sum of both correct and incorrect responses?). I don't feel comfortable with the ..count.. etc, would anyone have a good recommendation for documentation about this aspect?

Thanks in advance.

Edit: The input data is:

df <- structure(list(RT = c(359L, 214L, 219L, 206L, 120L, 166L, 156L, 
       181L, 135L, 122L, 110L, 101L, 139L, 215L, 106L, 217L, 162L, 135L, 
       114L, 205L), correct = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
       1L, 1L, 1L, 0L, 1L, 1L, 1L, 0L, 1L, 0L, 0L)), .Names = c("RT", 
       "correct"), class = "data.frame", row.names = c(NA, -20L))

Here is a link to a plot I made earlier using base R, which is exactly the output I want in the end:

https://www.dropbox.com/s/nqn83pkoq7o0stv/RTexample.png

These are lines (but based on histograms; yellow for correct==1, blue for correct==0). The specific feature that I want is that both lines together sum to 1.

Solution

shora,

Brian Hanson is absolutely correct. You really should stop trying to do your transformation as part of the 'ggplot' function. I know it's tempting, but the in-plot transformation methods of 'ggplot' should be used more for data exploration rather than the creation of a predetermined graph. You can quickly use the 'hist' function to get the data you need, transform the data, and then feed it into 'ggplot' for the actual graphing. The best part about transforming your data manually is that you get to see all of it in action, and you won't have problems (as in your question) guessing whether or not the answers are correct.
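If you still want to check what the in-plot transformation computed before abandoning it, one option is to pull the computed layer data back out of the plot. The following is a minimal sketch, not a definitive recipe. It assumes ggplot2 3.3.0 or later (for 'after_stat', the newer spelling of '..count..'), uses the 'df' from the question's edit, and maps 'correct' to 'fill' as a factor so that the two response types form separate groups; those choices are mine, not the asker's.

```r
library(ggplot2)

# Data frame from the question's edit
df <- data.frame(
  RT = c(359L, 214L, 219L, 206L, 120L, 166L, 156L, 181L, 135L, 122L,
         110L, 101L, 139L, 215L, 106L, 217L, 162L, 135L, 114L, 205L),
  correct = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
              1L, 1L, 0L, 1L, 1L, 1L, 0L, 1L, 0L, 0L)
)

# position = "identity" keeps the bars unstacked, so y in the built data
# is exactly the value computed by after_stat()
p <- ggplot(df, aes(x = RT, fill = factor(correct))) +
  geom_histogram(
    aes(y = after_stat(count / sum(count))),
    breaks = seq(5, 800, by = 10),
    position = "identity",
    alpha = 0.5
  )

# One row per bin and group, with the computed y values
built <- layer_data(p)

# Sum over every bar in the layer, then split by group
total_y <- sum(built$y)
per_group_y <- tapply(built$y, built$group, sum)
print(total_y)
print(per_group_y)
```

If 'total_y' comes out as 1, then 'sum(count)' ran over both groups together rather than within each group, which is the thing the question was unsure about.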

You'll need to decide exactly how you want the two plots arranged, but that can all be done with 'ggplot'. Here is an example of an outside transformation:

Step 1: Get the histogram values for [correct]=1.

# 'data' is the data frame from the question, with columns RT and correct
correct_Hist <- hist(data$RT[data$correct == 1], breaks = seq(5, 800, by = 10), plot = FALSE)

Step 2: Get the histogram values for [correct]=0.

incorrect_Hist <- hist(data$RT[data$correct == 0], breaks = seq(5, 800, by = 10), plot = FALSE)
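If it helps to see what 'hist' hands back when 'plot = FALSE', here is a small sketch using the 'df' from the question's edit (the steps above call the same data frame 'data'). It only inspects the three components used later: 'breaks', 'counts', and 'mids'.

```r
# Data frame from the question's edit
df <- data.frame(
  RT = c(359L, 214L, 219L, 206L, 120L, 166L, 156L, 181L, 135L, 122L,
         110L, 101L, 139L, 215L, 106L, 217L, 162L, 135L, 114L, 205L),
  correct = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
              1L, 1L, 0L, 1L, 1L, 1L, 0L, 1L, 0L, 0L)
)

# Same breaks as in the steps above: 80 break points, hence 79 bins
breaks <- seq(5, 800, by = 10)
correct_Hist <- hist(df$RT[df$correct == 1], breaks = breaks, plot = FALSE)

# The object is a list; these are the parts used for plotting later
str(correct_Hist[c("breaks", "counts", "mids")])

n_bins <- length(correct_Hist$counts)   # one count per bin
n_correct <- sum(correct_Hist$counts)   # every correct response lands in a bin
```

Because both histograms use the same 'breaks', their 'counts' vectors line up bin by bin, which is what makes the element-wise arithmetic in the next step valid.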

Step 3: Transform the counts. Your explanation in the question is a bit ambiguous, and could be taken a couple different ways. For this answer, I am assuming you do not want a histogram but rather that you want a bar chart that shows what percentage of a specific range of RT values is represented by incorrect or correct responses. This is quite simple now that we have the counts.

correct_Bar_Values <- correct_Hist$counts / (correct_Hist$counts + incorrect_Hist$counts)
incorrect_Bar_Values <- incorrect_Hist$counts / (correct_Hist$counts + incorrect_Hist$counts)
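Since the question can be read two ways, here is a hedged sketch of both, again on the 'df' from the question's edit. The first reading is the per-bin proportion used above; note that bins with no responses at all give 0/0, so the sketch marks them as NA explicitly. The second reading divides by the grand total, so that both series together sum to 1 over the whole RT range. The variable names ('prop_*', 'share_*') are only illustrative.

```r
# Data frame from the question's edit
df <- data.frame(
  RT = c(359L, 214L, 219L, 206L, 120L, 166L, 156L, 181L, 135L, 122L,
         110L, 101L, 139L, 215L, 106L, 217L, 162L, 135L, 114L, 205L),
  correct = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
              1L, 1L, 0L, 1L, 1L, 1L, 0L, 1L, 0L, 0L)
)

breaks <- seq(5, 800, by = 10)
correct_counts   <- hist(df$RT[df$correct == 1], breaks = breaks, plot = FALSE)$counts
incorrect_counts <- hist(df$RT[df$correct == 0], breaks = breaks, plot = FALSE)$counts
total_counts <- correct_counts + incorrect_counts

# Reading 1: within each RT bin, the fraction of responses that were
# correct / incorrect. Empty bins have no defined fraction, so use NA.
nonempty <- total_counts > 0
prop_correct   <- ifelse(nonempty, correct_counts / total_counts, NA_real_)
prop_incorrect <- ifelse(nonempty, incorrect_counts / total_counts, NA_real_)

# Reading 2: each bin's count as a share of all responses, so the two
# series together sum to 1 across the whole RT range.
share_correct   <- correct_counts / sum(total_counts)
share_incorrect <- incorrect_counts / sum(total_counts)
```

Under the first reading the two values add up to 1 in every non-empty bin; under the second they add up to 1 only when summed over all bins.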

Step 4: Plot it however you like. Now that you have the raw values you want to plot, you can use any number of methods to get them plotted. I recommend a 'geom_bar' layer with stat = "identity" rather than a 'geom_histogram' layer, since you have already done the calculations. You'll also have to specify the two different 'grid' viewports you want 'ggplot' to use, but if you need help with that, submit a second question. This is how you can quickly make your data into a bar chart:

# The percentage of answers that were not correct
qplot(incorrect_Hist$mids,y=incorrect_Bar_Values, geom="bar", stat="identity", ylim=c(0,1))

# The percentage of answers that were correct
qplot(correct_Hist$mids,y=correct_Bar_Values, geom="bar", stat="identity", ylim=c(0,1))
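As an alternative to two separate plots, both series can go into a single plot by stacking the pre-computed values in one long data frame and mapping the response type to 'fill'. This is only a sketch of that idea, assuming the 'df' from the question's edit and a ggplot2 version that has 'geom_col' (2.2.0 or later); the column names in 'plot_df' are made up for the example.

```r
library(ggplot2)

# Data frame from the question's edit
df <- data.frame(
  RT = c(359L, 214L, 219L, 206L, 120L, 166L, 156L, 181L, 135L, 122L,
         110L, 101L, 139L, 215L, 106L, 217L, 162L, 135L, 114L, 205L),
  correct = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
              1L, 1L, 0L, 1L, 1L, 1L, 0L, 1L, 0L, 0L)
)

breaks <- seq(5, 800, by = 10)
h1 <- hist(df$RT[df$correct == 1], breaks = breaks, plot = FALSE)
h0 <- hist(df$RT[df$correct == 0], breaks = breaks, plot = FALSE)
total <- h1$counts + h0$counts

# Long format: one row per (bin, response type)
plot_df <- data.frame(
  RT = rep(h1$mids, times = 2),
  response = rep(c("correct", "incorrect"), each = length(h1$mids)),
  proportion = c(h1$counts, h0$counts) / rep(total, times = 2)
)

# Drop empty bins, where the proportion is 0/0 (NaN)
plot_df <- plot_df[!is.nan(plot_df$proportion), ]

# geom_col stacks by default, so each non-empty bin fills the range 0 to 1
p <- ggplot(plot_df, aes(x = RT, y = proportion, fill = response)) +
  geom_col(width = 10) +
  labs(x = "RT", y = "Proportion of responses in bin")
print(p)
```

Swapping 'geom_col' for 'geom_line' with 'colour = response' would give lines instead of bars, closer to the base R figure linked in the question.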
