Getting counts on bins in a heat map using R


Problem description


This question follows from these two topics:

How to use stat_bin2d() to compute counts labels in ggplot2?

How to show the numeric cell values in heat map cells in r

In the first topic, a user wants to use stat_bin2d to generate a heat map and then write the count of each bin on top of it. The method the user initially tries doesn't work; the best answer states that stat_bin2d is designed to work with geom = "rect" rather than "text". No satisfactory response is given.

The second question is almost identical to the first, with one crucial difference: the variables in the second question are text, not numeric. The answer there produces the desired result, placing the count for each bin over that bin in a stat_bin2d heat map.

To compare the two methods I've prepared the following code:

    library(ggplot2)
    data <- data.frame(x = rnorm(1000), y = rnorm(1000))
    ggplot(data, aes(x = x, y = y)) +
      geom_bin2d() + 
      stat_bin2d(geom="text", aes(label=..count..))

We know this first attempt gives the error:

"Error: geom_text requires the following missing aesthetics: x, y".

Same issue as in the first question. Interestingly, changing from stat_bin2d to stat_binhex works fine:

    library(ggplot2)
    data <- data.frame(x = rnorm(1000), y = rnorm(1000))
    ggplot(data, aes(x = x, y = y)) +
      geom_hex() +   # the hexagon geom is geom_hex(); there is no geom_binhex()
      stat_binhex(geom="text", aes(label=..count..))

Which is great and all, but generally I don't think hex binning is very clear, and for my purposes it won't work for the data I'm trying to describe. I really want to use stat_bin2d.

To get this to work, I've prepared the following workaround based on the second answer:

    library(ggplot2)
    data <- data.frame(x = rnorm(1000), y = rnorm(1000))
    x_t <- as.character(round(data$x, 0))  # round to whole numbers to define the bins
    y_t <- as.character(round(data$y, 0))
    x_x <- as.character(seq(-3, 3, 1))     # axis order for the discrete scales
    y_y <- as.character(seq(-3, 3, 1))
    data<-cbind(data,x_t,y_t)



    ggplot(data, aes(x = x_t, y = y_t)) +
      geom_bin2d() + 
      stat_bin2d(geom="text", aes(label=..count..))+
      scale_x_discrete(limits =x_x) +
      scale_y_discrete(limits=y_y) 

This workaround allows one to bin numerical data, but to do so you have to determine the bin width (I did it via rounding) before bringing the data into ggplot. I actually figured it out while writing this question, so I may as well finish. This is the result: (turns out I can't post images)
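The pre-binning idea can also be sketched while keeping the axes numeric. This is only an illustration under stated assumptions: the bin width of 1 and the names `x_breaks`, `y_breaks` and `counts` are made up for this example, and it uses `cut()` intervals rather than the rounding above, so the bin edges fall on whole numbers instead of being centred on them.

```r
library(ggplot2)

set.seed(1)
dat <- data.frame(x = rnorm(1000), y = rnorm(1000))

# Unit-wide bins covering the observed range (the width of 1 is an assumption)
x_breaks <- seq(floor(min(dat$x)), ceiling(max(dat$x)), by = 1)
y_breaks <- seq(floor(min(dat$y)), ceiling(max(dat$y)), by = 1)

# Assign every point to a bin index, keeping all bins as factor levels
xbin <- factor(cut(dat$x, x_breaks, labels = FALSE, include.lowest = TRUE),
               levels = seq_len(length(x_breaks) - 1))
ybin <- factor(cut(dat$y, y_breaks, labels = FALSE, include.lowest = TRUE),
               levels = seq_len(length(y_breaks) - 1))

# Count the points in each (xbin, ybin) cell
counts <- as.data.frame(table(xbin = xbin, ybin = ybin), responseName = "count")

# Convert bin indices to numeric bin midpoints and drop empty cells
counts$x <- (x_breaks[-1] - 0.5)[as.integer(counts$xbin)]
counts$y <- (y_breaks[-1] - 0.5)[as.integer(counts$ybin)]
counts <- counts[counts$count > 0, ]

# Draw the tiles and write each count on top of its tile
ggplot(counts, aes(x, y)) +
  geom_tile(aes(fill = count)) +
  geom_text(aes(label = count), colour = "white")
```

Because the counts are computed before plotting, the same `counts` data frame feeds both the tile layer and the text layer, so the labels cannot drift away from the bins.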

So my real question here is: does anyone have a better way to do this? I'm happy I at least got it to work, but so far I haven't seen an answer for putting labels on stat_bin2d bins when using a numerical variable.

Does anyone have a method for passing the x and y values from stat_bin2d on to geom_text without having to use a workaround? Can anyone explain why it works with text variables but not with numbers?

Solution

Another workaround (but perhaps less work). Similar to the ..count.. method, you can extract the counts from the plot object in two steps.

    library(ggplot2)

    set.seed(1)
    dat <- data.frame(x = rnorm(1000), y = rnorm(1000))

    # plot
    p <- ggplot(dat, aes(x = x, y = y)) + geom_bin2d()

    # Get data - this includes counts and x,y coordinates
    newdat <- ggplot_build(p)$data[[1]]

    # add in text labels
    p + geom_text(data = newdat, aes((xmin + xmax)/2, (ymin + ymax)/2,
                      label = count), col = "white")
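For readers on a newer ggplot2, a variant worth trying is sketched below. It rests on assumptions that should be checked against the installed version: that `after_stat()` is available (it replaced the `..count..` notation in ggplot2 3.3.0 and later), and that `stat_bin_2d()` returns the bin centres as `x` and `y`, which is what the text geom needs. The `bins = 10` setting and the names `p2` and `labels` are chosen only for this illustration.

```r
library(ggplot2)

set.seed(1)
dat <- data.frame(x = rnorm(1000), y = rnorm(1000))

# Both layers use the same binning so the labels line up with the tiles
p2 <- ggplot(dat, aes(x = x, y = y)) +
  geom_bin_2d(bins = 10) +
  stat_bin_2d(geom = "text", aes(label = after_stat(count)),
              bins = 10, colour = "white")

# Inspect what the text layer computed (second layer of the built plot)
labels <- ggplot_build(p2)$data[[2]]
head(labels[, c("x", "y", "count")])

p2
```

If the text layer still complains about missing x and y aesthetics, the installed version does not behave as assumed here, and the `ggplot_build()` approach above remains the safer route.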
