How can I plot a histogram of long-tailed data using R?


Question

I have data that is mostly concentrated in a small range (1-10), but a significant number of points (say, 10%) lie in (10-1000). I would like to plot a histogram of this data that focuses on (1-10) but also shows the (10-1000) data. Something like a log scale for the histogram.

Yes, I know this means that not all bins are of equal size.

A simple hist(x) gives one plot (figure omitted), while hist(x,breaks=c(0,1,1.1,1.2,1.3,1.4,1.5,1.6,1.7,1.8,1.9,2,3,4,5,7.5,10,15,20,50,100,200,500,1000,10000)) gives another (figure omitted).

Neither is what I want.

Update: following the answers here, I now produce something that is almost exactly what I want (I went with a continuous plot instead of a bar histogram):

breaks <- c(0,1,1.1,1.2,1.3,1.4,1.5,1.6,1.7,1.8,1.9,2,4,8)
ggplot(t,aes(x)) + geom_histogram(colour="darkblue", size=1, fill="blue") + scale_x_log10('true size/predicted size', breaks = breaks, labels = breaks)

The only problem is that I'd like the axis scale to match the bars that are actually plotted. There are two options for doing that. One is simply to use the actual edges of the plotted bars (how?) and accept "ugly" x-axis labels like 1.1754, 1.2985, etc. The other, which I prefer, is to control the bin edges that are actually used so that they match the breaks.

Recommended answer

Using ggplot2 seems like the easiest option. If you want more control over your axes and your breaks, you can do something like the following:

(New code provided.)

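# Simulated long-tailed data: a bulk of small values plus a tail of large ones
x <- c(rexp(1000, 0.5) + 0.5, rexp(100, 0.5) * 100)

# Tick positions on the original scale (0 is left out: log10(0) is -Inf)
breaks <- c(0.1, 0.2, 0.5, 1, 2, 5, 10, 20, 50, 100, 200, 500, 1000, 10000)
major <- c(0.1, 1, 10, 100, 1000, 10000)

# Bin the log-transformed data without plotting
H <- hist(log10(x), plot = FALSE)

# Empty frame that spans the full bars; the x axis is added by hand below
plot(range(H$breaks), c(0, max(H$counts)), type = "n",
     xaxt = "n",
     xlab = "X", ylab = "Counts",
     main = "Histogram of X")
# Grid lines at the minor ticks (dashed), the major ticks, and the count levels
abline(v = log10(breaks), col = "lightgrey", lty = 2)
abline(v = log10(major), col = "lightgrey")
abline(h = pretty(H$counts), col = "lightgrey")
plot(H, add = TRUE, freq = TRUE, col = "blue")

# Position of ticks on the log10 axis
at <- log10(breaks)

# Creation of the x axis, labelled on the original scale
axis(1, at = at, labels = breaks)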

This is as close as I can get to the ggplot2 look. Making the background grey is not that straightforward, but it is doable if you draw a rectangle the size of the plot region and fill it with grey.
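The grey rectangle mentioned above could look roughly like the following. This is only a sketch: it assumes the same simulated data as in the answer, the seed and the "grey90" shade are arbitrary choices, and white grid lines are used purely to imitate the ggplot2 look. The key step is reading the plot region limits from par("usr") after the empty frame has been set up.

```r
set.seed(1)  # arbitrary seed, only for reproducibility
x <- c(rexp(1000, 0.5) + 0.5, rexp(100, 0.5) * 100)
H <- hist(log10(x), plot = FALSE)
major <- c(0.1, 1, 10, 100, 1000, 10000)

# Empty frame first, so that par("usr") describes the final plot region
plot(range(H$breaks), c(0, max(H$counts)), type = "n", xaxt = "n",
     xlab = "X", ylab = "Counts", main = "Histogram of X")

usr <- par("usr")  # plot region limits: c(x1, x2, y1, y2)
rect(usr[1], usr[3], usr[2], usr[4], col = "grey90", border = NA)

# White grid lines, then the bars, then the axis and frame on top
abline(v = log10(major), h = pretty(H$counts), col = "white")
plot(H, add = TRUE, freq = TRUE, col = "blue")
axis(1, at = log10(major), labels = major)
box()
```

The rectangle has to be drawn before the grid lines and bars, because base graphics simply paints each element over whatever is already there.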

Check the help pages of all the functions I used, and also ?par. They will allow you to build your own graphs. Hope this helps.
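If the bin edges themselves should coincide with the labelled ticks, which is the option the question prefers, one possibility in base R is to pass the log-transformed edges to hist() directly. The sketch below is an illustration rather than part of the original answer: the names decades and edges are made up here, the seed is arbitrary, and the 1-2-5 spacing is just one convenient choice. The edges are built from the data range because hist() stops with an error when the breaks do not span all values.

```r
set.seed(42)  # arbitrary seed, only for reproducibility
x <- c(rexp(1000, 0.5) + 0.5, rexp(100, 0.5) * 100)

# 1-2-5 edges for every decade the data touches, so they always span range(x)
decades <- floor(log10(min(x))):ceiling(log10(max(x)))
edges <- sort(as.vector(outer(c(1, 2, 5), 10^decades)))

# Bin on the log10 scale using exactly those edges
H <- hist(log10(x), breaks = log10(edges), plot = FALSE)

# Draw the bars without axes, then label each bin edge on the original scale
plot(H, axes = FALSE, col = "blue", xlab = "X", main = "Histogram of X")
axis(1, at = log10(edges), labels = edges)
axis(2)
```

Because these bins are not equally wide on the log axis, plot() shows densities instead of counts by default, so the bar areas rather than the bar heights are proportional to the counts. Labels that would overlap are dropped by axis() automatically.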
