How to separate the two leftmost bins of a histogram in R


Question

Suppose I need to plot a dataset like the one below:

set.seed(1)
dataset <- sample(1:7, 1000, replace = TRUE)
hist(dataset)

As the resulting plot shows, the two leftmost bins have no space between them, unlike the rest of the bins.

I tried changing xlim, but it didn't work. Basically, I would like each number (1 to 7) to be represented as its own bin, and I would also like any two adjacent bins to have space between them. Thanks!

Recommended answer

The best way is to set the breaks argument manually. Using the data from your code,

hist(dataset,breaks=rep(1:7,each=2)+c(-.4,.4))

This gives a plot in which each of the seven bars is separated from its neighbors by a gap.

The first part, rep(1:7,each=2), gives the numbers you want the bars centered on. The second part controls how wide the bars are: if you change it to c(-.49,.49) they almost touch, and if you change it to c(-.3,.3) you get narrower bars. If you set it to c(-.5,.5), R yells at you, because you aren't allowed to have the same number in your breaks vector twice.
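To make the effect of the second part concrete, here is a minimal sketch that builds the breaks vector for several half-widths and reports the resulting bar width and gap. The helper name make_breaks and the variable half_width are hypothetical, introduced only for illustration; the sketch only inspects the vectors and does not call hist().

```r
# Hypothetical helper: bar edges centered on `centers`,
# each bar extending `half_width` to either side.
make_breaks <- function(centers, half_width) {
  rep(centers, each = 2) + c(-half_width, half_width)
}

for (half_width in c(0.3, 0.4, 0.49, 0.5)) {
  b <- make_breaks(1:7, half_width)
  cat("half_width =", half_width,
      "| bar width =", 2 * half_width,
      "| gap =", 1 - 2 * half_width,
      "| repeated break values:", anyDuplicated(b) > 0, "\n")
}
```

Only the last case, half_width = 0.5, produces repeated values in the vector (1.5, 2.5, and so on), which is the situation the answer warns about.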

Why does this work?

If you split up the breaks vector, you get one part that looks like this:

> rep(1:7,each=2)
 [1] 1 1 2 2 3 3 4 4 5 5 6 6 7 7

and a second part that looks like this:

> c(-.4,.4)
 [1] -0.4  0.4

When you add them together, R recycles the second vector, repeating it as many times as needed to make it as long as the first vector. So you end up with

  1-0.4  1+0.4  2-0.4  2+0.4  3-0.4  3+0.4 [etc.]
=   0.6    1.4    1.6    2.4    2.6    3.4 [etc.]
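The recycling step can be written out explicitly as a quick check. In this sketch the variable names centers, offsets and recycled are illustrative only; rep(..., length.out = ) is used to spell out what R does implicitly during the addition.

```r
centers <- rep(1:7, each = 2)   # 1 1 2 2 ... 7 7
offsets <- c(-0.4, 0.4)

# What R does implicitly: repeat the shorter vector to the longer length
recycled <- rep(offsets, length.out = length(centers))

breaks <- centers + offsets     # implicit recycling
isTRUE(all.equal(breaks, centers + recycled))   # TRUE

head(breaks)
# [1] 0.6 1.4 1.6 2.4 2.6 3.4
```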

Thus, you have one bar from 0.6 to 1.4 (centered on 1, with width 2*.4), another bar from 1.6 to 2.4 (centered on 2, with width 2*.4), and so on. The breaks also define narrow bins in between, such as the one from 1.4 to 1.6, but with integer data these stay empty and show up as the gaps. If you had data in between (e.g. 2.5), the histogram would look kind of silly, because it would draw a bar from 2.4 to 2.6, and the bar widths would not be even (that bar would be only .2 wide, while all the others are .8). But with only integer values that's not a problem.
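The claim about in-between values can be checked without drawing anything by asking hist() for its counts with plot = FALSE. This is a sketch that reuses the data from the question; the names is_gap, h and h2 are illustrative.

```r
set.seed(1)
dataset <- sample(1:7, 1000, replace = TRUE)
breaks <- rep(1:7, each = 2) + c(-0.4, 0.4)

# 14 breaks define 13 bins: odd-numbered bins are bars, even-numbered are gaps
h <- hist(dataset, breaks = breaks, plot = FALSE)
is_gap <- seq_along(h$counts) %% 2 == 0
h$counts[is_gap]    # all zero for integer data

# Adding a single non-integer value puts one observation into a gap bin
h2 <- hist(c(dataset, 2.5), breaks = breaks, plot = FALSE)
h2$counts[is_gap]   # the bin from 2.4 to 2.6 now holds one observation

h$equidist          # FALSE: bin widths alternate between 0.8 and 0.2
```

Because the breaks are not equally spaced, hist() draws densities rather than counts on the y-axis by default when the plot is actually drawn, which is worth keeping in mind when reading the result.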
