Why is the first bar so big in my R histogram?

Question

I'm playing around with R. I'm trying to visualize the distribution of 1000 dice throws with the following R script:

cases <- 1000

min <- 1
max <- 6

x <- as.integer(runif(cases,min,max+1))
mx <- mean(x)
sd <- sd(x)

hist(
  x,
  xlim=c(min - abs(mx/2),max + abs(mx/2)),
  main=paste(cases,"Samples"),
  freq = FALSE,
  breaks=seq(min,max,1)
)

curve(dnorm(x, mx, sd), add = TRUE, col="blue", lwd = 2)
abline(v = mx, col = "red", lwd = 2)

legend("bottomleft", 
       legend=c(paste('Mean (', mx, ')')), 
       col=c('red'), lwd=2, lty=c(1))

The script produces the following histogram:

Can someone explain to me why the first bar is so big? I've checked the data and it looks fine. How can I fix this?

Thanks in advance!

Recommended answer

Histograms aren't good for discrete data; they're designed for continuous data. Your data looks something like this:

> table(x)
x
  1   2   3   4   5   6 
174 138 162 178 196 152 

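i.e. roughly equal numbers of each value. But when you put that in a histogram, you chose breakpoints at 1:6. With the hist() defaults (right = TRUE, include.lowest = TRUE), the first bar covers the closed interval [1, 2], so it counts both the 174 entries on its left limit and the 138 entries on its right limit and represents 312 of the 1000 throws. Every other bar covers a half-open interval such as (2, 3] and counts only the value on its right limit.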
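
The double counting can be checked without drawing anything. The sketch below rebuilds a vector with the same counts as the table above (the name rolls is made up for illustration and is not part of the original script) and asks hist() for the bin counts only:

```r
# Rebuild data with the same counts as the table above.
# "rolls" is a hypothetical name used only for this illustration.
rolls <- rep(1:6, times = c(174, 138, 162, 178, 196, 152))

# Breaks at the integers, as in the question.
# plot = FALSE returns the bin information without drawing.
h <- hist(rolls, breaks = seq(1, 6, 1), plot = FALSE)

# Five bins: [1,2], (2,3], (3,4], (4,5], (5,6]
print(h$counts)
# 312 162 178 196 152
```

Six distinct values are squeezed into five bins, so one bin has to hold two of them, and with the default settings it is the first one.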

You could get a better-looking histogram by specifying breaks at the half integers, i.e. breaks = 0:6 + 0.5, but it still doesn't make sense to use a histogram for data like this. Simply running plot(table(x)) or barplot(table(x)) gives a more accurate depiction of the data.
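
As a rough sketch of both suggestions, assuming data with the same counts as the table above (rolls and counts are illustrative names, and the axis labels are arbitrary choices):

```r
# Hypothetical data with the same counts as the table above
rolls <- rep(1:6, times = c(174, 138, 162, 178, 196, 152))

# Breaks at the half integers: each bin (k - 0.5, k + 0.5] holds exactly one face
h <- hist(rolls, breaks = 0:6 + 0.5, plot = FALSE)
print(h$counts)
# 174 138 162 178 196 152

# Bar plot of the tabulated counts, one bar per face
counts <- table(rolls)
barplot(counts, xlab = "Face", ylab = "Frequency")
```

With half-integer breaks there are six bins for six values, so the histogram counts agree with table(); the bar plot reaches the same picture without any binning decisions.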
