Why is the first bar so big in my R histogram?
Question
I'm playing around with R. I'm trying to visualize the distribution of 1000 dice throws with the following R script:
cases <- 1000
min <- 1
max <- 6
x <- as.integer(runif(cases,min,max+1))
mx <- mean(x)
sd <- sd(x)
hist(
  x,
  xlim = c(min - abs(mx/2), max + abs(mx/2)),
  main = paste(cases, "Samples"),
  freq = FALSE,
  breaks = seq(min, max, 1)
)
curve(dnorm(x, mx, sd), add = TRUE, col = "blue", lwd = 2)
abline(v = mx, col = "red", lwd = 2)
legend("bottomleft",
       legend = c(paste('Mean (', mx, ')')),
       col = c('red'), lwd = 2, lty = c(1))
The script produces the following histogram:
Can someone explain to me why the first bar is so big? I've checked the data and it looks fine. How can I fix this?
Thanks in advance!
Recommended answer
Histograms aren't well suited to discrete data; they're designed for continuous data. Your data looks something like this:
> table(x)
x
1 2 3 4 5 6
174 138 162 178 196 152
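i.e. roughly equal numbers of each value. But when you put that in a histogram, you chose breakpoints at 1:6, which gives only five bins for six values. By default hist() uses right-closed bins and also includes the lowest break in the first bin, so the first bar covers [1, 2]: it collects the 174 entries on its left limit and the 138 on its right limit, 312 in total (a density of 0.312 with freq = FALSE). Every other bar holds just one value.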
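To check the bin counts without drawing anything, here is a minimal sketch. It rebuilds x from the table(x) counts shown above rather than drawing a fresh random sample (that substitution is an assumption made for reproducibility), and the variable names h and h_left are only illustrative.

```r
# Rebuild the data from the table(x) counts above (not a new random sample).
x <- rep(1:6, times = c(174, 138, 162, 178, 196, 152))

# Same breaks as in the question; plot = FALSE returns the bin data only.
h <- hist(x, breaks = 1:6, plot = FALSE)
h$counts
# [1] 312 162 178 196 152   (first bin holds the 1s and the 2s)

# Left-closed bins only move the problem to the last bar.
h_left <- hist(x, breaks = 1:6, right = FALSE, plot = FALSE)
h_left$counts
# [1] 174 138 162 178 348   (last bin holds the 5s and the 6s)
```

Either way, six distinct values are squeezed into five bins, so one bar always ends up with two values.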
You could get a better-looking histogram by specifying breaks at the half integers, i.e. breaks = 0:6 + 0.5, but it still doesn't make much sense to use a histogram for data like this. Simply running plot(table(x)) or barplot(table(x)) gives a more accurate depiction of the data.
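As a rough illustration of both suggestions, the sketch below again assumes x is rebuilt from the table(x) counts shown earlier. The axis labels and title passed to barplot() are illustrative choices, not part of the original script.

```r
# Rebuild the data from the table(x) counts above (not a new random sample).
x <- rep(1:6, times = c(174, 138, 162, 178, 196, 152))

# Half-integer breaks give six bins, each centered on one die face.
centered <- hist(x, breaks = 0:6 + 0.5, plot = FALSE)
centered$counts
# [1] 174 138 162 178 196 152

# A bar plot of the frequency table shows the discrete data directly.
barplot(table(x), xlab = "Face", ylab = "Count", main = "1000 dice throws")
```

With the half-integer breaks each bin contains exactly one face, so the counts match table(x).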