In ggplot2, what do the ends of the boxplot lines represent?
Problem description

I can't find a description of what the end points of the lines of a boxplot represent. For example, here there are point values above and below where the lines end.

(I realize that the bottom and top of the box are the 25th and 75th percentiles, and that the centerline is the 50th.) I assume that, since there are points above and below the lines, the line ends do not represent the max/min values.

The "dots" beyond the ends of the boxplot lines represent outliers. There are a number of different rules for determining whether a point is an outlier, but the method that R and ggplot use is the "1.5 rule". If a data point is:

- less than Q_1 - 1.5 * IQR
- greater than Q_3 + 1.5 * IQR

then that point is classed as an "outlier". The whiskers are defined as:

upper whisker = min(max(x), Q_3 + 1.5 * IQR)
lower whisker = max(min(x), Q_1 - 1.5 * IQR)

where IQR = Q_3 - Q_1, the box length. So the upper whisker is located at the smaller of the maximum x value and Q_3 + 1.5 * IQR, whereas the lower whisker is located at the larger of the smallest x value and Q_1 - 1.5 * IQR. (Strictly, when a limit is exceeded, the whisker is drawn at the most extreme data point that lies inside that limit, not at the limit itself.)

Additional information

Example

Consider the following example, whose code (three boxplot() calls) is listed at the end of this answer. It gives the following plot:

[Plot: three side-by-side boxplots of the same data, titled range=1.7, range=1.5 and range=0]

As we decrease range from 1.7 to 1.5 we reduce the length of the whisker. However, range=0 is a special case, as noted at the end of this answer.
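As a rough check of the 1.5 rule and the whisker definitions, here is a minimal base R sketch, separate from the answer's own example. It assumes the quartiles are taken as Tukey's hinges from fivenum(), which is what base R's boxplot.stats() uses. ggplot2 computes its quartiles with quantile(), so its box edges can differ slightly for small samples. The variable names (lower_limit, upper_limit, is_outlier and so on) are illustrative only.

```r
set.seed(1)
x <- rlnorm(20, 1/2)             # same skewed data as in the answer's example

h   <- fivenum(x)                # min, lower hinge, median, upper hinge, max
q1  <- h[2]
q3  <- h[4]
iqr <- q3 - q1                   # box length

# Limits of the 1.5 rule
lower_limit <- q1 - 1.5 * iqr
upper_limit <- q3 + 1.5 * iqr

# Points outside the limits are drawn as individual dots
is_outlier <- x < lower_limit | x > upper_limit
outliers   <- x[is_outlier]

# Whiskers end at the most extreme data points that are not outliers
lower_whisker <- min(x[!is_outlier])
upper_whisker <- max(x[!is_outlier])

# Compare with the numbers base R computes for boxplot()
bs <- boxplot.stats(x, coef = 1.5)
print(c(lower = lower_whisker, upper = upper_whisker))
print(bs$stats[c(1, 5)])
print(outliers)
print(bs$out)
```

If there are no outliers on one side, the whisker on that side simply ends at min(x) or max(x), which matches the min/max form of the definitions above.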
    set.seed(1)
    x = rlnorm(20, 1/2)   # skewed data
    par(mfrow = c(1, 3))  # three plots side by side
    boxplot(x, range = 1.7, main = "range=1.7")
    boxplot(x, range = 1.5, main = "range=1.5")  # default
    boxplot(x, range = 0, main = "range=0")      # the same as range = "very big number"

range=0 is a special case: it is equivalent to "range=infinity", so the whiskers extend to the data extremes and no points are drawn as outliers.
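The effect of range on the whiskers can also be seen numerically, without drawing anything. The sketch below assumes only base R: boxplot() passes its range argument to boxplot.stats() as coef, so the whisker ends can be read from the stats component. The helper whisker_ends() is a hypothetical name introduced here for illustration.

```r
set.seed(1)
x <- rlnorm(20, 1/2)   # same skewed data as above

# Hypothetical helper: lower and upper whisker ends for a given multiplier
whisker_ends <- function(x, coef) {
  boxplot.stats(x, coef = coef)$stats[c(1, 5)]
}

w17 <- whisker_ends(x, 1.7)
w15 <- whisker_ends(x, 1.5)
w0  <- whisker_ends(x, 0)    # special case: whiskers go to the data extremes

print(rbind("range=1.7" = w17, "range=1.5" = w15, "range=0" = w0))
print(range(x))              # should agree with the range=0 row
```

A smaller multiplier can only keep the whiskers the same or shorten them, because every point counted as an outlier at 1.7 is also an outlier at 1.5. In ggplot2 the corresponding setting appears to be the coef argument of geom_boxplot() / stat_boxplot(), which also defaults to 1.5; treat that mapping as an assumption to check against the ggplot2 documentation.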