How should the interquartile range be calculated in Python?

Question

I have a list of numbers, [1, 2, 3, 4, 5, 6, 7], and I want a function that returns the interquartile range of this list. The interquartile range is the difference between the upper and lower quartiles. I have tried to calculate it by hand, with NumPy functions, and with Wolfram Alpha, and all three answers are different. I do not know why this is.

My attempt in Python is as follows:

>>> import numpy
>>> a = numpy.array([1, 2, 3, 4, 5, 6, 7])
>>> numpy.percentile(a, 25)
2.5
>>> numpy.percentile(a, 75)
5.5
>>> numpy.percentile(a, 75) - numpy.percentile(a, 25) # IQR
3.0

My attempt in Wolfram Alpha is as follows:

  • "first quartile 1, 2, 3, 4, 5, 6, 7": 2.25
  • "third quartile 1, 2, 3, 4, 5, 6, 7": 5.75
  • (comment: 5.75 - 2.25 = 3.5)
  • "interquartile range 1, 2, 3, 4, 5, 6, 7": ~3.5

So the values returned by NumPy and Wolfram Alpha for the first quartile, the third quartile, and the interquartile range agree neither with each other nor with what I think they should be. Why is this? What should I be doing in Python to calculate the interquartile range correctly?

As far as I am aware, the interquartile range of [1, 2, 3, 4, 5, 6, 7] should be the following:

median(5, 6, 7) - median(1, 2, 3) = 4.
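
The manual rule above can be written as a small function. This is only a sketch of the asker's definition (median of the upper half minus median of the lower half, leaving out the middle element when the count is odd); the name `iqr_median_of_halves` is made up for illustration and is not a NumPy or standard-library function.

```python
from statistics import median


def iqr_median_of_halves(values):
    """IQR as median(upper half) - median(lower half).

    When the number of values is odd, the middle element belongs to
    neither half. This is one convention among several.
    """
    data = sorted(values)
    if len(data) < 2:
        raise ValueError("need at least two values")
    half = len(data) // 2
    lower = data[:half]    # smallest `half` values
    upper = data[-half:]   # largest `half` values
    return median(upper) - median(lower)


print(iqr_median_of_halves([1, 2, 3, 4, 5, 6, 7]))  # median(5, 6, 7) - median(1, 2, 3)
```

For the list in the question this gives 6 - 2 = 4, matching the hand calculation.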

Recommended answer

You have 7 numbers that you are trying to split into quartiles. Because 7 is not divisible by 4, there are several different ways to do this, as mentioned here.
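
One way to see how the conventions diverge is to treat each as a rule that maps a fraction p to a 1-based position in the sorted data and then interpolates linearly between neighbours. The sketch below is illustrative only: `quantile_at` and `RULES` are hypothetical names, and the `n*p + 1/2` rule is included because it reproduces the Wolfram Alpha numbers quoted in the question, which is an inference from those numbers and not a statement about how Wolfram Alpha is implemented.

```python
import math


def quantile_at(values, p, position):
    """Linearly interpolate sorted data at the 1-based position given by position(n, p)."""
    data = sorted(values)
    n = len(data)
    # Clamp the position so it always falls inside the data.
    h = min(max(position(n, p), 1.0), float(n))
    lo = math.floor(h)
    hi = min(lo + 1, n)
    frac = h - lo
    return data[lo - 1] + frac * (data[hi - 1] - data[lo - 1])


# Position rules (assumed for illustration; each is one common convention).
RULES = {
    "(n - 1)*p + 1": lambda n, p: (n - 1) * p + 1,  # NumPy's default linear rule
    "n*p + 1/2": lambda n, p: n * p + 0.5,          # matches the Wolfram Alpha figures above
    "(n + 1)*p": lambda n, p: (n + 1) * p,          # matches the manual result for this list
}

data = [1, 2, 3, 4, 5, 6, 7]
for name, rule in RULES.items():
    q1 = quantile_at(data, 0.25, rule)
    q3 = quantile_at(data, 0.75, rule)
    print(f"{name}: Q1={q1}, Q3={q3}, IQR={q3 - q1}")
```

For this list the three rules give interquartile ranges of 3.0, 3.5 and 4.0 respectively. The agreement between `(n + 1)*p` and the median-of-halves calculation holds for this data set and should not be assumed in general.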

Your way is the first one given by that link; Wolfram Alpha seems to be using the third. NumPy is doing basically the same thing as Wolfram Alpha, but it is interpolating based on percentiles (as shown here) rather than quartiles, so it gets a different answer. You can choose how NumPy handles this using the interpolation option (I tried to link to the documentation, but apparently I'm only allowed two links per post).
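
As a rough illustration of that option: the sketch below assumes NumPy 1.22 or later, where the keyword is spelled `method`. Older releases call it `interpolation` and accept only `linear`, `lower`, `higher`, `midpoint` and `nearest`. The helper name `iqr` is made up for this example.

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5, 6, 7])


def iqr(values, method="linear"):
    """Interquartile range using the given np.percentile method (NumPy >= 1.22 assumed)."""
    q1, q3 = np.percentile(values, [25, 75], method=method)
    return q3 - q1


# "linear" is NumPy's default; "hazen" and "weibull" are two of the
# alternative estimation methods available in newer NumPy releases.
for m in ("linear", "hazen", "weibull"):
    print(m, iqr(a, m))
```

On this list, `linear` reproduces the 3.0 from the question, `hazen` gives the 3.5 reported for Wolfram Alpha, and `weibull` gives the 4.0 from the manual calculation.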

You'll have to choose which definition you prefer for your application.
