python scipy stats pareto fit: how does it work

Views: 79 | Published: 2021/7/16 20:56:47 | Tags: python scipy power-law


Problem description

... The help text and the online documentation say that the function scipy.stats.pareto.fit takes the dataset to be fitted as its argument, and optionally b (the exponent), loc, and scale. The result is returned as a triplet (exponent, loc, scale).

Generating data from the same distribution should result in the fit recovering the parameters that were used to generate the data, e.g. (using the Python 3 console)

$  python
Python 3.3.0 (default, Dec 12 2012, 07:43:02) 
[GCC 4.7.2] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>>

(The code lines below leave out the Python console prompt ">>>".)

import scipy.stats
dataset=scipy.stats.pareto.rvs(1.5,size=10000)  # generate data with exponent parameter 1.5
scipy.stats.pareto.fit(dataset)                 # fit all three parameters

This, however, results in

(1.0, nan, 0.0)

(exponent 1, should be 1.5) and

dataset=scipy.stats.pareto.rvs(1.1,size=10000)  #generating data
scipy.stats.pareto.fit(dataset)

results in

(1.0, nan, 0.0)

(exponent 1, should be 1.1) and

dataset=scipy.stats.pareto.rvs(4,loc=2.0,scale=0.4,size=10000)    #generating data
scipy.stats.pareto.fit(dataset)

(exponent should be 4, loc should be 2, scale should be 0.4) results in

(1.0, nan, 0.0)

and so on. Passing another exponent when calling the fit function

scipy.stats.pareto.fit(dataset,1.4)

always returns exactly this exponent

(1.3999999999999999, nan, 0.0)

The obvious question is: do I completely misunderstand the purpose of this fit function, is it meant to be used differently, or is it simply broken?
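One way to probe the question above is to hold loc and scale fixed and fit only the shape parameter. The following is a minimal sketch, not a diagnosis of the nan results: it assumes the `floc` and `fscale` keyword arguments of `fit` and the `random_state` argument of `rvs`, neither of which appears in the post, and it compares the fitted value with the closed-form maximum-likelihood estimate n / sum(log x), which holds for loc = 0 and scale = 1.

```python
import numpy as np
import scipy.stats

# Draw a reproducible sample with shape parameter b = 1.5
# (default loc = 0 and scale = 1, so all values are >= 1).
true_b = 1.5
dataset = scipy.stats.pareto.rvs(true_b, size=10000, random_state=0)

# Assumption: floc/fscale hold loc and scale fixed,
# so only the shape parameter b is estimated.
b_fit, loc_fit, scale_fit = scipy.stats.pareto.fit(dataset, floc=0, fscale=1)

# Closed-form maximum-likelihood estimate of b for loc = 0, scale = 1.
b_mle = len(dataset) / np.sum(np.log(dataset))

print("fit:", b_fit, loc_fit, scale_fit)
print("closed-form MLE:", b_mle)
```

If the two estimates agree and lie close to 1.5, the shape parameter can be recovered when the other two parameters are pinned, which narrows the problem to the free estimation of loc and scale.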

A remark: before someone mentions that dedicated functions such as those on Aaron Clauset's web pages (http://tuvalu.santafe.edu/~aaronc/powerlaws/) are more reliable than the scipy.stats methods and should be used instead: that may be true, but they are also very, very time-consuming. For datasets of 10000 points they take many, many hours (maybe days, weeks, years) on a normal PC.

Edit: oh, the parameter of the fit function is not the exponent of the distribution but the exponent minus 1 (this does not change the issue above).
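The relation between the shape parameter and the exponent can be checked numerically. This short sketch assumes the standardized density of scipy.stats.pareto is b / x^(b+1) for x >= 1, so that a shape parameter b corresponds to a density exponent of b + 1; the sample points in `x` are arbitrary illustrative values.

```python
import numpy as np
import scipy.stats

b = 1.5                               # shape parameter passed to scipy
x = np.array([1.5, 2.0, 5.0, 10.0])   # arbitrary points inside the support

# Density as computed by scipy (default loc = 0, scale = 1).
pdf_scipy = scipy.stats.pareto.pdf(x, b)

# Power law with exponent b + 1, i.e. "exponent minus 1" equals b.
pdf_formula = b / x ** (b + 1)

matches = np.allclose(pdf_scipy, pdf_formula)
print(matches)
```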
