Compute a confidence interval from sample data assuming unknown distribution

Views: 22 Published: 2021/12/31 11:56:21 Tags: python scipy statistics statsmodels confidence-interval


Question

I have sample data for which I would like to compute a confidence interval, assuming the distribution is not normal and is unknown. The data look roughly Pareto-distributed, but I don't know for sure.

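Answers for the normal-distribution case:

Compute a confidence interval from sample data

Correct way to obtain confidence interval with scipy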

Recommended answer

If you don't know the underlying distribution, then my first thought would be to use bootstrapping: https://en.wikipedia.org/wiki/Bootstrapping_(statistics)

In pseudo-code, assuming x is a numpy array containing your data:

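import numpy as np

N = 10000  # number of bootstrap resamples
mean_estimates = []
for _ in range(N):
    # Draw len(x) indices with replacement, then record the mean of that resample.
    re_sample_idx = np.random.randint(0, len(x), x.shape)
    mean_estimates.append(np.mean(x[re_sample_idx]))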

mean_estimates is now a list of 10000 estimates of the mean of the distribution. Take the 2.5th and 97.5th percentiles of these 10000 values, and you have a 95% confidence interval around the mean of your data:

# Sort the bootstrap means and read off the 2.5th and 97.5th percentiles.
sorted_estimates = np.sort(np.array(mean_estimates))
conf_interval = [sorted_estimates[int(0.025 * N)], sorted_estimates[int(0.975 * N)]]
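
The two snippets above can be combined into one self-contained function. What follows is a minimal sketch rather than a definitive implementation: the function name bootstrap_mean_ci, its parameters, and the synthetic Pareto-like sample are all assumptions made for illustration and do not come from the original answer. It draws all resamples in one vectorized step and uses np.percentile instead of indexing into a sorted array.

```python
import numpy as np


def bootstrap_mean_ci(x, n_resamples=10_000, confidence=0.95, seed=None):
    """Percentile-bootstrap confidence interval for the mean of x.

    Hypothetical helper for illustration; not part of numpy or scipy.
    """
    x = np.asarray(x, dtype=float)
    rng = np.random.default_rng(seed)
    # Each row holds len(x) indices drawn with replacement (one resample per row).
    idx = rng.integers(0, len(x), size=(n_resamples, len(x)))
    means = x[idx].mean(axis=1)
    # Cut off alpha/2 of the bootstrap means in each tail.
    alpha = 1.0 - confidence
    lower, upper = np.percentile(means, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return lower, upper


# Synthetic heavy-tailed data standing in for the real sample (an assumption).
data_rng = np.random.default_rng(0)
x = data_rng.pareto(3.0, size=200) + 1.0

low, high = bootstrap_mean_ci(x, seed=1)
print(f"95% CI for the mean: [{low:.3f}, {high:.3f}]")
```

Passing a seed makes the interval reproducible between runs. For very heavy-tailed data, bootstrap intervals for the mean can be unreliable, so it may be worth checking how the interval changes with the number of resamples and the sample size.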
