How to calculate mean, mode, variance, standard deviation etc. of output in Python?

Question

I have a simple game based on probabilities. Every day we toss a coin: if we get heads we win $20, and if we get tails we lose $19. At the end of the month (28 days) we see how much we have made or lost.

import random

def coin_tossing_game():
    random_numbers = [random.randint(0, 1) for x in range(500)] #generate 500 random numbers
    for x in random_numbers:
        if x == 0: #if we get heads
            return 20 #we win $20
        elif x == 1: #if we get tails
            return -19 #we lose $19


for a in range(28): #for each of the 28 days of the month
    print(coin_tossing_game())

This returns the output 20 20 -19 -19 -19 -19 -19 20 -19 20 -19 20 -19 20 20 -19 -19 20 20 -19 -19 -19 20 20 20 -19 -19 -19 20 20

This output is exactly what I expected. I want to find the sum of the output and other descriptive statistics such as the mean, mode, median, standard deviation and confidence intervals. So far I have had to copy and paste the data into Excel to do this analysis. Is there a quick and easy way to do it in Python?

Recommended answer

You're asking how. The most immediately available option is built into Python in the form of the statistics library. But again, you seem to want to know how to do this yourself. The following code shows the basics, which I haven't felt the need to do for almost 50 years.

First, modify your code so that it captures the sample in a vector (a Python list). In my code it's called sample.

The first part of the code simply exercises the Python statistics library. No sweat there.
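As a small illustration of the library on fixed data, the sketch below feeds the 30 values posted in the question to the statistics module, with the built-in sum for the total. The name posted_output is only illustrative and is not part of the original code.

```python
import statistics

# The 30 values copied from the output shown in the question.
posted_output = [
    20, 20, -19, -19, -19, -19, -19, 20, -19, 20,
    -19, 20, -19, 20, 20, -19, -19, 20, 20, -19,
    -19, -19, 20, 20, 20, -19, -19, -19, 20, 20,
]

total = sum(posted_output)                # built-in, no library needed
mean = statistics.mean(posted_output)
median = statistics.median(posted_output)
mode = statistics.mode(posted_output)     # most frequent value
stdev = statistics.stdev(posted_output)   # sample standard deviation (n - 1)

print("total: ", total)
print("mean:  ", mean)
print("median:", median)
print("mode:  ", mode)
print("stdev: ", stdev)
```

For these values the total comes to -24 and the mean to -0.8. With 16 losses against 14 wins, both the median and the mode are -19.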

The second part of the code shows how to accumulate the sum of the values in the sample, and the sum of the squares of their deviations from the mean. I leave it to you to work out how to calculate the sample variance, sample standard deviation and confidence intervals, under the usual assumptions, from these statistics. Having sorted and renamed the sample, I pick out the minimum and maximum values (useful for estimation with some distributions). Finally I calculate the median from the sorted sample. I leave calculation of the mode to you.

import random

def coin_tossing_game():
    random_numbers = [random.randint(0, 1) for x in range(500)] #generate 500 random numbers
    for x in random_numbers:
        if x == 0: #if we get heads
            return 20 #we win $20
        elif x == 1: #if we get tails
            return -19 #we lose $19

sample = []
for a in range(28): #for each of the 28 days of the month
    sample.append(coin_tossing_game())

## the easy way

import statistics

print(statistics.mean(sample))
print(statistics.median(sample))
print(statistics.mode(sample))
print(statistics.stdev(sample)) #sample standard deviation (n - 1 denominator)
print(statistics.variance(sample)) #sample variance (n - 1 denominator)

## the hard way

sample.sort()
orderedSample = sample
N = len(sample)
minSample = orderedSample[0]
maxSample = orderedSample[-1]
sumX = 0
for x in sample:
    sumX += x
mean = sumX / N

sumDeviates2 = 0
for x in sample:
    sumDeviates2 += ( x-mean )**2

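#median: middle value of the sorted sample, or the average of the two middle values when N is even
k = N//2
if N%2==0:
    median = 0.5* (orderedSample[k]+orderedSample[k-1])
else:
    median = orderedSample[k]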
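The quantities the answer leaves to the reader can be sketched from the same running sums. This is a minimal sketch, not the answerer's own code. It assumes Python 3.8+ (for statistics.NormalDist) and uses a normal approximation for the confidence interval. A t-based interval would be somewhat wider for a sample this small. The seeded sample, halfWidth, confidenceInterval and the Counter-based mode are illustrative choices, not part of the original.

```python
import math
import random
import statistics
from collections import Counter

# Hypothetical stand-in for the answer's `sample`: 28 daily results of +20 or -19.
random.seed(0)  # fixed seed only so the sketch is repeatable
sample = [random.choice([20, -19]) for _ in range(28)]

N = len(sample)
mean = sum(sample) / N
sumDeviates2 = sum((x - mean) ** 2 for x in sample)

# Sample variance and standard deviation divide by N - 1, not N.
variance = sumDeviates2 / (N - 1)
stdev = math.sqrt(variance)

# Approximate 95% confidence interval for the mean,
# assuming the sample mean is roughly normally distributed.
z = statistics.NormalDist().inv_cdf(0.975)  # roughly 1.96
halfWidth = z * stdev / math.sqrt(N)
confidenceInterval = (mean - halfWidth, mean + halfWidth)

# Mode: the value that occurs most often in the sample.
mode = Counter(sample).most_common(1)[0][0]

print("variance:", variance)
print("stdev:   ", stdev)
print("95% CI:  ", confidenceInterval)
print("mode:    ", mode)
```

The hand-computed variance and standard deviation should agree with statistics.variance(sample) and statistics.stdev(sample), which makes a convenient cross-check.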
