编码挑战:滚动平均值和标准差 [英] Coding challenge: rolling average and standard deviation

查看:176
本文介绍了编码挑战:滚动平均值和标准差的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

给定可能永远不会结束的浮点数据流(将政治家的语音转换为二进制并转换为4字节浮点数),计算滚动平均值和标准差。有关这方面的一些背景信息可以在高效准确的滚动标准偏差 - Mindful Programmer [ ^ ]。您面临的挑战是如何有效地编码。或者可笑。您的电话。



上周:



第一:非常感谢格里夫为他的谜题。非常好。



Graeme在上周的挑战赛中再次入手。我们需要给他一个让步吗?也许类似你的代码不能包含字母E或+符号。声音公平吗?

Given a stream of floating point data that may never end (think of a politician's speech converted to binary and cast to 4 byte floats), calculate a rolling average and standard deviation. Some background on this can be found at Efficient and accurate rolling standard deviation – The Mindful Programmer[^]. Your challenge is to code this efficiently. Or ridiculously. Your call.

Last week:

First: huge thanks to Griff for his puzzle. Very nice.

Graeme has once again nailed it in last week's challenge. Do we need to give him a handicap? Maybe something like "your code cannot contain the letter E or the + sign. Sound fair?

推荐答案

这是一个C#版本我敲了一会儿:

Here's a C# version I knocked up a while back:
public struct StandardDeviationData
{
    private readonly double _sum;
    private readonly double _sumOfSquares;

    private StandardDeviationData(uint count, double sum, double sumOfSquares)
    {
        Count = count;
        _sum = sum;
        _sumOfSquares = sumOfSquares;
    }

    public uint Count { get; }
    
    public double Average => (Count == 0) ? double.NaN : _sum / Count;
    
    // The uncorrected sample standard deviation:
    // https://en.wikipedia.org/wiki/Standard_deviation#Uncorrected_sample_standard_deviation
    public double UncorrectedSampleStandardDeviation
    {
        get
        {
            if (Count == 0) return double.NaN;

            var diff = _sumOfSquares - (_sum * _sum / Count);
            return Math.Sqrt(diff / Count);
        }
    }

    // The corrected sample standard deviation:
    // https://en.wikipedia.org/wiki/Standard_deviation#Corrected_sample_standard_deviation
    public double CorrectedSampleStandardDeviation
    {
        get
        {
            if (Count < 2) return double.NaN;

            var diff = _sumOfSquares - (_sum * _sum / Count);
            return Math.Sqrt(diff / (Count - 1));
        }
    }

    public StandardDeviationData Add(double value)
    {
        return new StandardDeviationData(checked(Count + 1), _sum + value, _sumOfSquares + (value * value));
    }

    public static StandardDeviationData operator +(StandardDeviationData data, double value)
    {
        return data.Add(value);
    }

    public static StandardDeviationData Create(IReadOnlyCollection<double> values)
    {
        double sum = 0, sumOfSquares = 0;
        foreach (double x in values)
        {
            sum += x;
            sumOfSquares += x * x;
        }
        
        return new StandardDeviationData((uint)values.Count, sum, sumOfSquares);
    }
}

public static class Extensions
{
    private static IEnumerable<StandardDeviationData> RollingStandardDeviationIterator(IEnumerable<double> source)
    {
        var result = default(StandardDeviationData);
        foreach (double value in source)
        {
            result += value;
            yield return result;
        }
    }

    public static IEnumerable<StandardDeviationData> RollingStandardDeviation(this IEnumerable<double> source)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        return RollingStandardDeviationIterator(source);
    }
    
    public static IEnumerable<StandardDeviationData> RollingStandardDeviation(this IEnumerable<double?> source)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        return RollingStandardDeviationIterator(source.Where(n => n != null).Select(n => n.GetValueOrDefault()));
    }
    
    private static IEnumerable<StandardDeviationData> RollingStandardDeviationIterator(IEnumerable<double> source, int windowSize)
    {
        var window = new Queue<double>(windowSize);
        
        foreach (double value in source)
        {
            if (window.Count == windowSize)
            {
                window.Dequeue();
            }
            
            window.Enqueue(value);
            yield return StandardDeviationData.Create(window);
        }
    }

    public static IEnumerable<StandardDeviationData> RollingStandardDeviation(this IEnumerable<double> source, int windowSize)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (windowSize < 2) throw new ArgumentOutOfRangeException(nameof(windowSize));
        return RollingStandardDeviationIterator(source, windowSize);
    }
    
    public static IEnumerable<StandardDeviationData> RollingStandardDeviation(this IEnumerable<double?> source, int windowSize)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (windowSize < 2) throw new ArgumentOutOfRangeException(nameof(windowSize));
        return RollingStandardDeviationIterator(source.Where(n => n != null).Select(n => n.GetValueOrDefault()), windowSize);
    }
}



编辑:现已更新以支持滚动窗口。是的,它通过哈罗德的测试 [ ^ ] 。 :)


Now updated to support a rolling window. And yes, it passes Harold's test[^] from The Lounge. :)


我以为我会用普通(多语言)风格把帽子扔进戒指但只有一个解决方案......;)



除了非常小的内存占用之外,这个解决方案的一个优点是你可以使用一个或多个无限的数据流,而不仅仅是一个固定窗口的固定数组。



Thought that I would throw my hat into the ring in my normal (multi-language) style but a single solution only... ;)

One advantage of this solution, besides the very small memory footprint, is that you can use one or more endless streams of data, not just a fixed array with a fixed window.




using System;
using System.Linq;

namespace RollingSD
{
    class Program
    {
        static void Main(string[] args)
        {
            double[] data = { 1E30f, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1 };

            Console.WriteLine("Running");
            Console.WriteLine("-------\r\n");

            int count = 0;
            var avg = 0.0;
            var t = new Tuple<double, double, int>(0.0, 0.0, 0);
            for (int i = 0; i < data.Length; i++)
            {
                avg = avg.Average(data[i], ref count);
                Console.WriteLine(


数据集:{{{string .Join( ,data.Take(i + 1))}}});
Console.WriteLine(
"Dataset: {{{string.Join(", ", data.Take(i + 1))}}}"); Console.WriteLine(


这篇关于编码挑战:滚动平均值和标准差的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆