Get 10 averages from an array? Does this exist?


Question


Please forgive me, because I honestly don't know exactly what to ask or what I am looking for. I guess I'm just stuck in a math dilemma, but here it goes anyway...

I have a large set of numbers, i.e. 50k or 100k decimal numbers, stored in an array.
They are not necessarily distinct: values may or may not repeat, and there are no restrictions.

Since it is a large set, I need to summarize it, kind of like what an average does. But with Average I can only get one average for the whole array, and I need to get 10 or 20 averages, or in other words the 10 most significant averages across the whole set of numbers.

Is there such an operation, and if so, what is it called, so I can look for more information?
Of course, I would also need to be able to count the hits for each average or summary number.


---

To give this a bit more sense and context: I am trying to summarize a datalog from a car. Each "frame" or "entry" comes with an RPM value, which of course varies from 0 to 8000. I get thousands of those records, and I need to represent them in an RPM table together with the number of hits each "fixed" index has received.

In a practical example, let's assume we got the following values to process:

{10,50,90,50,10,400,450,300,550,900,950,1100,1200,1000,900}

rpm | hits
-----|---------
100 | 5 hits
500 | 4 hits
1000 | 6 hits

In this example I have grouped similar numbers for the sake of simplicity. I do know how to calculate the hits and how to find out which "index" each value should go to, but what I need to find out first is what the best indexes for the table are.
I made those 3 indexes (100, 500, 1000) fixed, but I don't know whether they are the best indexes to split my numbers; it might be 500 or 400 or 474, who knows.

That is the situation I am debating how to handle: how do you find the best indexes? Their number can even vary: there could be just 3, or 10, or N, and the user will provide input to "subdivide" the indexes into as many as they wish.

Hope it is making a little more sense now.

What I have tried:

One of the ideas I had is the following, but I'm not sure if it makes sense at all:

Take Array.Max - Array.Min and divide the result by the number of summaries I want to have, in this case 10. Then create 10 different arrays, each holding the numbers in one range, and take the average of each. I.e.:

Array.Min = 0
Array.Max = 400
Needed summaries = 10

Create 10 arrays, the first with numbers that go from 0 to 40, the second 40 to 80, the third 80 to 120 and so on, and then calculate the average of each array.

The problem I see with this is that I could potentially not have any number in the range of 200 to 300, so some arrays will be empty and their average won't make sense?

Recommended answer


Based on your groups of 0-40, 40-80 etc, I expect that either the max value is less than 400 or there are 11 groups. In my example I'll use a maximum value smaller than 400, but it's easy to adjust if you want 11 groups or if you want to e.g. include 400 in the last group.

You could do something like this:

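' Sample data: values from 0 up to (but not including) 400.
Dim array = New Decimal() {10, 20, 30, 80, 90, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 210, 220, 230, 240, 250, 260, 270, 280, 290, 300, 310, 320, 330, 340, 350, 360, 370, 380, 390}

' Split 0-400 into 10 equal-width groups: 0 to 40, 40 to 80, ... (upper bound excluded).
Dim groups = New List(Of IEnumerable(Of Decimal))
For index As Integer = 0 To 9
    ' Decimal literals keep the bounds in the same type as the data.
    Dim lowerBound = index * 40D
    Dim upperBound = (index + 1) * 40D
    groups.Add(array.Where(Function(n) n >= lowerBound AndAlso n < upperBound))
Next

' Average of each group, or -1 as a marker for an empty group.
Dim averages = groups.Select(Function(g) If(g.Any(), g.Average(), -1D))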










"array" holds a number of values with min 0 and max <400.
"groups" holds groups with values 0 to 40, 40 to 80 etc.
"averages" holds the average of each of these groups, or -1 if the group is empty.

If you don't care about the empty groups (40-80 in my code example), you could do this more elegantly:

Dim groups = array.GroupBy(Function(n) Math.Floor(n / 40))
Dim averages = groups.Select(Function(g) g.Average())
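
Since the question also asks for the number of hits per group, the GroupBy version can be extended to report a count next to each average. The following is only a sketch under stated assumptions: SummarizeByWidth and the tuple layout (lower bound, hit count, average) are hypothetical names chosen for illustration, and the width of 500 is arbitrary.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Linq

Module GroupSummaryDemo
    ' Hypothetical helper: one row per non-empty range of the given width.
    ' Each row holds (lower bound of the range, hit count, average).
    Function SummarizeByWidth(values As IEnumerable(Of Decimal), width As Decimal) As List(Of Tuple(Of Decimal, Integer, Decimal))
        Return values.
            GroupBy(Function(n) Math.Floor(n / width)).
            OrderBy(Function(g) g.Key).
            Select(Function(g) Tuple.Create(g.Key * width, g.Count(), g.Average())).
            ToList()
    End Function

    Sub Main()
        ' Sample RPM values taken from the question.
        Dim rpm = New Decimal() {10, 50, 90, 50, 10, 400, 450, 300, 550, 900, 950, 1100, 1200, 1000, 900}
        For Each row In SummarizeByWidth(rpm, 500D)
            Console.WriteLine("{0} to {1}: {2} hits, average {3}", row.Item1, row.Item1 + 500D, row.Item2, row.Item3)
        Next
    End Sub
End Module
```

With the question's sample values and a width of 500, this should give 8, 4 and 3 hits with averages 170, 825 and 1100. As with the two-line version above, empty ranges simply do not appear in the result.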


Quote:


so some arrays will be empty and their average won't make sense?


Technically you cannot compute the average of 0 samples.

In statistics, one usually reports the number of samples falling within each range, not their average. Have a look at Frequency distribution (Wikipedia).
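
A frequency distribution over equal-width ranges can be sketched as follows. This is an illustration, not a definitive implementation: CountHits and binCount are hypothetical names, the ranges are assumed to span from the smallest to the largest value, and binCount is assumed to be at least 1.

```vbnet
Imports System
Imports System.Linq

Module FrequencyDemo
    ' Hypothetical helper: counts how many values fall into each of binCount
    ' equal-width ranges between the smallest and the largest value.
    ' Assumes binCount >= 1.
    Function CountHits(values As Decimal(), binCount As Integer) As Integer()
        Dim hits(binCount - 1) As Integer
        If values.Length = 0 Then Return hits

        Dim min = values.Min()
        Dim width = (values.Max() - min) / binCount
        For Each v In values
            ' If all values are equal, width is 0 and everything lands in range 0.
            Dim index = If(width = 0D, 0, CInt(Math.Floor((v - min) / width)))
            ' The maximum itself can compute to binCount, so clamp it into the last range.
            hits(Math.Min(index, binCount - 1)) += 1
        Next
        Return hits
    End Function

    Sub Main()
        ' Sample RPM values taken from the question.
        Dim rpm = New Decimal() {10, 50, 90, 50, 10, 400, 450, 300, 550, 900, 950, 1100, 1200, 1000, 900}
        Dim hits = CountHits(rpm, 3)
        For i As Integer = 0 To hits.Length - 1
            Console.WriteLine("Range {0}: {1} hits", i, hits(i))
        Next
    End Sub
End Module
```

An empty range is not a problem here: it is just reported with 0 hits. With the question's sample values and 3 ranges, this should give 7, 2 and 6 hits, which differs from the hand-made 5/4/6 table because the ranges are equal-width rather than chosen by eye.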


First of all: I have no idea why you need to get 10 or more averages from an array...

It seems you're talking about statistics, and especially about the concept of the average. There are at least a few types of average: arithmetic mean, median, geometric median, mode, and a few more.
Each of them provides very specific information about your data set. So, depending on which statistical analysis you want to perform, you need to use the corresponding method.
Imagine your array of decimal numbers represents product sales over time (weeks, months, quarters, years). You may want to split your array into subsets (by product or by time) to get more information about sales. Sometimes, in big data analysis (for market sales), a moving (or running) average is used too.

Because my English is limited, I can't explain it further... ;(
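
To make some of the kinds of average mentioned above concrete, here is a minimal sketch. Median, Mode and MovingAverage are hypothetical helper names written for this illustration (the arithmetic mean comes from LINQ's Average), and the input is assumed to be non-empty.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Linq

Module AverageKindsDemo
    ' Median: middle value of the sorted data, or the mean of the two
    ' middle values when the count is even. Assumes at least one value.
    Function Median(values As IEnumerable(Of Decimal)) As Decimal
        Dim sorted = values.OrderBy(Function(v) v).ToArray()
        Dim mid = sorted.Length \ 2
        Return If(sorted.Length Mod 2 = 1, sorted(mid), (sorted(mid - 1) + sorted(mid)) / 2D)
    End Function

    ' Mode: the most frequent value. If several values tie, the one that
    ' appears first in the input is returned.
    Function Mode(values As IEnumerable(Of Decimal)) As Decimal
        Return values.GroupBy(Function(v) v).OrderByDescending(Function(g) g.Count()).First().Key
    End Function

    ' Simple moving average: the mean of each run of "window" consecutive values.
    Function MovingAverage(values As IList(Of Decimal), window As Integer) As List(Of Decimal)
        Dim result = New List(Of Decimal)
        For start As Integer = 0 To values.Count - window
            result.Add(values.Skip(start).Take(window).Average())
        Next
        Return result
    End Function

    Sub Main()
        ' Sample RPM values taken from the question.
        Dim rpm = New Decimal() {10, 50, 90, 50, 10, 400, 450, 300, 550, 900, 950, 1100, 1200, 1000, 900}
        Console.WriteLine("Mean: {0}", rpm.Average())
        Console.WriteLine("Median: {0}", Median(rpm))
        Console.WriteLine("Mode: {0}", Mode(rpm))
        Console.WriteLine("Moving average (window 3): {0}", String.Join(", ", MovingAverage(rpm, 3)))
    End Sub
End Module
```

For the question's sample values, the median should be 450 and the mode 10 (10, 50 and 900 each occur twice, and 10 appears first), which shows how differently each measure summarizes the same data.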

