最大限度地减少重新presentative整数错误的总和 [英] Minimize the sum of errors of representative integers

查看：187 发布时间：2015/11/30 16:17:05 arrays algorithm optimization combinations

本文介绍了最大限度地减少重新presentative整数错误的总和的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

给定n之间的整数[0,10000]为D <子> 1 ，D ₂ ...，D <子> N ，其中可能有重复，n可以是巨大的：

Given n integers between [0,10000] as D₁,D₂...,D_n, where there may be duplicates, and n can be huge:

我想找到k个不同的重presentative整数（例如K = 5）之间[0,10000]的R <子> 1 ，R ₂ ,. ..，R <子> K ，因此，所有的再presentative整数误差之和最小化。

I want to find k distinct representative integers (for example k=5) between [0,10000] as R₁,R₂,...,R_k, so the sum of errors of all the representative integers is minimized.

再presentative整数的误差定义如下：

The error of a representative integer is defined below:

假设我们的K重presentative整数按升序排列为{红<子> 1 ，R ₂ ...，R <子> K }，R <子>的错误我是：

Assuming we have k representative integers in ascending order as {R₁,R₂...,R_k}, the error of R_i is :

和我希望尽量减少k个重presentative整数误差之和：

and i want to minimize the sum of errors of the k representative integers:

这可怎么有效地完成？

EDIT1：第k重presentative整数的最小的必须是最小的数 {D <子> 1 ，D ₂ ...，D <子> N }

The smallest of the k representative integers must be the smallest number in {D₁,D₂...,D_n}

EDIT2：第k重presentative整数的最大的一定是最大的数量{D <子> 1 ，D ₂。 ..，D <子> N }加1。例如，当数量最多{D <子> 1 ，D ₂ ...，D <子> N }是9787那么R <子> K 是9788。

The largest of the k representative integers must be the largest number in {D₁,D₂...,D_n} plus 1. For example when the largest number in {D₁,D₂...,D_n} is 9787 then R_k is 9788.

EDIT3：的一个具体的例子是在这里：

A concrete example is here:

（D）= {} 1,3,3,7,8,14,14,14,30和如果k = 5和R被选为{1,6,10,17,31}然后误差的和是：

D={1,3,3,7,8,14,14,14,30} and if k=5 and R is picked as {1,6,10,17,31} then the sum of errors is :

总和=（1-1）+（3-1）* 2 +（7-6）+（8-6）+（14-10）* 3 +（30-17）= 32

sum of errors=(1-1)+(3-1)*2+(7-6)+(8-6)+(14-10)*3+(30-17)=32

这是因为1所述; = 1,3,3＆10 6，第6版; = 7,8小于10，10其中; = 14,14,14＆其中; 17,17其中; = 30℃，31

this is because 1<=1,3,3<6 , 6<=7,8<10, 10<=14,14,14<17, 17<=30<31

推荐答案

虽然来自社会的帮助下，你已经成功地说明你的在一个形式问题，这是可以理解的数学，你还是做不能提供足够的信息，使我（或其他人）给你一个明确的答案（我会张贴这作为注释，但由于某种原因，我没有看到添加注释选项可用于我）。为了给一个很好的回答这个问题，我们需要知道 n和k与彼此和10000的相对尺寸（或对D <子>我如果不是10000），并确定是否预期最大你达到确切最低（即使这是关键需要的时间进行计算的高昂量），或者如果一个近似将是美好的，也（如果是的话，如何接近你需要得到）。此外，为了知道在算法运行什么最短的时间，我们需要有什么样的了解硬件将运行该算法（即我们是否有M处理器内核并行运行上和什么是相对米的大小为k）

Although with help from the community you have managed to state your problem in a form which is understandable mathematically, you still do not supply enough information to enable me (or anyone else) to give you a definitive answer (I would have posted this as a comment, but for some reason I don't see the "add comment" option is available to me). In order to give a good answer to this problem, we need to know the relative sizes of n and k versus each other and 10000 (or the expected maximum of the D_i if it isn't 10000), and whether it is critical that you attain the exact minimum (even if this requires an exorbitant amount of time for the calculation) or if a close approximation would be OK, also (and if so, how close do you need to get). In addition, in order to know what algorithm runs in minimum time, we need to have an understanding about what kind of hardware is going to run the algorithm (i.e., do we have m CPU cores to run on in parallel and what is the size of m relative to k).

的另一个重要信息是这个问题是否会解决只有一次，或者它必须解决很多次，但存在着整数Ð<子>的分布之间有着某种联系我从一个问题到下一个（例如，整数 ð<子>我都是随机样本从一个特定的，不变概率分布，或者每个连续问题有作为其输入不断增加的集合是集合中从previous 问题加上一个额外的取值的整数）。

Another important piece of information is whether this problem will be solved only once, or it must be solved many times but there exists some connection between the distributions of the integers D_i from one problem to the next (e.g., the integers D_i are all random samples from a particular, unchanging probability distribution, or perhaps each successive problem has as its input an ever-increasing set which is the set from the previous problem plus an extra s integers).

没有合理的算法您的问题应该及时运行哪个取决于在某种程度上大于n的线性比，由于建立的直方图 n个整数的D- <子>我需要O（n）的时间，并回答优化问题本身只依赖于的直方图的整数，而不是他们的排序。没有算法可以在时间运行小于为O（n）时，由于这是问题的输入的大小。

No reasonable algorithm for your problem should run in time which depends in a way greater than linear in n, since building a histogram of the n integers D_i requires O(n) time, and the answer to the optimization problem itself is only dependent on the histogram of the integers and not on their ordering. No algorithm can run in time less than O(n), since that is the size of the input of the problem.

一个强力搜索所有的可能性需要，（假设对D <子>的至少一个我是0和另一个是10000），用于小K，说K＆LT; 10，大约O（10000 ^K-2）的时间，所以，如果登录₁₀（N）>> 4（K-2），这是优化算法（因为在这种情况下，时间为蛮力搜索微不足道相比时间读取输入）。这也是有趣地注意到，如果k是非常接近10000，那么蛮力搜索只需要 O（10000 ^10002-K）（因为我们可以搜索，而不是在整数这是的没有的作为再presentative整数）。

A brute force search over all of the possibilities requires, (assuming that at least one of the D_i is 0 and another is 10000), for small k, say k < 10, approximately O(10000^k-2) time, so if log₁₀(n) >> 4(k-2), this is the optimal algorithm (since in this case the time for the brute force search is insignificant compared to the time to read the input). It is also interesting to note that if k is very close to 10000, then a brute force search only requires O(10000^10002-k) (because we can search instead over the integers which are not used as representative integers).

如果你更多的信息更新该问题的定义，我将试图修改我又回答。

If you update the definition of the problem with more information, I will attempt to edit my answer in turn.

这篇关于最大限度地减少重新presentative整数错误的总和的文章就介绍到这了，希望我们推荐的答案对大家有所帮助，也希望大家多多支持IT屋！

查看全文

最大限度地减少重新presentative整数错误的总和 [英] Minimize the sum of errors of representative integers

问题描述

推荐答案

相关文章

C/C++最新文章

热门教程

热门工具

登录关闭

最大限度地减少重新presentative整数错误的总和 [英] Minimize the sum of errors of representative integers

问题描述

推荐答案

相关文章

C/C++最新文章

热门教程

热门工具

登录 关闭

登录关闭