Can you encode to fewer bits when you don't need to preserve order?


Question

Say you have a List of 32-bit integers and the same collection of 32-bit integers in a Multiset (a set that allows duplicate members).

Since Sets don't preserve order but Lists do, does this mean we can encode a Multiset in fewer bits than the List?

If so, how would you encode the Multiset?

If this is true, what other examples are there where not needing to preserve order saves bits?

Note that I only used 32-bit integers as an example. Does the data type matter for the encoding? Does the data type need to be fixed length and comparable for you to get the savings?

Edit

Any solution should work well for collections with low duplication as well as high duplication. It's obvious that with high duplication, encoding a Multiset by simply counting duplicates is very easy, but this takes more space than the List if there is no duplication in the collection.

Recommended answer

In the multiset, each entry would be a pair of numbers: the integer value, and a count of how many times it occurs in the set. This means additional repeats of a value already in the multiset do not cost any more to store (you just increment the counter).
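As a minimal sketch of the value/count idea, the Python below collapses a list into (value, count) pairs and expands it again. The function names `encode_multiset` and `decode_multiset` are made up for this illustration, and sorting the pairs is only a convenience to make the output deterministic, not something the answer requires.

```python
from collections import Counter

def encode_multiset(values):
    """Collapse a list into sorted (value, count) pairs; the original order is discarded."""
    return sorted(Counter(values).items())

def decode_multiset(pairs):
    """Expand (value, count) pairs back into a list, in sorted order rather than the original order."""
    return [value for value, count in pairs for _ in range(count)]

pairs = encode_multiset([7, 3, 7, 7, 3, 42])
print(pairs)                   # [(3, 2), (7, 3), (42, 1)]
print(decode_multiset(pairs))  # [3, 3, 7, 7, 7, 42]
```

Six list items become three pairs here, and adding another 7 to the input would only change a count, not the number of pairs.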

However, assuming both the value and the count are ints, each pair takes twice the space of a single list item. The pair encoding therefore only breaks even with a simple list when each value is repeated twice on average, and only saves space when values are repeated more often than that. There could be more efficient or higher-performance ways of implementing this, depending on the range, sparsity, and repetitiveness of the numbers being stored. (For example, if you know no value will repeat more than 255 times, you could use a byte rather than an int to store the counter.)
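The break-even point can be checked with a rough size comparison. This is a sketch under stated assumptions: values are unsigned 32-bit integers packed with Python's `struct` module, neither encoding stores a length header, and the helper names `list_size` and `pairs_size` are hypothetical.

```python
import struct
from collections import Counter

def list_size(values):
    """Bytes for a plain list: one unsigned 32-bit integer per item."""
    return len(struct.pack(f"<{len(values)}I", *values))

def pairs_size(values, count_format="I"):
    """Bytes for (value, count) pairs.

    count_format "I" stores each count in 4 bytes; "B" stores it in 1 byte
    and raises struct.error if any count exceeds 255.
    """
    packed = b"".join(
        struct.pack("<I" + count_format, value, count)
        for value, count in Counter(values).items()
    )
    return len(packed)

no_dupes = list(range(100))   # every value appears once
doubled = no_dupes * 2        # every value appears twice
heavy = [5] * 200             # one value appears 200 times

for name, data in [("no_dupes", no_dupes), ("doubled", doubled), ("heavy", heavy)]:
    print(name, list_size(data), pairs_size(data), pairs_size(data, "B"))
# no_dupes 400 800 500
# doubled 800 800 500
# heavy 800 8 5
```

With 4-byte counts the pair encoding is twice the size of the list when nothing repeats and equal to it when every value appears twice. A 1-byte count narrows the gap to 5 bytes versus 4 but still loses when there are no duplicates, which is the concern raised in the question's edit.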

This approach works with any type of data, as you are just storing a count of how many times each data item repeats. Each data item needs to be comparable, but only to the extent that you can tell whether two items are the same or different. There is no need for the items to each take the same amount of storage.
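To illustrate that an equality test is all the counting needs, here is a sketch that groups items using only `==`. The function name `count_by_equality` is invented for this example, and the linear scan makes it quadratic in the worst case; a hash table or sort would be faster but would demand more from the data type than equality.

```python
def count_by_equality(items):
    """Group items using only ==, so they need not be hashable, ordered, or fixed length."""
    groups = []  # each entry is [item, count]
    for item in items:
        for group in groups:
            if group[0] == item:
                group[1] += 1
                break
        else:
            # No existing group matched, so start a new one.
            groups.append([item, 1])
    return groups

print(count_by_equality(["a", "banana", "a", [1, 2], [1, 2], "a"]))
# [['a', 3], ['banana', 1], [[1, 2], 2]]
```

The input mixes strings of different lengths with lists, which Python cannot hash, and the counts still come out correctly.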
