理论上可能的最大压缩率是多少? [英] What is the maximum theoretically possible compression rate?
问题描述
这是一个理论问题,所以希望这里的许多细节在实践中甚至在理论上都是不可计算的。
This is a theoretical question, so expect that many details here are not computable in practice or even in theory.
假设我有一个字符串 s
我想压缩。结果应该是一个自解压二进制(可以是x86汇编程序,但也可以是一些其他假设的图灵完成低级语言),输出 s
。
Let's say I have a string s
that I want to compress. The result should be a self-extracting binary (can be x86 assembler, but it can also be some other hypothetical Turing-complete low level language) which outputs s
.
现在,我们可以轻松地遍历所有可能的这种二进制和程序,按大小排序。让 B_s
是这些输出 s
的二进制文件的子列表(当然 B_s
是不可计算的)。
Now, we can easily iterate through all possible such binaries and programs, ordered by size. Let B_s
be the sub-list of these binaries who output s
(of course B_s
is uncomputable).
由于每个正整数集必须有一个最小值,所以必须有一个最小的程序 b_min_s
在 B_s
。
As every set of positive integers must have a minimum, there must be a smallest program b_min_s
in B_s
.
对于什么语言关于 b_min_s
的大小?也许只是一个估计。 (我可以构造一些简单的例子,我总是甚至可以计算 B_s
以及 b_min_s
)
For what languages (i.e. set of strings) do we know something about the size of b_min_s
? Maybe only an estimation. (I can construct some trivial examples where I can always even calculate B_s
and also b_min_s
, but I am interested in more interesting languages.)
推荐答案
这是 Kolmogorov复杂性,您确定无法计算。如果是,你可以创建一个长度为n的矛盾程序,打印一个包含Kolmogorov复杂度m> n的字符串。
This is Kolmogorov complexity, and you are correct that it's not computable. If it were, you could create a paradoxical program of length n that printed a string with Kolmogorov complexity m > n.
显然,你可以绑定 b_min_s
。但是,据我所知,大多数努力都是存在的证据。例如,目前正在进行竞争以压缩英语维基百科。
Clearly, you can bound b_min_s
for given inputs. However, as far as I know most of the efforts to do so have been existence proofs. For instance, there is an ongoing competition to compress English Wikipedia.
这篇关于理论上可能的最大压缩率是多少?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!