Python - Compress ASCII String

Question

I'm looking for a way to compress an ASCII-based string. Any help?

I also need to decompress it. I tried zlib, but it didn't help.

What can I do to compress the string to a shorter length?

Code:

import zlib

from django.shortcuts import render_to_response
from django.template import RequestContext

# is_ascii() is the asker's own helper; its definition is not shown.
def compress(request):
    if request.POST:
        data = request.POST.get('input')
        if is_ascii(data):
            result = zlib.compress(data)
            return render_to_response('index.html', {'result': result, 'input':data}, context_instance = RequestContext(request))
        else:
            result = "Error, the string is not ascii-based"
            return render_to_response('index.html', {'result':result}, context_instance = RequestContext(request))
    else:
        return render_to_response('index.html', {}, context_instance = RequestContext(request))

Solution

Using compression will not always reduce the length of a string!

Consider the following code:

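import zlib
import bz2

def comptest(s):
    # The compressors work on bytes, so encode the ASCII text first.
    data = s.encode('ascii')
    print('original length:', len(data))
    print('zlib compressed length:', len(zlib.compress(data)))
    print('bz2 compressed length:', len(bz2.compress(data)))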

Let's try this on an empty string:

In [15]: comptest('')
original length: 0
zlib compressed length: 8
bz2 compressed length: 14

So zlib adds 8 bytes and bz2 adds 14. Compression formats usually put a header in front of the compressed data (and often a checksum after it) for use by the decompression program. This overhead increases the length of the output.
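To see how much of that output is fixed framing, here is a minimal sketch that compares the normal zlib output with a raw DEFLATE stream of the same data. The helper name `zlib_overhead` is made up for this illustration. It relies on the zlib format wrapping the DEFLATE data in a 2-byte header and a 4-byte Adler-32 checksum, and on negative `wbits` values asking Python's `zlib` module for a raw stream without that wrapper.

```python
import zlib

def zlib_overhead(data: bytes) -> int:
    """Return how many bytes the zlib wrapper adds around the raw DEFLATE stream."""
    wrapped = zlib.compress(data)
    # wbits=-15 requests a raw DEFLATE stream (no zlib header, no Adler-32 trailer).
    raw_compressor = zlib.compressobj(wbits=-15)
    raw = raw_compressor.compress(data) + raw_compressor.flush()
    return len(wrapped) - len(raw)

print('overhead for empty input:', zlib_overhead(b''))
print('overhead for "test":', zlib_overhead(b'test'))
```

The rest of the output is the DEFLATE stream itself, which has a small minimum size of its own even for empty input.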

Let's test a single word:

In [16]: comptest('test')
original length: 4
zlib compressed length: 12
bz2 compressed length: 40

Even if you subtract the length of the header, the compression hasn't made the word shorter at all. That is because in this case there is little to compress: most of the characters in the string occur only once. Now for a short sentence:

In [17]: comptest('This is a compression test of a short sentence.')
original length: 47
zlib compressed length: 52
bz2 compressed length: 73

Again, the compressed output is larger than the input text. Because the text is short, there is little repetition in it, so it won't compress well.
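The effect of repetition can be checked directly. The following is a rough sketch, not part of the original answer: the helper `ratio` and both sample inputs are invented for illustration. It compares a highly repetitive 1000-byte ASCII string with 1000 pseudo-random ASCII letters of the same length.

```python
import random
import string
import zlib

def ratio(data: bytes) -> float:
    """Compressed size divided by original size (smaller is better)."""
    return len(zlib.compress(data)) / len(data)

# 1000 bytes made of one short pattern repeated over and over.
repetitive = b'abcd' * 250

# 1000 ASCII letters with no repeating structure (fixed seed for repeatability).
rng = random.Random(0)
varied = ''.join(rng.choice(string.ascii_letters) for _ in range(1000)).encode('ascii')

print('repetitive ratio:', round(ratio(repetitive), 3))
print('varied ratio:', round(ratio(varied), 3))
```

The repetitive input should shrink to a small fraction of its size, while the varied one shrinks far less. Real text usually falls somewhere in between.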

You need a fairly long block of text for compression to actually pay off:

In [22]: rings = '''
   ....:     Three Rings for the Elven-kings under the sky, 
   ....:     Seven for the Dwarf-lords in their halls of stone, 
   ....:     Nine for Mortal Men doomed to die, 
   ....:     One for the Dark Lord on his dark throne 
   ....:     In the Land of Mordor where the Shadows lie. 
   ....:     One Ring to rule them all, One Ring to find them, 
   ....:     One Ring to bring them all and in the darkness bind them 
   ....:     In the Land of Mordor where the Shadows lie.'''

In [23]: comptest(rings)
original length: 410
zlib compressed length: 205
bz2 compressed length: 248
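Since the question also asks about decompression, here is one possible sketch of a round trip that only keeps the compressed form when it is actually shorter. The function names `pack` and `unpack` and the one-byte markers are assumptions made for this example; they are not part of zlib or of the original answer.

```python
import zlib

RAW = b'\x00'         # hypothetical marker: payload stored as-is
COMPRESSED = b'\x01'  # hypothetical marker: payload is zlib-compressed

def pack(text: str) -> bytes:
    """Encode ASCII text, compressing it only if that saves space."""
    data = text.encode('ascii')
    packed = zlib.compress(data, 9)  # 9 = highest compression level
    if len(packed) < len(data):
        return COMPRESSED + packed
    return RAW + data

def unpack(blob: bytes) -> str:
    """Reverse pack(): inspect the marker byte, then decode the payload."""
    marker, payload = blob[:1], blob[1:]
    if marker == COMPRESSED:
        payload = zlib.decompress(payload)
    elif marker != RAW:
        raise ValueError('unknown marker byte')
    return payload.decode('ascii')

short = 'test'
long_text = 'One Ring to rule them all, One Ring to find them. ' * 10
for text in (short, long_text):
    blob = pack(text)
    assert unpack(blob) == text
    print(len(text), '->', len(blob))
```

With this layout the worst case costs one extra byte over the original text, instead of the larger overhead seen in the short examples above.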
