Differences in calculation of adler32 rolling checksum - python

Question

I need a clarification about calculating a running checksum.

Suppose I have data like this:

data = 'helloworld'

Assuming a block size of 5, I need to calculate a running checksum.

>>> import zlib
>>> zlib.adler32('hello')
103547413
>>> zlib.adler32('ellow')
105316900

According to the Python documentation (Python version 2.7.2):

zlib.adler32(data[, value])

"Computes a Adler-32 checksum of data. (An Adler-32 checksum is almost as reliable as a CRC32 but can be computed much more quickly.) If value is present, it is used as the starting value of the checksum; otherwise, a fixed default value is used. This allows computing a running checksum over the concatenation of several inputs."

But when I pass the first checksum as the starting value, like this,

>>> zlib.adler32('ellow', zlib.adler32('hello'))
383190072

the output is completely different.
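For reference, the value returned by that call matches the Adler-32 of the two inputs placed back to back, which is the behavior the quoted documentation describes. A minimal sketch, assuming Python 3 (where `zlib.adler32` takes `bytes` rather than `str`, unlike the 2.7 session above):

```python
import zlib

# Checksum of the first input on its own.
first = zlib.adler32(b"hello")

# Continue from `first` while feeding the second input.
chained = zlib.adler32(b"ellow", first)

# Checksum of the concatenation "hello" + "ellow", computed in one call.
whole = zlib.adler32(b"helloellow")

print(first)    # 103547413
print(chained)  # 383190072
print(chained == whole)  # True
```

So the second argument extends the checksum over more data; it does not drop the leading `h` from the window.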

I tried writing custom functions to generate the rolling checksum as defined in the rsync algorithm.

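# MOD_VALUE is not defined in the original post; the Adler-32 modulus is assumed.
MOD_VALUE = 65521

def weakchecksum(data):
    a = 1
    b = 0

    for char in data:
        # Reduce the running sums themselves, not only the added term.
        a = (a + ord(char)) % MOD_VALUE
        b = (b + a) % MOD_VALUE

    return (b << 16) | a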



def rolling(checksum, removed, added, block_size):
    a = checksum
    b = (a >> 16) & 0xffff
    a &= 0xffff

    a = (a - ord(removed) + ord(added)) % MOD_VALUE
    b = (b - (block_size * ord(removed)) + a) % MOD_VALUE

    return (b << 16) | a

Here are the values I get from running these functions:

Weak for hello: 103547413
Rolling for ellow: 105382436
Weak for ellow: 105316900

As you can see, the value my rolling checksum produces for 'ellow' does not match the one Python's implementation produces.

Where am I going wrong in calculating the rolling checksum? Am I using the rolling property of Python's adler32 function correctly?

Recommended answer

The adler32() function does not provide "rolling". The documentation correctly uses the word "running" (not "rolling"), which simply means that it can compute the adler32 in chunks rather than all at once. You need to write your own code to compute a "rolling" adler32 value, which is the adler32 of a sliding window over the data.
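As an illustration of what such code might look like, here is a minimal sketch, assuming Python 3 and byte values passed as integers. The name `roll_adler32` is hypothetical and not part of zlib. It differs from the `rolling` function in the question by one term: Adler-32 seeds `a` with 1 (rsync's own weak checksum does not), so `a - 1` is what gets added to `b`. Leaving that out makes `b` one too large, which is exactly the 65536 (`1 << 16`) gap between 105382436 and 105316900 in the question.

```python
import zlib

MOD = 65521  # Adler-32 modulus (largest prime below 2**16)


def roll_adler32(checksum, removed, added, block_size):
    """Slide a block_size-byte window one byte to the right.

    `removed` is the byte value leaving the window and `added` is the
    byte value entering it (both integers in 0..255).
    """
    a = checksum & 0xFFFF
    b = (checksum >> 16) & 0xFFFF

    a = (a - removed + added) % MOD
    # Because a starts at 1, b contains a constant block_size * 1 term.
    # Adding (a - 1) instead of a keeps that constant from growing.
    b = (b - block_size * removed + a - 1) % MOD

    return (b << 16) | a


data = b"helloworld"
block_size = 5

# Start from the checksum of the first window, then roll through the rest
# and compare each result with a fresh zlib.adler32 over the same window.
checksum = zlib.adler32(data[:block_size])
for start in range(1, len(data) - block_size + 1):
    removed = data[start - 1]
    added = data[start + block_size - 1]
    checksum = roll_adler32(checksum, removed, added, block_size)
    assert checksum == zlib.adler32(data[start:start + block_size])
```

Each roll costs a constant amount of work regardless of the block size, which is the point of a rolling checksum in an rsync-style scan.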
