Iterative hashing returns different values in Python and Java

Problem description

I'm trying to port a Python (2.7) script to Java. It iterates a SHA-256 hash several times, but the two programs end up with different results. I've noticed that the first iteration returns the same result in both, but from the second iteration on the results differ.

Here is the Python implementation:

import hashlib

def to_hex(s):
  print " ".join(hex(ord(i)) for i in s)

d = hashlib.sha256()

print "Entry:"
r = chr(1)
to_hex(r)

for i in range(2):
  print "Loop", i
  d.update(r)
  r = d.digest()
  to_hex(r)

In Java:

import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class LoopTest {

  public static void main(String[] args) {
    MessageDigest d;
    try {
      d = MessageDigest.getInstance("SHA-256");
    } catch (NoSuchAlgorithmException e) {
      System.out.println("NoSuchAlgorithmException");
      return;
    }

    System.out.println("Entry:");
    byte[] r = new byte[] {1};
    System.out.println(toHex(r));

    for(int i = 0; i < 2; i++) {
      System.out.printf("Loop %d\n", i);
      d.update(r);
      r = d.digest();
      System.out.println(toHex(r));
    }
  }

  private static String toHex(byte[] bytes) {
    StringBuilder sb = new StringBuilder(bytes.length);
    for (byte b: bytes) {
       sb.append(String.format("0x%02X ", b));
    }
    return sb.toString();
  }
}

The outputs are, for Python:

$ python looptest.py
Entry:
0x1
Loop 0
0x4b 0xf5 0x12 0x2f 0x34 0x45 0x54 0xc5 0x3b 0xde 0x2e 0xbb 0x8c 0xd2 0xb7 0xe3 0xd1 0x60 0xa 0xd6 0x31 0xc3 0x85 0xa5 0xd7 0xcc 0xe2 0x3c 0x77 0x85 0x45 0x9a
Loop 1
0x98 0x1f 0xc8 0xd4 0x71 0xa8 0xb0 0x19 0x32 0xe3 0x84 0xac 0x1c 0xd0 0xa0 0x62 0xc4 0xdb 0x2c 0xe 0x13 0x58 0x61 0x9a 0x83 0xd1 0x67 0xf5 0xe8 0x4e 0x6a 0x17

For Java:

$ java LoopTest
Entry:
0x01
Loop 0
0x4B 0xF5 0x12 0x2F 0x34 0x45 0x54 0xC5 0x3B 0xDE 0x2E 0xBB 0x8C 0xD2 0xB7 0xE3 0xD1 0x60 0x0A 0xD6 0x31 0xC3 0x85 0xA5 0xD7 0xCC 0xE2 0x3C 0x77 0x85 0x45 0x9A
Loop 1
0x9C 0x12 0xCF 0xDC 0x04 0xC7 0x45 0x84 0xD7 0x87 0xAC 0x3D 0x23 0x77 0x21 0x32 0xC1 0x85 0x24 0xBC 0x7A 0xB2 0x8D 0xEC 0x42 0x19 0xB8 0xFC 0x5B 0x42 0x5F 0x70

What could be the reason for this difference?

Thanks to @dcsohl and @Alik for the answers; I understand the reason now. Since I'm porting the Python script to Java, I have to keep the Python script as it is, so I modified the Java program like this:

byte[] r2 = new byte[]{};
for(int i = 0; i < 2; i++) {
  System.out.printf("Loop %d\n", i);
  d.update(r);
  r2 = d.digest();
  System.out.println(toHex(r2));
  byte[] c = new byte[r.length + r2.length];
  System.arraycopy(r, 0, c, 0, r.length);
  System.arraycopy(r2, 0, c, r.length, r2.length);
  r = c;
}
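
The modified loop above is only shown for two rounds, so it can be useful to cross-check the idea for more. The following is a minimal Python 3 sketch, not the asker's code: `python_style` and `java_workaround` are hypothetical names, and creating a fresh `hashlib.sha256` object each round is assumed to stand in for a Java digest that starts from an empty state.

```python
import hashlib


def python_style(seed, rounds):
    """One hash object for the whole loop; its state keeps accumulating."""
    d = hashlib.sha256()
    r = seed
    results = []
    for _ in range(rounds):
        d.update(r)
        r = d.digest()
        results.append(r)
    return results


def java_workaround(seed, rounds):
    """Model of the modified Java loop: re-hash the growing buffer each round."""
    buf = seed
    results = []
    for _ in range(rounds):
        # A fresh object models a digest that starts from an empty state.
        r2 = hashlib.sha256(buf).digest()
        results.append(r2)
        buf = buf + r2  # same role as the two System.arraycopy calls
    return results


rounds_match = python_style(b"\x01", 5) == java_workaround(b"\x01", 5)
print(rounds_match)
```

Note that the buffer grows by 32 bytes per round and is re-hashed in full each time, so the total work grows quadratically with the number of rounds.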

Recommended answer

The two languages implement update() and digest() differently.

The Python documentation for update() says:

Update the hash object with the string arg. Repeated calls are equivalent to a single call with the concatenation of all the arguments: m.update(a); m.update(b) is equivalent to m.update(a+b).
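
The quoted behaviour is easy to check directly. The snippet below is a small sketch in Python 3 syntax (bytes literals instead of the Python 2.7 strings used in the question); the variable names are made up for illustration. It also checks that calling digest() in Python leaves the accumulated state in place.

```python
import hashlib

a = b"\x01"
b = b"\x02\x03"

# Two successive update() calls ...
m1 = hashlib.sha256()
m1.update(a)
m1.update(b)

# ... hash the same byte stream as a single call on the concatenation.
m2 = hashlib.sha256()
m2.update(a + b)

same_as_concat = m1.digest() == m2.digest()

# In Python, digest() does not reset the object: a second call returns
# the same value, and later update() calls keep extending the same input.
digest_is_stable = m1.digest() == m1.digest()

print(same_as_concat, digest_is_stable)
```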

I tested this by using the shell sha256sum command.

printf '\x01\x4b\xf5\x12\x2f\x34\x45\x54\xc5\x3b\xde\x2e\xbb\x8c\xd2\xb7\xe3\xd1\x60\x0a\xd6\x31\xc3\x85\xa5\xd7\xcc\xe2\x3c\x77\x85\x45\x9a' | sha256sum
981fc8d471a8b01932e384ac1cd0a062c4db2c0e1358619a83d167f5e84e6a17 *-

You started with 0x01, so that is the first byte, and the remaining bytes are the hash of 0x01. The resulting hash matches your Python output.

Now look at this: I omitted the initial 0x01 and hashed only the digest bytes. The result matches your Java output.

printf '\x4b\xf5\x12\x2f\x34\x45\x54\xc5\x3b\xde\x2e\xbb\x8c\xd2\xb7\xe3\xd1\x60\x0a\xd6\x31\xc3\x85\xa5\xd7\xcc\xe2\x3c\x77\x85\x45\x9a' | sha256sum
9c12cfdc04c74584d787ac3d23772132c18524bc7ab28dec4219b8fc5b425f70 *-
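
The two shell checks above can also be reproduced without a shell. This is a minimal sketch in Python 3 syntax, with illustrative variable names; the expected values in the comments are the ones reported in the question's output.

```python
import hashlib

# Hash of the single byte 0x01 (the "Loop 0" value in both programs).
h1 = hashlib.sha256(b"\x01").digest()

# 0x01 followed by its hash: what Python's accumulated state contains.
with_prefix = hashlib.sha256(b"\x01" + h1).hexdigest()

# The hash alone: what Java feeds into a freshly reset digest.
without_prefix = hashlib.sha256(h1).hexdigest()

print(with_prefix)     # 981fc8d4... (the Python "Loop 1" value)
print(without_prefix)  # 9c12cfdc... (the Java "Loop 1" value)
```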

But why? Shouldn't the initial 0x01 be included? It would be, except that the Javadoc for digest() says:

Completes the hash computation by performing final operations such as padding. The digest is reset after this call is made.

So when you call digest() in Java, the digest is reset and your initial 0x01 is dropped. The next update() starts from an empty state, so you are simply hashing the previous digest without the leading 0x01.
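
To see the two behaviours side by side, the reset can be modelled in Python. The sketch below uses Python 3 syntax; `ResettingSha256` and `run_loop` are hypothetical helpers written for this illustration, not part of hashlib or of the Java API. The wrapper only imitates the "reset after digest()" rule quoted from the Javadoc.

```python
import hashlib


class ResettingSha256:
    """Hypothetical wrapper imitating a digest that resets after digest()."""

    def __init__(self):
        self._h = hashlib.sha256()

    def update(self, data):
        self._h.update(data)

    def digest(self):
        out = self._h.digest()
        self._h = hashlib.sha256()  # discard everything fed so far
        return out


def run_loop(d, seed, rounds):
    """The loop from the question, parameterised on the digest object."""
    r = seed
    results = []
    for _ in range(rounds):
        d.update(r)
        r = d.digest()
        results.append(r.hex())
    return results


python_results = run_loop(hashlib.sha256(), b"\x01", 2)
java_like_results = run_loop(ResettingSha256(), b"\x01", 2)

print(python_results[1])     # matches the Python "Loop 1" line
print(java_like_results[1])  # matches the Java "Loop 1" line
```

Both runs agree on the first round and diverge on the second, which is the pattern reported in the question.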
