Difference of MD5 Hash in R Libraries - MD5 for serialized objects

Problem description

I want to calculate an MD5 hash for an R object. This is usually done on the serialized object. I am aware of two different R libraries that can calculate MD5 hashes: the digest library and the openssl library. But these two return different hash values. Here is an example for the openssl library:

test <- 1:100

library(openssl)
md5(serialize(test, connection = NULL))
# returns: md5 23:a8:b3:40:9e:08:a0:3d:30:6e:3d:3d:cb:fe:21:57 

And now an example for the digest library:

library(digest)
digest(test, algo = "md5", serialize = TRUE)
# returns: [1] "83777773fa047247723ad5a255963144"

Why are these hash values different?

Recommended answer

Short answer

digest skips some leading bytes (the serialization header) if the object is serialized, whereas openssl's md5 hashes every byte it is given.

For example, hashing the serialized object without its first 14 bytes reproduces the digest result:

> .t <- serialize(test, connection = NULL)
> md5(.t[seq(15, length(.t))])
md5 83:77:77:73:fa:04:72:47:72:3a:d5:a2:55:96:31:44
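
The comparison above can be turned into a small check. This is only a sketch: `openssl_md5_hex` is a hypothetical helper introduced here to convert the openssl result to a plain hex string, and the 14-byte offset is taken from the example above rather than queried from either package.

```r
library(digest)
library(openssl)

test <- 1:100
bytes <- serialize(test, connection = NULL)

# Hypothetical helper: plain hex string of the openssl MD5 of a raw vector.
openssl_md5_hex <- function(x) {
  paste(as.character(unclass(md5(x))), collapse = "")
}

# Hash the complete serialized bytes with both packages.
# serialize = FALSE tells digest to hash the raw vector as it is.
full_digest  <- digest(bytes, algo = "md5", serialize = FALSE)
full_openssl <- openssl_md5_hex(bytes)

# Hash the same bytes without the first 14 bytes (the offset used above).
tail_digest  <- digest(bytes, algo = "md5", serialize = FALSE, skip = 14)
tail_openssl <- openssl_md5_hex(bytes[-(1:14)])

print(identical(full_digest, full_openssl))
print(identical(tail_digest, tail_openssl))
```

When both packages are given exactly the same bytes they should agree, so the difference in the question comes from which bytes are hashed, not from the MD5 implementation. The concrete hash values depend on the R version and on the serialization format in use, so they may not match the ones quoted in the question.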

Long answer

The result of serialize(1:100, connection = NULL) differs between R versions.

According to the source code of base::serialize, R writes some integers that represent the R version into the output during serialization.

digest::digest skips these bytes before calculating the MD5 sum, so the result stays consistent.
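
To make the header concrete, the following base R sketch decodes the first 14 bytes of a serialized object. It assumes the binary (XDR) format with serialization version 2, requested explicitly here, and a header layout of two format bytes followed by three 4-byte big-endian integers. The helper `read_int` and the variable names are illustrative only.

```r
test <- 1:100

# Request format version 2 explicitly so the header layout is the assumed one.
bytes <- serialize(test, connection = NULL, version = 2)

# Assumed layout: 2 format bytes, then three 4-byte big-endian integers.
header <- bytes[1:14]

# Illustrative helper: read one 4-byte big-endian integer from a raw vector.
read_int <- function(b) {
  readBin(b, what = "integer", n = 1L, size = 4L, endian = "big")
}

format_marker      <- rawToChar(header[1:2])    # format marker bytes
format_version     <- read_int(header[3:6])     # serialization format version
writer_version     <- as.integer(header[8:10])  # R version that wrote the data
min_reader_version <- as.integer(header[12:14]) # oldest R able to read it

cat("format version:", format_version, "\n")
cat("written by R:  ", paste(writer_version, collapse = "."), "\n")
cat("running R:     ", as.character(getRversion()), "\n")
cat("minimum reader:", paste(min_reader_version, collapse = "."), "\n")
```

Because the writer version is part of these bytes, hashing the full output gives a different MD5 on every R release even when the object itself is unchanged. Hashing only what follows the header avoids that.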
