Difference of MD5 Hash in R Libraries - MD5 for Serialized Objects
Question
I want to calculate an MD5 hash for an R object. This is usually done on the serialized object. I am aware of two different R libraries that can calculate MD5 hashes: the digest library and the openssl library. But these two return different hash values. Here is an example for the openssl library:
test <- 1:100
library(openssl)
md5(serialize(test, connection = NULL))
# returns: md5 23:a8:b3:40:9e:08:a0:3d:30:6e:3d:3d:cb:fe:21:57
And here is the example for the digest library:
library(digest)
digest(test, "md5", serialize = TRUE)
# returns: [1] "83777773fa047247723ad5a255963144"
Why do these hash values differ?
Recommended answer
Short answer
digest skips some leading bytes (the serialization header) if the object is serialized.
For example:
> .t <- serialize(test, connection = NULL, version = 2)  # format version 2 has a 14-byte header
> md5(.t[seq(15, length(.t))])
md5 83:77:77:73:fa:04:72:47:72:3a:d5:a2:55:96:31:44
Long answer
The result of serialize(1:100, connection = NULL) differs if the R version differs.
According to the source code of base::serialize, R writes some integers that represent the R version during serialization.
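To make those integers visible, here is a minimal sketch in base R that decodes the header of a serialized object. It assumes format version 2 (requested explicitly, since R 3.6.0 and later default to version 3, which has a longer header) and the default XDR binary format; decode_version is a hypothetical helper name introduced only for this illustration.

```r
# Serialize with format version 2 so the header is exactly 14 bytes
# (assumption for this sketch; the default in R >= 3.6.0 is version 3).
test <- 1:100
bytes <- serialize(test, connection = NULL, version = 2)

header <- bytes[1:14]

# Bytes 1-2: format marker, "X\n" for the default XDR binary format.
marker <- rawToChar(header[1:2])

# Bytes 3-14: three big-endian 4-byte integers.
ints <- readBin(header[3:14], what = "integer", n = 3, size = 4, endian = "big")
format_version  <- ints[1]  # serialization format version
writer_code     <- ints[2]  # R version that wrote the data
min_reader_code <- ints[3]  # oldest R version able to read the data

# R packs a version as major * 65536 + minor * 256 + patch.
decode_version <- function(code) {
  paste(code %/% 65536L, (code %/% 256L) %% 256L, code %% 256L, sep = ".")
}

writer_version     <- decode_version(writer_code)
min_reader_version <- decode_version(min_reader_code)

print(format_version)
print(writer_version)      # should agree with getRversion()
print(min_reader_version)
```

Because writer_code changes with every R release, any hash computed over the full byte vector changes with it.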
digest::digest skips these bytes before calculating the MD5 sum, so the result stays consistent.
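The effect of skipping the header can be sketched without waiting for a different R release: alter the version bytes by hand and compare hashes. This is only an illustration under assumptions: it uses format version 2 (14-byte header), a hypothetical helper md5_of_bytes built on tools::md5sum from base R, and it simulates "another R version" by changing the patch byte of the writer version. The final comparison against digest's skip argument runs only if the digest package is installed.

```r
# Hypothetical helper: MD5 of a raw vector using only base R.
md5_of_bytes <- function(bytes) {
  path <- tempfile()
  on.exit(unlink(path))
  writeBin(bytes, path)
  unname(tools::md5sum(path))
}

bytes <- serialize(1:100, connection = NULL, version = 2)

# Pretend a different patch release wrote the data:
# byte 10 is the patch component of the writer version.
other <- bytes
other[10] <- as.raw((as.integer(bytes[10]) + 1L) %% 256L)

# Hashing everything picks up the version change ...
full_differs <- md5_of_bytes(bytes) != md5_of_bytes(other)
# ... hashing only the bytes after the 14-byte header does not.
payload_same <- md5_of_bytes(bytes[-(1:14)]) == md5_of_bytes(other[-(1:14)])
print(c(full_differs = full_differs, payload_same = payload_same))

# Optional cross-check with digest's skip argument on the raw bytes.
if (requireNamespace("digest", quietly = TRUE)) {
  with_skip <- digest::digest(bytes, algo = "md5", serialize = FALSE, skip = 14)
  print(identical(with_skip, md5_of_bytes(bytes[-(1:14)])))
}
```

Hashing the full vector corresponds to what md5(serialize(...)) from openssl does in the question, while hashing from byte 15 onward corresponds to the behaviour described for digest.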