Preparing a string for HMAC


Problem description

I am writing a web service that uses HMAC for message authentication. I am having trouble preparing the 'data' for the digest, and I get different digests for the same 'data' in Python and in NodeJS.

I am fairly sure this is an encoding issue, but I am not sure how best to approach it.

Python code:

import hmac
from hashlib import sha1

f = open('../test.txt')
raw = f.read()

raw = raw.strip()

hm = hmac.new('12345', raw, sha1)
res = hm.hexdigest()
print res

>> 5bff447a0fb82f3e7572d9fde362494f1ee2c25b

NodeJS (CoffeeScript) code:

fs = require 'fs'
http = require 'http'
{argv} = require 'optimist'
crypto = require 'crypto'

# Load the file
file = fs.readFileSync argv.file, 'utf-8'
file = file.trim()

# Create the signature
hash = crypto.createHmac('sha1', '12345').update(file).digest('hex')
console.log(hash)

>> a698f82ea8ff3c4e9ffe0670be2707c104d933aa

Also, raw (in Python) is 2 characters longer than file (in NodeJS), but I can't work out where these two characters come from.

Recommended answer

This is a problem with the encoding of the data you read from the filesystem; it has nothing to do with the algorithms you use.

When you work with string data in either Python or JavaScript, be very careful about the encoding your data is stored in. Work with the data either as strings (which, in particular, carry an encoding) or as raw bytes, and don't mix the two. When reading and signing data you usually shouldn't need to care about the encoding at all: keep the data as "raw" as your language allows.
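
To make the point concrete, here is a minimal Python 3 sketch (the question itself uses Python 2). It assumes a made-up sample string; only the key '12345' comes from the question. The same text encoded two different ways gives different bytes and therefore different HMAC digests:

```python
import hashlib
import hmac

KEY = b"12345"  # same demo key as in the question


def hmac_sha1_hex(data: bytes) -> str:
    """Return the hex HMAC-SHA1 of raw bytes."""
    return hmac.new(KEY, data, hashlib.sha1).hexdigest()


text = "caf\u00e9"  # hypothetical sample text with one non-ASCII character

utf8_bytes = text.encode("utf-8")      # the accented letter takes two bytes
latin1_bytes = text.encode("latin-1")  # the accented letter takes one byte

digest_utf8 = hmac_sha1_hex(utf8_bytes)
digest_latin1 = hmac_sha1_hex(latin1_bytes)

print(len(utf8_bytes), digest_utf8)
print(len(latin1_bytes), digest_latin1)
print("digests match:", digest_utf8 == digest_latin1)
```

The HMAC only ever sees the bytes, so both sides of a web service have to agree on the exact byte sequence being signed, not just on the text.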

A few notes:

  • The filesystem stores raw bytes and knows nothing about the contents or the encoding of your file. For some files (JPEGs, for example) the concept of an "encoding" is meaningless.
  • The same holds for crypto algorithms. They work with raw bytes and know nothing about their character representation. That is why digital signatures work so well with all sorts of binary documents.
  • trim() in JavaScript and strip() in Python work on strings, and their behaviour can vary depending on the underlying encoding (try u's '.encode('utf-16').strip().decode('utf-16') in Python, for example). If possible, I'd avoid trimming altogether, so that you don't mix the two ways of working with the data.
  • Python 2.x (and, I suppose, JavaScript too) has a set of rules for implicit conversion between strings and raw data.
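
The trimming note can be illustrated with a short Python 3 sketch. It is a variation on the one-liner above, not the same code: it uses the explicit "utf-16-le" codec (an assumption made here so that the bytes contain no byte-order mark and are identical on every platform). Stripping the encoded bytes does not do what stripping the string does:

```python
text_trailing = "s "   # trailing space
text_leading = " s"    # leading space

# Stripping the string removes the space, as expected.
stripped_text = text_trailing.strip()

# Stripping the encoded bytes only looks at single bytes at the ends.
encoded_trailing = text_trailing.encode("utf-16-le")  # b's\x00 \x00'
stripped_trailing = encoded_trailing.strip()          # last byte is 0x00, nothing removed

encoded_leading = text_leading.encode("utf-16-le")    # b' \x00s\x00'
stripped_leading = encoded_leading.strip()            # removes the lone 0x20 byte

print(repr(stripped_text))
print(stripped_trailing == encoded_trailing)
print(stripped_leading)

# The second case leaves an odd number of bytes, which is no longer valid UTF-16.
try:
    stripped_leading.decode("utf-16-le")
    decode_failed = False
except UnicodeDecodeError:
    decode_failed = True
print("decode failed:", decode_failed)
```

In one case the space survives, in the other the data is corrupted, which is why trimming is best done on decoded strings or not at all before signing.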

In your code you work with binary data in Python, but in JavaScript you convert it to a string when you specify the encoding of the file to read. Apparently the crypto module then performs some implicit conversion from the UTF-8 string back to raw bytes, but I don't know exactly what it does.
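
One plausible source of a length mismatch like the one in the question is that one side counts bytes and the other counts characters. This is only an assumption, since the contents of test.txt are not shown; the sample text below is made up. A Python 3 sketch:

```python
# Hypothetical file content (the real test.txt is not shown in the question):
# two characters that each take two bytes in UTF-8.
raw = "na\u00efve caf\u00e9".encode("utf-8")  # what a binary read returns
text = raw.decode("utf-8")                    # what a read with an encoding returns

byte_count = len(raw)    # counts bytes
char_count = len(text)   # counts code points
print(byte_count, char_count, byte_count - char_count)

# Latin-1 maps every byte to exactly one character, so a Latin-1 "string"
# keeps the byte count and converts back to the original bytes unchanged.
binary_text = raw.decode("latin-1")
print(len(binary_text) == byte_count)
print(binary_text.encode("latin-1") == raw)
```

With this sample the byte count exceeds the character count by two, one extra byte per non-ASCII character.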

As described here, the most kosher way of handling raw data in node.js is to use Buffers. You could read a Buffer from the filesystem but, unfortunately, the nodejs crypto library did not support them when this answer was written. As described here:

The Crypto module was added to Node before there was the concept of a unified Stream API, and before there were Buffer objects for handling binary data.

As such, the streaming classes don't have the typical methods found on other Node classes, and many methods accept and return Binary-encoded strings by default rather than Buffers.

That said, to make the example work, the current approach is to read the data by passing "binary" as the second argument to the call:

file = fs.readFileSync argv.file, "binary"

Also, as I said, I'd rather avoid stripping the data I just read from the file.
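
On the Python side, the same "raw bytes, no stripping" approach might look like the following Python 3 sketch. The helper name sign_file and the temporary file contents are hypothetical, introduced only for illustration; the key '12345' and SHA-1 come from the question:

```python
import hashlib
import hmac
import os
import tempfile


def sign_file(path: str, key: bytes) -> str:
    """HMAC-SHA1 over the file's exact bytes: no decoding, no stripping."""
    with open(path, "rb") as f:  # "rb" reads raw bytes
        data = f.read()
    return hmac.new(key, data, hashlib.sha1).hexdigest()


# Demo with a throwaway file (contents are made up for illustration).
payload = "caf\u00e9 \n".encode("utf-8")
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(payload)
    tmp_path = tmp.name

try:
    signature = sign_file(tmp_path, b"12345")
finally:
    os.remove(tmp_path)

# Signing the same bytes directly gives the same digest.
expected = hmac.new(b"12345", payload, hashlib.sha1).hexdigest()
print(signature)
print(hmac.compare_digest(signature, expected))
```

Because the file is never decoded, trailing whitespace and any non-ASCII bytes are signed exactly as stored, so any other implementation that signs the same bytes with the same key should produce the same digest.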
