Hashing a CSV file in Python with a key


Question

I have a CSV file with 1000+ email addresses that I want to hash using HMAC-SHA256 with a shared key, with the result encoded as Base64.

There was a similar problem here, but I can't adapt the solution to work for me. I am new to Python and I don't know where to change the code in order to make use of the shared key.

This is the slightly adapted code from the answer:

import csv
import hashlib
import hmac
import base64

IN_PATH = 'test.csv'
OUT_PATH = 'test_hashed.csv'
ENCODING = 'utf8'
HASH_COLUMNS = dict(Mail='md5')


def main():
    with open(IN_PATH, 'rt', encoding=ENCODING, newline='') as in_file, \
            open(OUT_PATH, 'wt', encoding=ENCODING, newline='') as out_file:
        reader = csv.DictReader(in_file)
        writer = csv.DictWriter(out_file, reader.fieldnames)
        writer.writeheader()
        for row in reader:
            for column, method in HASH_COLUMNS.items():
                data = row[column].encode(ENCODING)
                digest = hashlib.new(method, data).hexdigest()
                row[column] = '0x' + digest.upper()
            writer.writerow(row)

if __name__ == '__main__':
    main()

The input file (.csv) looks like this:

Mail
DHSKA@gmail.com
DJÖANw12@gmail.com
JSNÖS83@ymail.com
HDKDLSA@gmail.com
KKKDLAmS19@yamil.com

With the code above, the output file looks like this:

0xB6A77B6EB853CC4CC8342B312293FA9C
0xEB439592D8EEC2A38A597350EF80E512
0x833EB6AEC1D03D7D8C94606E0D749B80
0x8007D8D1702E8A749EBD6033A52A7897
0x415E067487C4A5FBDB86AB0F855DB114

But since I want to use an HMAC with a secret key and SHA-256, the above solution doesn't work for me, and I don't know how to incorporate this approach.

The key looks like this:

123Abc

I was trying to do something like this, but for the whole file:

import hmac
import hashlib
import base64

secret = "123Abc"
secret_bytes = bytes(secret, 'latin-1')
data = "DHSKA@gmail.com"
data_bytes = bytes(data, 'latin-1')

digest = hmac.new(secret_bytes, msg=data_bytes, digestmod=hashlib.sha256).digest()
signature = base64.b64encode(digest).decode()

So my question is: how can I incorporate HMAC-SHA256 hashing with a secret key into the code above? I just can't figure out which parameters to change.

Recommended answer

I don't think you need to trouble yourself with the dictionary. You don't have a variable number of columns here; you are applying your transformation to just one column.

It'll be easier to follow if you just put your working HMAC code into a function:

import hmac
import hashlib
import base64

secret = "123Abc"
secret_bytes = bytes(secret, 'latin-1')

def create_signature(email, secret_bytes):
    data_bytes = email.encode('latin-1')
    digest = hmac.new(secret_bytes, msg=data_bytes, digestmod=hashlib.sha256).digest()
    signature = base64.b64encode(digest).decode()
    return signature

This now neatly produces a Base64 string containing an HMAC digest from an email address and your (encoded) secret:

>>> create_signature('DHSKA@gmail.com', secret_bytes)
'3KaSw4QeA5l0rz49uutaDGemn4Et4CQnbnngm6mmpjE='
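
One caveat about the encoding: the code above encodes both the secret and the address as 'latin-1'. For ASCII-only addresses that gives the same bytes as UTF-8, but an address such as DJÖANw12@gmail.com encodes differently, so the signature changes. A small sketch to illustrate (the sign helper and its encoding parameter are hypothetical names of mine, not part of your code). Whichever encoding you pick has to match what the other party holding the shared key uses:

```python
import base64
import hashlib
import hmac


def sign(email, secret_bytes, encoding):
    # Hypothetical helper: same HMAC-SHA256 + Base64 steps as create_signature,
    # but with the text encoding passed in explicitly.
    data_bytes = email.encode(encoding)
    digest = hmac.new(secret_bytes, msg=data_bytes, digestmod=hashlib.sha256).digest()
    return base64.b64encode(digest).decode()


secret_bytes = b"123Abc"

# ASCII-only address: both encodings yield identical bytes, so identical signatures.
ascii_same = (sign("DHSKA@gmail.com", secret_bytes, "latin-1")
              == sign("DHSKA@gmail.com", secret_bytes, "utf-8"))

# 'Ö' is one byte in latin-1 but two bytes in UTF-8, so the signatures differ.
umlaut_same = (sign("DJÖANw12@gmail.com", secret_bytes, "latin-1")
               == sign("DJÖANw12@gmail.com", secret_bytes, "utf-8"))

print(ascii_same, umlaut_same)
```

This should print True False. Note also that 'latin-1' raises UnicodeEncodeError for characters outside its range, which UTF-8 would not.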

Now you can apply that to the 'Mail' column values, and write out the new CSV with the results:

with open(IN_PATH, 'rt', encoding=ENCODING, newline='') as in_file, \
        open(OUT_PATH, 'wt', encoding=ENCODING, newline='') as out_file:
    reader = csv.DictReader(in_file)
    writer = csv.DictWriter(out_file, reader.fieldnames)
    writer.writeheader()
    for row in reader:
        row['Mail'] = create_signature(row['Mail'], secret_bytes)
        writer.writerow(row)
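
Putting the pieces together, a self-contained version might look like the sketch below. The 'Mail' column, the encodings and the key come from your question; the hash_csv function name and its parameters are my own, chosen for illustration. It assumes the input file has a header row that contains the column being hashed.

```python
import base64
import csv
import hashlib
import hmac

ENCODING = 'utf8'
SECRET_BYTES = '123Abc'.encode('latin-1')


def create_signature(email, secret_bytes):
    # HMAC-SHA256 of the address, returned as a Base64 string.
    data_bytes = email.encode('latin-1')
    digest = hmac.new(secret_bytes, msg=data_bytes, digestmod=hashlib.sha256).digest()
    return base64.b64encode(digest).decode()


def hash_csv(in_path, out_path, secret_bytes, column='Mail'):
    # Hypothetical wrapper: copy the CSV row by row, replacing the values
    # of one column with their signatures. All other columns pass through.
    with open(in_path, 'rt', encoding=ENCODING, newline='') as in_file, \
            open(out_path, 'wt', encoding=ENCODING, newline='') as out_file:
        reader = csv.DictReader(in_file)
        writer = csv.DictWriter(out_file, reader.fieldnames)
        writer.writeheader()
        for row in reader:
            row[column] = create_signature(row[column], secret_bytes)
            writer.writerow(row)
```

With the file names from your question, you would call it as hash_csv('test.csv', 'test_hashed.csv', SECRET_BYTES).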

Demo:

>>> import sys, csv, io
>>> demo_input = io.StringIO('''\
... Mail
... DHSKA@gmail.com
... DJÖANw12@gmail.com
... JSNÖS83@ymail.com
... HDKDLSA@gmail.com
... KKKDLAmS19@yamil.com
... ''')
>>> demo_output = io.StringIO()
>>> with demo_input as in_file:
...     reader = csv.DictReader(in_file)
...     writer = csv.DictWriter(demo_output, reader.fieldnames)
...     writer.writeheader()
...     for row in reader:
...         row['Mail'] = create_signature(row['Mail'], secret_bytes)
...         writer.writerow(row)
...
46
46
46
46
46
>>> print(demo_output.getvalue())
Mail
3KaSw4QeA5l0rz49uutaDGemn4Et4CQnbnngm6mmpjE=
dP9IU66yKnYP/6mFRZ6TAAAN3lmxAcUPk9o1iFfpGDs=
ajNdCZF8ndw2SrgtSzcVCbeSpFsXI/Z6Ep0IC2fj+WU=
TgeFEj8CgvcQbVcLHTIIY1ULLnYkWAZaia5k01IQiJY=
Xu94abwV/5/HUXY+T3NpUgulGvew+L0UYzkPuRSv/98=
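
The repeated 46 in the session above is just the interactive interpreter echoing the return value of writer.writerow(), i.e. the number of characters written for each row. That matches what you would expect: a SHA-256 digest is 32 bytes, which Base64-encodes to 44 characters, plus the csv module's default '\r\n' line terminator. A quick check (row_length is a made-up variable name):

```python
import base64
import csv
import hashlib
import hmac
import io

digest = hmac.new(b"123Abc", msg=b"DHSKA@gmail.com", digestmod=hashlib.sha256).digest()
signature = base64.b64encode(digest).decode()

buffer = io.StringIO()
writer = csv.writer(buffer)
# writerow() passes back whatever the file object's write() returned,
# which for io.StringIO is the number of characters written.
row_length = writer.writerow([signature])

print(len(digest), len(signature), row_length)  # 32 44 46
```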
