Hashing (hiding) strings in Python

Question

What I need is to hash a string. It doesn't have to be secure, because it will only be a hidden phrase in a text file; it just must not be recognizable to the human eye.

It should not be just a random string, because when the user types the string I would like to hash it and compare it with an already hashed one (from the text file).

What would be the best approach for this purpose? Can it be done with the built-in classes?

Recommended answer

First off, let me say that you can't guarantee unique results. If you wanted unique results for all the strings in the universe, you'd be better off storing the string itself (or a compressed version).

More on that in a second. Let's get some hashes first.

You can use any of the main cryptographic hashes to hash a string in a few steps:

>>> import hashlib
>>> # hashlib works on bytes, so encode the string first (required on Python 3)
>>> sha = hashlib.sha1("I am a cat".encode("utf-8"))
>>> sha.hexdigest()
'576f38148ae68c924070538b45a8ef0f73ed8710'

As far as built-ins are concerned, you have a choice between SHA1, SHA224, SHA256, SHA384, SHA512, and MD5 (newer Python versions add more, such as SHA-3 and BLAKE2).

A hash function works by taking data of variable length and turning it into data of fixed length.

The fixed length, for each of the SHA algorithms built into hashlib, is the number of bits given in the name (except sha1, which is 160 bits). If you want more certainty that two strings won't end up in the same bucket (same hash value), pick a hash with a bigger digest (the fixed length).

In sorted order, these are the digest sizes you have to work with:

Algorithm  Digest Size (in bits)
md5        128
sha1       160
sha224     224
sha256     256
sha384     384
sha512     512
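
As a rough check of the table above, the sketch below asks hashlib for the digest each algorithm produces and measures its length. The helper name `digest_bits` is made up for illustration; `hashlib.new` is the standard-library call. It assumes all six algorithms are enabled in your Python build (md5 can be disabled on some restricted builds).

```python
import hashlib

ALGORITHMS = ["md5", "sha1", "sha224", "sha256", "sha384", "sha512"]

def digest_bits(name, data=b""):
    """Return the length, in bits, of the digest `name` produces for `data`."""
    return len(hashlib.new(name, data).digest()) * 8

for name in ALGORITHMS:
    # The input length varies a lot; the output length does not.
    short_input = digest_bits(name, b"moo")
    long_input = digest_bits(name, b"moo" * 10000)
    print(f"{name:<10} {short_input:>4} {long_input:>4}")
```

Both columns should agree for every algorithm, which is the "variable length in, fixed length out" property described earlier.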

The bigger the digest, the less likely you are to have a collision, provided your hash function is worth its salt.
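
To make the digest-size point concrete, here is a small sketch that deliberately shrinks a SHA-256 digest to one or two bytes and then looks for two different inputs with the same shortened digest. The function names and the "phrase0", "phrase1", ... inputs are invented for this illustration; truncating a digest is not something hashlib does for you, it is only a way to simulate a tiny digest.

```python
import hashlib
from itertools import count

def truncated_digest(text, n_bytes):
    """SHA-256 of `text`, cut down to its first `n_bytes` bytes."""
    return hashlib.sha256(text.encode("utf-8")).digest()[:n_bytes]

def first_collision(n_bytes):
    """Hash "phrase0", "phrase1", ... until two inputs share a truncated digest."""
    seen = {}
    for i in count():
        text = f"phrase{i}"
        key = truncated_digest(text, n_bytes)
        if key in seen:
            return seen[key], text
        seen[key] = text

# 8-bit digest: only 256 possible values, so a collision must appear
# within the first 257 inputs.
a, b = first_collision(1)
print(a, b, truncated_digest(a, 1).hex())

# 16-bit digest: 65536 possible values, so it takes longer on average.
c, d = first_collision(2)
print(c, d, truncated_digest(c, 2).hex())
```

With the full 256-bit digest the same search would not finish in any practical amount of time, which is why a larger digest gives more confidence that two phrases won't share a hash.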

The built-in hash() function returns integers, which could also be easy to use for the purpose you outline. There are problems, though.

>>> hash('moo')
6387157653034356308




  1. If your program is going to run on different systems, you can't be sure that hash() will return the same thing. In fact, I'm running on a 64-bit box using 64-bit Python. These values are going to be wildly different from those on 32-bit Python.

  2. For Python 3.3+, as @gnibbler pointed out, hash() of strings is randomized between runs. It will work within a single run, but almost definitely won't work across runs of your program (comparing against the text file you mentioned).
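
One way to see the randomization for yourself is to start fresh interpreters with different values of the PYTHONHASHSEED environment variable, which controls the seed Python uses for string hashing. This is a sketch for Python 3.7+; the helper name `hash_in_fresh_interpreter` is hypothetical. When the variable is unset, Python picks a random seed at startup, which is what a normal run of your program would get.

```python
import os
import subprocess
import sys

def hash_in_fresh_interpreter(text, seed):
    """Run hash(text) in a new Python process with PYTHONHASHSEED set to `seed`."""
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    result = subprocess.run(
        [sys.executable, "-c", f"print(hash({text!r}))"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout)

print(hash_in_fresh_interpreter("moo", 1))
print(hash_in_fresh_interpreter("moo", 2))  # different seed, different value
print(hash_in_fresh_interpreter("moo", 1))  # same seed as the first call
```

The first and third values should match while the second differs, so a hash() value written to a file in one run cannot be relied on in the next.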

Why would hash() be built that way? Well, the built-in hash is there for one specific reason: hash tables, dictionaries, and lookup tables in memory. It is not meant for cryptographic use, but for cheap lookups at runtime.

Don't use hash(), use hashlib.
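
Putting it together for the use case in the question, a minimal sketch might look like the following. The function names `hide` and `matches`, the file name `phrase.txt`, and the choice of SHA-256 are all assumptions made for this example; any of the hashlib algorithms listed above would work the same way.

```python
import hashlib
import os
import tempfile

def hide(phrase):
    """Return a hex SHA-256 digest of `phrase`; the same input always gives the same output."""
    return hashlib.sha256(phrase.encode("utf-8")).hexdigest()

def matches(typed, stored_hex):
    """Hash what the user typed and compare it with the stored digest."""
    return hide(typed) == stored_hex

# Write the hidden phrase to a text file (a temporary one for this demo).
path = os.path.join(tempfile.mkdtemp(), "phrase.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write(hide("I am a cat") + "\n")

# Later, possibly in another run: read it back and check the user's input.
with open(path, encoding="utf-8") as f:
    stored = f.read().strip()

print(matches("I am a cat", stored))
print(matches("I am a dog", stored))
```

The file only ever contains the hex digest, so the phrase is not readable at a glance, yet the comparison works across runs and machines.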
