MySQL: UNIQUE text field using additional HASH field


Question

In my MySQL database I have a table defined as follows:

CREATE TABLE `mytablex_cs` (
  `id` mediumint(8) unsigned NOT NULL AUTO_INCREMENT,
  `tag` varchar(6) COLLATE utf8_bin NOT NULL DEFAULT '',
  `value` text COLLATE utf8_bin NOT NULL,
  PRIMARY KEY (`id`),
  KEY `kt` (`tag`),
  KEY `kv` (`value`(200))
) ENGINE=MyISAM AUTO_INCREMENT=7 DEFAULT CHARSET=utf8 COLLATE=utf8_bin



I need to enforce a UNIQUE constraint (key) on the value field.

I know that it is not yet possible to define a unique index on the entire value of a blob or text field, but there is a ticket(?) open to implement such a feature (see this page), where it has been suggested to create the unique key using a hash, as is already implemented for other fields.

Now I would like to use a similar approach: add another field to the table that will contain the hash, and create a unique key on that field.

I looked at possible ways to create this hash and, since I would like to avoid collisions (I need to insert several million entries), it seems that the RIPEMD-160 algorithm is the best choice, even though a quick search turned up several similar solutions that use SHA256, or even SHA1 and MD5.

I have no background in cryptography, so what are the downsides of choosing this approach?

Another question I have is: which algorithm does MySQL currently use to create the hash?

Recommended answer

Let's look at your requirements:

You need to ensure that the value field is unique. The value field is a text column and, by its nature, there is no way to create a unique index on it (for now). So using an extra field that holds the hash of the field value is your only real option here.

Advantages of this approach:

  1. Easy to calculate the hash.
  2. It is extremely rare for two different values to produce the same hash, so your hash values are almost guaranteed to be unique.
  3. Hashes are normally numeric values (expressed as hexadecimal) that can be indexed efficiently.
  4. The hashes won't take up a lot of space. Different hashing functions return hashes of different lengths, so play around with the different algorithms and test them to find one that suits your needs.
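
To get a feel for point 4, the output lengths can be compared directly. This is a minimal sketch using functions from the MySQL 5.5 function list; storing the raw bytes via UNHEX() in a BINARY column is a suggestion made here, not something the answer prescribes, and SHA2() is assumed to be available (it returns NULL on servers built without SSL support).

```sql
-- Length of the hexadecimal string each function returns
SELECT LENGTH(MD5('1234'))       AS md5_hex,     -- 32
       LENGTH(SHA1('1234'))      AS sha1_hex,    -- 40
       LENGTH(SHA2('1234', 256)) AS sha256_hex;  -- 64

-- Length in bytes after packing the hex string into raw binary
SELECT LENGTH(UNHEX(MD5('1234')))       AS md5_bytes,     -- 16
       LENGTH(UNHEX(SHA1('1234')))      AS sha1_bytes,    -- 20
       LENGTH(UNHEX(SHA2('1234', 256))) AS sha256_bytes;  -- 32
```

Storing the packed form halves the size of the indexed column compared with the hex string.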

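Disadvantages of this approach:

  1. There is an extra field to cater for during INSERTs and UPDATEs, i.e. there is more work to do.
  2. If you already have data in the table and it is in production, you will have to update the current data and hope that you don't have duplicates. Running the update will also take some time.
  3. Hash functions are CPU-intensive and may have a negative impact on CPU usage.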
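
For a table that already holds data, the backfill mentioned in point 2 could look roughly like the following. The column name `value_hash`, the key name `uk_value_hash`, and the choice of SHA1 stored as BINARY(20) are assumptions made for illustration; any of the supported hash functions could be substituted with a matching column width.

```sql
-- 1. Add the hash column as nullable so the existing rows are accepted
ALTER TABLE mytablex_cs
  ADD COLUMN value_hash BINARY(20) NULL;

-- 2. Backfill: SHA1() returns 40 hex characters, UNHEX() packs them into 20 bytes
UPDATE mytablex_cs
   SET value_hash = UNHEX(SHA1(value));

-- 3. List hashes shared by more than one row before adding the constraint
SELECT HEX(value_hash) AS hash_hex, COUNT(*) AS row_count
  FROM mytablex_cs
 GROUP BY value_hash
HAVING COUNT(*) > 1;

-- 4. Once step 3 returns no rows, enforce uniqueness
ALTER TABLE mytablex_cs
  MODIFY value_hash BINARY(20) NOT NULL,
  ADD UNIQUE KEY uk_value_hash (value_hash);
```

If step 3 does return rows, they are either true duplicates of the text or (far less likely) hash collisions, and they have to be resolved before step 4 can succeed.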

I assume you understand what a hash function does and, conceptually, how it works.

You can find a list of cryptographic functions here: http://dev.mysql.com/doc/refman/5.5/en//encryption-functions.html

As far as I know, MySQL supports the MD5, SHA, SHA1 and SHA2 hashing functions. Most if not all of these should be sufficient for plain hashing. Some functions, such as MD5, have known issues when used in cryptographic applications, e.g. as a signature algorithm in PKI. However, these issues should not matter much when you use the function to create a unique value, as it is not really being applied in a cryptographic context here.
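
How rare an accidental collision is can be estimated with the birthday bound. The following is a rough worked estimate, assuming the hash behaves like a uniformly random b-bit function on non-adversarial input and taking "several million" entries to mean n = 10^7 rows; both are assumptions, not figures from the question.

```latex
P(\text{collision}) \approx \frac{n(n-1)}{2 \cdot 2^{b}} \approx \frac{n^{2}}{2^{b+1}}

\text{MD5 } (b = 128):\quad
  \frac{(10^{7})^{2}}{2^{129}} \approx \frac{10^{14}}{6.8 \times 10^{38}} \approx 1.5 \times 10^{-25}

\text{SHA-1 or RIPEMD-160 } (b = 160):\quad
  \frac{(10^{7})^{2}}{2^{161}} \approx \frac{10^{14}}{2.9 \times 10^{48}} \approx 3.4 \times 10^{-35}
```

Under these assumptions, even the 128-bit MD5 leaves accidental collisions far too unlikely to matter at this scale.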

To use the MySQL hashing functions you can try the following examples:

SELECT MD5('1234');
SELECT SHA('1234');
SELECT SHA1('1234');
SELECT SHA2('1234', 224);
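
Applied to the table from the question, an insert and a lookup might look like the sketch below. It assumes a hypothetical `value_hash` BINARY(20) column with a unique key has been added to the table, and that the connection character set matches the column's (utf8), so that the hashed bytes are the same as the stored bytes.

```sql
-- Insert a row, computing the hash from the same text that is stored
INSERT INTO mytablex_cs (tag, value, value_hash)
VALUES ('abc', 'some long text value', UNHEX(SHA1('some long text value')));

-- Running the same INSERT again is rejected with a duplicate-key error (1062)

-- Look up by hash, then compare the text itself to rule out a hash collision
SELECT id, tag
  FROM mytablex_cs
 WHERE value_hash = UNHEX(SHA1('some long text value'))
   AND value = 'some long text value';
```

The lookup uses the unique index on the hash to find the candidate row, and the extra comparison on `value` confirms it is the same text.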

As with everything new, you should try all the approaches and find the one that will be most successful in your case.
