我的哈希表需要进行同行评审 [英] My hash table is in need of peer review

查看:94
本文介绍了我的哈希表需要进行同行评审的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

如果有人花一两分钟来审核我的哈希表b / b
表,我将不胜感激。它还没有被评论,但希望它很短,而且很有可能是可读的。一些代码(即

get_hash函数)是从我在

网上找到的各种片段中借来的。你的自由功能可能需要一些爱。我一直在考虑拥有所有参赛作品的第二个链表,以便

的免费费用与参赛作品的数量成比例。


#include< stdlib.h>

#include< string.h>


#define TABLE_SIZE 193


struct entry {

char * key;

char * value;

struct entry * next;

};


struct hash {

struct entry * bucket [TABLE_SIZE];

};


static unsigned long get_hash(const char * key)

{

unsigned long hash = 0;


while(* key){

hash =(hash * 191 + * key)%TABLE_SIZE;

key ++;

}


返回哈希值;

}


char * hash_lookup(struct hash * table,const char * key)

{

unsigned long hash = get_hash(key);

struct entry * p = table-> bucket [hash];


而(p){

if(!strcmp(key,p-> key))

break;

p = p-> next;

}


if(!p)

返回NULL;


返回p-> value;

}


int hash_insert(struct hash * table,const char * key,char * value)

{

unsigned long hash = get_hash(key);

struct entry * p = table-> bucket [hash];


while(p){

if(!strcmp(key,p-> key))

break;

p = p-> next;

}


if(!p){

p = malloc(sizeof(* p));

if(b) !p)

返回-1;


p-> key = malloc(strlen(key)+ 1);

if(!p-> key){

free(p);

返回-1;

}


strcpy(p-> key,key);

p-> next = table-> bucket [hash];

table - > bucket [hash] = p;

}


p-> value = value;


返回0;

}


stru ct hash * hash_new()

{

struct hash * table = malloc(sizeof(* table));

if(!table)

返回NULL;


memset(table,0,sizeof(* table));


返回表;

}


void hash_free(struct hash * table)

{

struct entry * p,* tmp;


for(int i = 0;我< TABLE_SIZE; i ++){

p = table-> bucket [i];

while(p){

tmp = p;

p = p-> next;

免费(tmp-> key);

免费(tmp);

}

}


免费(表格);

}

解决方案

在文章< 11 ********************** @ m73g2000cwd.googlegroups .com>,

Johan Tibell< jo ********** @ gmail.comwrote:


hash =(hash * 191 + * key)% TABLE_SIZE;



我没有阅读你的其余代码,但我建议不要在哈希函数中构建

表大小。使用更通用的哈希函数

生成一个int大小的值,并将其减少到表大小(%)

仅在需要选择桶时。


除了让你的哈希函数更可重用之外,你可以在条目中存储(大)哈希值,并避免很多(很可能全部)

strcmp()s首先比较哈希值。


- Richard


< a href =mailto:ri ***** @ cogsci.ed.ac.uk> ri ***** @ cogsci.ed.ac.uk (Richard Tobin)写道:


> Johan Tibell< jo ********** @ gmail.comwrote:


> hash =(hash * 191 + * key)%TABLE_SIZE;



我没有看到你的其余代码,但我建议不要在哈希函数中构建

表大小。



如果存储桶:表大小增加,则调整数组大小的例程

太大或太小。将需要重新计算哈希值,除非你存储了
(在执行%TABLE_SIZE之前的哈希)。如果是这样,表格大小

这是2的幂可能会更好。


除了使你的哈希函数更可重用之外,你然后可以在条目中存储(大)哈希值,并通过首先比较哈希值来避免很多(很可能全部)

strcmp()。



好​​吧,成功的查找需要至少一个strcmp()。


可能使用最多32位输入哈希值的类型。 (来自< stdint.h>

与C99,或使用#include< limits.h / #if UINT_MAX> = 0xffffffff /

typedef ... to支持C90。)


另一个小黑客:

struct entry {

char * value;

struct entry * next;

char key []; / *在C99中,或者键[1]在C89中* /

};

现在分配足够大的条目来保存结构和

字符串。在C89方式中,编译器可以知道这些键只是

1字节宽,所以符合标准的访问''key''的方式是

(char *)< entry_ptr + offsetof(struct entry,密钥)


-

Hallvard


文章< hb ***** *********@bombur.uio.no>,

Hallvard B Furuseth< h。********** @ usit.uio.nowrote:


>嗯,成功的查找至少需要一个strcmp()。



好​​点。虽然有一个好的64位哈希,你可以轻松获得另一个线程中提到的

99.99%的可靠性:-)


- Richard


I would be grateful if someone had a minute or two to review my hash
table implementation. It''s not yet commented but hopefully it''s short
and idiomatic enough to be readable. Some of the code (i.e. the
get_hash function) is borrowed from various snippets I found on the
net. Thee free function could probably need some love. I have been
thinking about having a second linked list of all entries so that the
cost of freeing is in proportion to the number of entries.

#include <stdlib.h>
#include <string.h>

#define TABLE_SIZE 193

struct entry {
char *key;
char *value;
struct entry *next;
};

struct hash {
struct entry *bucket[TABLE_SIZE];
};

static unsigned long get_hash(const char *key)
{
unsigned long hash = 0;

while (*key) {
hash = (hash * 191 + *key) % TABLE_SIZE;
key++;
}

return hash;
}

char *hash_lookup(struct hash *table, const char *key)
{
unsigned long hash = get_hash(key);
struct entry *p = table->bucket[hash];

while (p) {
if (!strcmp(key, p->key))
break;
p = p->next;
}

if (!p)
return NULL;

return p->value;
}

int hash_insert(struct hash *table, const char *key, char *value)
{
unsigned long hash = get_hash(key);
struct entry *p = table->bucket[hash];

while (p) {
if (!strcmp(key, p->key))
break;
p = p->next;
}

if (!p) {
p = malloc(sizeof(*p));
if (!p)
return -1;

p->key = malloc(strlen(key) + 1);
if (!p->key) {
free(p);
return -1;
}

strcpy(p->key, key);
p->next = table->bucket[hash];
table->bucket[hash] = p;
}

p->value = value;

return 0;
}

struct hash *hash_new()
{
struct hash *table = malloc(sizeof(*table));
if (!table)
return NULL;

memset(table, 0, sizeof(*table));

return table;
}

void hash_free(struct hash *table)
{
struct entry *p, *tmp;

for (int i = 0; i < TABLE_SIZE; i++) {
p = table->bucket[i];
while (p) {
tmp = p;
p = p->next;
free(tmp->key);
free(tmp);
}
}

free(table);
}

解决方案

In article <11**********************@m73g2000cwd.googlegroups .com>,
Johan Tibell <jo**********@gmail.comwrote:

hash = (hash * 191 + *key) % TABLE_SIZE;

I haven''t read the rest of your code, but I recommend not building the
table size into the hash function. Use a more general hash function that
produces an int-sized value, and reduce it to the table size (with %)
only when needed to choose the bucket.

Apart from making your hash function more reusable, you can then store
the (large) hash value in the entry, and avoid many (quite likely all)
strcmp()s by comparing the hash values first.

-- Richard


ri*****@cogsci.ed.ac.uk (Richard Tobin) writes:

>Johan Tibell <jo**********@gmail.comwrote:

> hash = (hash * 191 + *key) % TABLE_SIZE;


I haven''t read the rest of your code, but I recommend not building the
table size into the hash function.

Also a routine to resize the array if the buckets:table size grows
too large or too small. Will need to to recalculate the hash, unless
you store (the hash before doing % TABLE_SIZE). If so, a table size
which is a power of 2 might be better.

Apart from making your hash function more reusable, you can then store
the (large) hash value in the entry, and avoid many (quite likely all)
strcmp()s by comparing the hash values first.

Well, successful lookups will need at least one strcmp().

Might use a "max 32 bits" type for the hash value. (From <stdint.h>
with C99, or use #include <limits.h/ #if UINT_MAX >= 0xffffffff /
typedef ... to support C90.)

Another little hack:
struct entry {
char *value;
struct entry *next;
char key[]; /* in C99, or key[1] in C89 */
};
Now allocate entries large enough to hold both the struct and the
string. The C89 way, the compiler may "know" that keys are just
1 byte wide, so the standard-compliant way to access ''key'' is
(char *)<entry_ptr+ offsetof(struct entry, key)

--
Hallvard


In article <hb**************@bombur.uio.no>,
Hallvard B Furuseth <h.**********@usit.uio.nowrote:

>Well, successful lookups will need at least one strcmp().

Good point. Though with a good 64-bit hash you could easily get the
99.99% reliability mentioned in another thread :-)

-- Richard


这篇关于我的哈希表需要进行同行评审的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆