Most efficient way to match (a few) strings in C?
Our system needs to accept user input from a terminal and match against a few known keyword strings (maybe 10).
We don't have the space/computrons to do regexp etc., code needs to be tiny & quick.
Now, the nasty way to do this is:
// str is null-terminated, assume we know it's safe/sane here
if (!strncmp(str, "hello", 5))
{
    do_hello();
}
else if (!strncmp(str, "world", 5))
{
    do_world();
}
else
{
    meh(); // Wasn't a match
}
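One detail of the if-else chain is worth a quick check: `strncmp(str, "hello", 5)` compares only the first five characters, so longer input such as `helloworld` also counts as a match. The sketch below contrasts that with a whole-string `strcmp`. The function names `starts_with_hello` and `is_hello` are made up for illustration and are not part of the original code.

```c
#include <string.h>

/* Prefix test, as in the if-else chain: only the first 5 characters are compared. */
int starts_with_hello(const char *str)
{
    return strncmp(str, "hello", 5) == 0;
}

/* Whole-string test: the terminating NUL takes part in the comparison. */
int is_hello(const char *str)
{
    return strcmp(str, "hello") == 0;
}
```

Whether prefix matching is wanted depends on how the terminal input is tokenised. If the keyword has to be the entire token, `strcmp` is probably the safer choice.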
So, after a bit of googling & reading I'm being convinced that a nicer way is to pre-compute the hash of the various matches as an int, and then just use a case statement:
// Assume hash() stops at the terminating NUL
switch (hash(str))
{
    case HASH_OF_HELLO:
        do_hello();
        break;
    case HASH_OF_WORLD:
        do_world();
        break;
    default:
        meh();
        break;
}
We can compute the *HASH_OF_match* at compile time. This seems potentially a faster / more elegant way to pick a string from a relatively small set.
So - does this seem reasonable? / Is there a glaring problem with doing this? / Anyone got a more elegant way of doing it?
As a footnote, this is the nicest looking hash algorithm I've seen this afternoon ;), credited to Dan Bernstein. It looks up to the job at hand.
unsigned int
get_hash(const char* s)
{
    unsigned int hash = 0;
    int c;
    while ((c = *s++))
    {
        // hash = hash * 33 ^ c
        hash = ((hash << 5) + hash) ^ c;
    }
    return hash;
}
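C will not evaluate `get_hash("hello")` at compile time on its own, so the `HASH_OF_*` constants have to come from somewhere. One possible way, sketched here under assumptions, is to repeat the hash step as a macro so that each constant is an integer constant expression usable as a `case` label. `HASH_STEP`, `HASH5`, `enum keyword` and `classify` are hypothetical names introduced for this example, and `HASH5` only covers five-character keywords. Because a hash match alone does not prove the strings are equal, each case re-checks with `strcmp`.

```c
#include <string.h>

/* One step of the hash from the question: hash = (hash * 33) ^ c.
   Written as a macro so it can appear in a constant expression. */
#define HASH_STEP(h, ch) ((((h) << 5) + (h)) ^ (unsigned int)(ch))

/* Hypothetical helper for five-character keywords only;
   other lengths would need their own macro. */
#define HASH5(c1, c2, c3, c4, c5) \
    HASH_STEP(HASH_STEP(HASH_STEP(HASH_STEP(HASH_STEP(0u, c1), c2), c3), c4), c5)

#define HASH_OF_HELLO HASH5('h', 'e', 'l', 'l', 'o')
#define HASH_OF_WORLD HASH5('w', 'o', 'r', 'l', 'd')

enum keyword { KW_NONE, KW_HELLO, KW_WORLD };

/* Runtime version of the same hash; stops at the terminating NUL. */
unsigned int get_hash(const char *s)
{
    unsigned int hash = 0;
    int c;
    while ((c = (unsigned char)*s++)) {
        hash = ((hash << 5) + hash) ^ (unsigned int)c;
    }
    return hash;
}

/* Pick the candidate by hash, then confirm it with strcmp so that a
   colliding input is reported as "no match" rather than misrouted. */
enum keyword classify(const char *str)
{
    switch (get_hash(str)) {
    case HASH_OF_HELLO:
        return strcmp(str, "hello") == 0 ? KW_HELLO : KW_NONE;
    case HASH_OF_WORLD:
        return strcmp(str, "world") == 0 ? KW_WORLD : KW_NONE;
    default:
        return KW_NONE;
    }
}
```

A caller would then switch on the result of `classify(str)` and call `do_hello()`, `do_world()` or `meh()`. If two keywords ever hashed to the same value, the compiler would reject the duplicate `case` labels, which is a useful early warning.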
The problem with hashing is that an arbitrary string entered by the user may generate the same hash as one of your matches, and you'll execute the wrong stuff. For a search set as small as 10 I'd just stick to the if-else approach. Or use a string array and function pointer array (assuming all functions have the same signature) to select the function to execute.
char const *matches[10] = {"first", "second", ..., "tenth"};
void (*fn[10])(void) = {&do_first, &do_second, ..., &do_tenth};
for (int i = 0; i < 10; ++i) {
    if (strcmp(str, matches[i]) == 0) {
        (*fn[i])();
        break; // keywords are distinct, so stop after the first match
    }
}
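A variation on the two parallel arrays, offered as a sketch rather than the answerer's own code, is to keep each keyword next to its handler in a single table of structs so the two cannot drift out of step. The names `struct command`, `handler_fn`, `find_handler` and `last_called` are assumptions made for this example, and only two of the keywords are filled in.

```c
#include <stddef.h>
#include <string.h>

typedef void (*handler_fn)(void);

struct command {
    const char *name;   /* keyword to match exactly */
    handler_fn  fn;     /* function to run on a match */
};

/* Hypothetical handlers; they only record which one ran. */
int last_called = 0;
void do_first(void)  { last_called = 1; }
void do_second(void) { last_called = 2; }

static const struct command commands[] = {
    { "first",  do_first  },
    { "second", do_second },
};

/* Returns the handler for str, or NULL when no keyword matches. */
handler_fn find_handler(const char *str)
{
    for (size_t i = 0; i < sizeof commands / sizeof commands[0]; ++i) {
        if (strcmp(str, commands[i].name) == 0) {
            return commands[i].fn;
        }
    }
    return NULL;
}
```

A caller can write `handler_fn h = find_handler(str);` and then call `h()` when it is not `NULL`, or `meh()` otherwise. With about ten short keywords, a linear scan using `strcmp` is likely to be cheap enough that hashing adds little.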