std::map performance
I am a C programmer graduating to C++. As an exercise I wrote a
program to count the number of times that different words occur in a
text file. Though a hash table might have been a better choice, I
chose to use std::map. However, the program runs at less than half the
speed of an equivalent program that uses ordinary handcoded binary
trees. The result is not changed very much if I replace std::string as
the key with a simple string class of my own (I guess I cannot use
char* because of the value semantics of the standard containers). I am
using GNU libstdc++3 and gcc 3.3 with the -O3 (all optimizations)
flag. Is this a problem with my program, my C++ implementation, or
just the necessary overhead of using a more general class?
The structure of the program is as follows:
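#include <limits>
#include <map>
#include <string>

std::map<std::string, long> words;

void insert(const char *s);

int main()
{
    char buf[maxbuf];
    while (/* not end of input */)
    {
        // Get next word into buf
        insert(buf);
    }
    // Print out the words
}

// Increment the count for word s, saturating at the largest long.
void insert(const char *s)
{
    long &l = words[s];  // operator[] inserts a zero count for a new word
    if (l < std::numeric_limits<long>::max())
        ++l;
}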
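For readers who want something that compiles, here is a minimal sketch of the same structure. The post does not say how words are extracted, so reading whitespace-separated words with operator>> from a std::istream is an assumption, and the name count_words is made up for illustration.

```cpp
#include <istream>
#include <limits>
#include <map>
#include <sstream>  // only needed for the std::istringstream usage example
#include <string>

// Counts whitespace-separated words read from `in`.
// Assumption: operator>> tokenisation stands in for the unspecified
// "get next word into buf" step of the posted program.
std::map<std::string, long> count_words(std::istream& in)
{
    std::map<std::string, long> words;
    std::string word;
    while (in >> word) {
        long& count = words[word];  // a new key starts at 0
        if (count < std::numeric_limits<long>::max())
            ++count;                // saturate instead of overflowing
    }
    return words;
}
```

Passing a std::istringstream makes it easy to try on a short string; passing a std::ifstream or std::cin would count a real file.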
jmoy wrote:
> while (/*Not end of input*/)
Odds are that this is your bottleneck. How are you reading the file?
Jacques.
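One way to check this suspicion is to time the reading loop with and without the map insertion. The sketch below rests on several assumptions: the Timing struct and both function names are invented here, tokenising with operator>> is not necessarily what the posted program does, and std::chrono needs C++11, which postdates the gcc 3.3 mentioned in the thread.

```cpp
#include <chrono>
#include <istream>
#include <map>
#include <sstream>  // only needed for the std::istringstream usage example
#include <string>

// Hypothetical result type: number of words seen and elapsed seconds.
struct Timing {
    long words;
    double seconds;
};

// Reads every word but does nothing with it: measures input cost alone.
Timing time_read_only(std::istream& in)
{
    const auto start = std::chrono::steady_clock::now();
    long n = 0;
    std::string word;
    while (in >> word)
        ++n;
    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return {n, elapsed.count()};
}

// Reads every word and counts it in a std::map: input cost plus map cost.
Timing time_read_and_count(std::istream& in)
{
    const auto start = std::chrono::steady_clock::now();
    std::map<std::string, long> words;
    long n = 0;
    std::string word;
    while (in >> word) {
        ++words[word];
        ++n;
    }
    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return {n, elapsed.count()};
}
```

If the two timings on the same large file are close, the reading loop dominates and the choice of container matters little; if they differ widely, the map is the place to look.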
"jmoy" <na******@yahoo.co.in> wrote in message
news:8d*************************@posting.google.co m...
> I am a C programmer graduating to C++. As an exercise I wrote a
> program to count the number of times that different words occur in a
> text file. Though a hash table might have been a better choice, I
> chose to use std::map. However, the program runs at less than half the
> speed of an equivalent program that uses ordinary handcoded binary
> trees. The result is not changed very much if I replace std::string as
> the key with a simple string class of my own (I guess I cannot use
> char* because of the value semantics of the standard containers). I am
> using GNU libstdc++3 and gcc 3.3 with the -O3 (all optimizations)
> flag. Is this a problem with my program, my C++ implementation, or
> just the necessary overhead of using a more general class?
Your program looks fine to me; I can't comment on your implementation.
I would expect a hand-coded binary tree to be faster than std::map provided
the data is unordered. Try your hand-coded binary tree on a large file that
is already in alphabetical order and I think you'll see a big performance
hit (assuming your binary tree code is alphabetically ordered). In other
words, you are paying a price for using std::map over a binary tree, but
std::map guarantees better worst-case performance.
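The effect of sorted input can be made concrete with a small experiment. This is a sketch only: Node, bst_insert and height_after_inserting are hypothetical names, and the tree is a deliberately naive unbalanced one, not the original poster's code, which the thread never shows. It uses C++11 (std::unique_ptr).

```cpp
#include <algorithm>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Naive unbalanced binary search tree node (illustrative only).
struct Node {
    std::string key;
    long count = 0;
    std::unique_ptr<Node> left, right;
    explicit Node(const std::string& k) : key(k) {}
};

// Inserts or increments `key`; returns the depth of its node (root = 1).
std::size_t bst_insert(std::unique_ptr<Node>& root, const std::string& key)
{
    std::unique_ptr<Node>* link = &root;
    std::size_t depth = 1;
    while (*link) {
        if (key < (*link)->key)
            link = &(*link)->left;
        else if ((*link)->key < key)
            link = &(*link)->right;
        else
            break;  // key already present
        ++depth;
    }
    if (!*link)
        link->reset(new Node(key));
    ++(*link)->count;
    return depth;
}

// Height of the tree built by inserting `keys` in the given order.
// Note: node destruction is recursive, so keep experiments to modest sizes.
std::size_t height_after_inserting(const std::vector<std::string>& keys)
{
    std::unique_ptr<Node> root;
    std::size_t height = 0;
    for (const std::string& k : keys)
        height = std::max(height, bst_insert(root, k));
    return height;
}
```

Inserting the seven keys a to g in alphabetical order yields a height of 7, which is a linked list in effect, while inserting them middle-first (d, b, f, a, c, e, g) yields a height of 3. std::map is typically a balanced red-black tree, so its height stays logarithmic for either order.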
A hash table would have been better; it's coming to standard C++ soon!
john
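Hash tables did arrive: std::unordered_map has been part of the standard library since C++11. A minimal sketch of the same word count built on it follows. The function name count_words_hashed and the operator>> tokenisation are assumptions made for the example, not something taken from the thread.

```cpp
#include <istream>
#include <sstream>  // only needed for the std::istringstream usage example
#include <string>
#include <unordered_map>

// Counts whitespace-separated words using a hash table instead of a tree.
// Assumption: words are extracted with operator>>.
std::unordered_map<std::string, long> count_words_hashed(std::istream& in)
{
    std::unordered_map<std::string, long> words;
    std::string word;
    while (in >> word)
        ++words[word];  // a new key starts at 0, then is incremented
    return words;
}
```

Unlike std::map, iteration order here is unspecified, so the words would need sorting separately if alphabetical output is wanted.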
Hi,
[snip]
> void insert(char *s)
> {
>     long &l=words[s];
>     if (l<numeric_limits<long>::max())
>         ++l;
> }
In my experience, using operator[] with maps isn't a very good idea if
you want performance.
I would suggest trying both a version with find() followed by a conditional
insert() and a version using the insert() method that returns
std::pair<iterator,bool>.
If you try it, please post results.
Cheers
Max
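The three approaches under discussion can be written side by side as below. This is a sketch, not a benchmark: the WordMap typedef and the function names are invented for illustration, the overflow guard from the original insert() is left out for brevity, and which variant is fastest depends on the library implementation, so it needs measuring.

```cpp
#include <map>
#include <string>
#include <utility>

typedef std::map<std::string, long> WordMap;

// Variant 1: operator[], as in the original post.
void count_with_subscript(WordMap& words, const std::string& word)
{
    ++words[word];  // default-constructs a 0 count for a new key
}

// Variant 2: find() followed by a conditional insert().
void count_with_find(WordMap& words, const std::string& word)
{
    WordMap::iterator it = words.find(word);
    if (it == words.end())
        words.insert(WordMap::value_type(word, 1));
    else
        ++it->second;
}

// Variant 3: the insert() overload returning std::pair<iterator, bool>.
void count_with_insert(WordMap& words, const std::string& word)
{
    // If the key exists, insert() leaves it alone and returns its iterator.
    std::pair<WordMap::iterator, bool> result =
        words.insert(WordMap::value_type(word, 0));
    ++result.first->second;
}
```

All three leave the map in the same state. The find() version searches the tree twice when the word is new (once in find(), once in insert()), whereas the pair-returning insert() needs a single search and operator[] typically does too.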