基准 [英] Benchmarks
问题描述
任务:编写一个程序,从标准的
输入读取一组单词并打印不同单词的数量。
我遇到了一个列出一些程序来完成此任务的网站
任务: http ://unthought.net/c++/c_vs_c++.html (忽略所有语言
flaming :-),并认为所有这些都做了不必要的操作,
所以我写了自己的。但出于某种原因,我的版本比网站中的所有版本都要慢得多,尽管看起来好像
执行的操作更少(是的,我对它们进行了基准测试)我自己的电脑)。
根据网站,最慢的版本是:
#include< set>
#include< string>
#include< iostream>
int main(int argc,char ** argv)
{
//声明并初始化一些变量
std :: string word;
std :: set< std :: stringwordcount;
//读取单词并在rb-tree中插入
while(std :: cin> word)wordcount.insert(word);
//打印结果
std :: cout<< 单词: << wordcount.size()<< std :: endl;
返回0;
}
我的版本大约慢12倍。它使用较低级别的
结构。这是:
#include< stdio.h>
#include< stdlib.h>
#include < string.h>
#include< ctype.h>
struct SetNode
{
char * word;
struct SetNode * next;
};
//一组无序的单词< br $>
//
静态结构SetNode * gSet = 0;
static int gSetSize = 0;
#define kInitWordSize 32
//返回从标准输入读取的单词。返回的指针必须是
//用free()解除分配。
//
静态字符*
ReadOneWord(无效)
{
int ch = getchar();
while(ch!= EOF&&如果(ch == EOF)
返回0;
br />
char * word =(char *)malloc(kInitWordSize);
if(!word)
返回0;
int size = kInitWordSize;
int i = 0;
while(ch!= EOF&&!isspace(ch) )){
if(i> = size){
size * = 2;
char * newWord =( char *)realloc(单词,大小);
if(!newWord){
free(word);
return 0;
}
word = newWord;
}
word [i ++] = ch;
ch = getchar();
}
if(i> = size){
size * = 2;
char * newWord =(char *)realloc(word,size);
if (!newWord){
免费(字);
返回0;
}
word = newWord;
}
word [i] =''\ 0'';
返回单词;
}
//如果单词不在集合中,则在单词中插入一个单词。
//预期传递的字符串分配了
//一个内存分配函数,在传递给这个函数后应该被认为是
//丢失。
/ /
static void
InsertWord(char * aWord)
{
struct SetNode * node;
for(node = gSet;节点; node = node-> next){
if(strcmp(node-> word,aWord)== 0){
free(aWord);
返回;
}
}
node =(struct SetNode *)malloc(sizeof(struct SetNode) ));
如果(!节点){
免费(aWord);
返回;
}
node-> word = aWord;
node-> next = gSet;
gSet = node;
++ gSetSize;
}
静态无效
DeleteSet(无效)
$
struct SetNode * node = gSet;
struct SetNode * temp;
while(node){
temp = node;
node = node-> next;
free(temp-> word);
免费(临时);
}
gSet = 0;
gSetSize = 0;
}
int
main(无效)
{
char * word;
word = ReadOneWord()))
InsertWord(word);
:%d \ n",gSetSize);
//暂时清理清理...
// DeleteSet();
}
关于导致大幅放缓的原因有什么想法吗?
塞巴斯蒂安
s0****@gmail.com 写道:
< blockquote class =post_quotes>
任务:编写一个程序,从标准
输入读取一组单词并打印不同单词的数量。
我遇到了一个网站,列出了一些程序来完成这个
任务: http://unthought.net/c++/c_vs_c++.html (忽略所有语言
flaming :-),并认为所有这些都没有必要操作,
所以我写了自己的。但出于某种原因,我的版本比网站中的所有版本都要慢得多,尽管看起来好像
执行的操作更少(是的,我对它们进行了基准测试)我自己的电脑)。
根据网站,最慢的版本是:
#include< set>
#include< string>
#include< iostream>
int main(int argc,char ** argv)
{
//声明并初始化一些变量
std :: string word;
std :: set< std :: stringwordcount;
//读取单词并在rb-tree中插入
while(std :: cin> word)wordcount.insert(word);
//打印结果
std :: cout<< 单词: << wordcount.size()<< std :: endl;
返回0;
}
我的版本大约慢12倍。它使用较低级别的
结构。这是:
#include< stdio.h>
#include< stdlib.h>
#include < string.h>
#include< ctype.h>
struct SetNode
{
char * word;
struct SetNode * next;
};
这是一个线性列表。
//一组无序的单词< br $>
//
静态结构SetNode * gSet = 0;
static int gSetSize = 0;
#define kInitWordSize 32
//返回从标准输入读取的单词。返回的指针必须是
//用free()解除分配。
//
静态字符*
ReadOneWord(无效)
{
int ch = getchar();
while(ch!= EOF&&如果(ch == EOF)
返回0;
br />
char * word =(char *)malloc(kInitWordSize);
if(!word)
返回0;
int size = kInitWordSize;
int i = 0;
while(ch!= EOF&&!isspace(ch) )){
if(i> = size){
size * = 2;
char * newWord =( char *)realloc(单词,大小);
if(!newWord){
free(word);
return 0;
}
word = newWord;
}
word [i ++] = ch;
ch = getchar();
}
if(i> = size){
size * = 2;
char * newWord =(char *)realloc(word,size);
if(!newWord){
free(单词);
返回0;
}
word = newWord;
}
word [i] =''\ 0'';
返回单词;
}
//如果它不在集合中,则在单词中插入一个单词。
//预期传递的字符串已经分配了
// a内存分配函数,传递给这个函数后应该考虑
//丢失。
//
static void
InsertWord(char * aWord)
{
struct SetNode * node;
for(node = gSet;节点; node = node-> next){
if(strcmp(node-> word,aWord)== 0){
free(aWord);
返回;
}
}
在这里,您进行线性搜索。
std :: set<在内部维护一个(平衡的)树,因此每个单词的比较(对数与线性)相比更少(
)。
node =(struct SetNode *)malloc(sizeof(struct SetNode));
if(!node){
免费(aWord);
返回;
}
node-> word = aWord;
node-> next = gSet;
gSet = node;
++ gSetSize;
}
static void
DeleteSet(void)
{
struct SetNode * node = gSet;
struct SetNode * temp;
while(节点){
temp = node;
node = node-> next;
free(temp-> word);
free(temp);
}
gSet = 0;
gSetSize = 0;
}
int
main(无效)
{
char * word;
while((word = ReadOneWord() ))
InsertWord(word);
printf(" Words:%d \ n",gSetSize);
>
//暂时清理清理......
// DeleteSet();
}
任意关于导致大幅放缓的原因的想法?
选择次优数据结构。
最佳
Kai-Uwe Bux
2008年11月6日星期四07:53:18 -0800,s0suk3写道:
任务:编写一个程序,从标准输入中读取一组单词
并打印不同单词的数量。
我遇到了一个列出一些单词的网站完成此任务的程序
任务: http:// unthought。 net / c ++ / c_vs_c ++。html (忽略所有语言
flaming :-),并认为所有这些都做了不必要的操作,所以
我写的我自己的。但出于某种原因,我的版本变得更慢了,即使它似乎执行更少的操作(是的,我对它们进行了基准测试)我自己的电脑)。
根据网站,最慢的版本是:
#include< set>
#include< string>
#include< iostream>
int main(int argc,char ** argv)
{
//声明并初始化一些变量std :: string word;
std :: set< std :: stringwordcount;
//读取单词并在rb-tree中插入
while(std :: cin> word)wordcount.insert(word); //打印
结果
std :: cout<< 单词: << wordcount.size()<<的std :: ENDL;返回
0;
}
我的版本慢了大约12倍。它使用较低级别的
结构。这是:
[snip]
所以,因为你的版本使用低级构造你认为
它会自动更快吗?
-
OU
记住18日2008年6月,民主当天下午去世。
http://frapedia.se/wiki/ Information_in_English
s0 **** @ gmail.com 写道:
任务:编写一个程序,从标准
输入读取一组单词并打印不同单词的数量。
我遇到一个网站列出了一些程序来完成这个
任务: http://unthought.net/c++/c_vs_c++.html (忽略所有语言
flaming :-),并且认为所有这些都做了不必要的操作,
所以我写了自己的。但出于某种原因,我的版本比网站中的所有版本都要慢得多,尽管看起来好像
执行的操作更少(是的,我对它们进行了基准测试)我自己的电脑)。
根据网站,最慢的版本是:
#include< set>
#include< string>
#include< iostream>
int main(int argc,char ** argv)
{
//声明并初始化一些变量
std :: string word;
std :: set< std :: stringwordcount;
//读取单词并在rb-tree中插入
while(std :: cin> word)wordcount.insert(word);
//打印结果
std :: cout<< 单词: << wordcount.size()<< std :: endl;
返回0;
}
我的版本大约慢12倍。它使用较低级别的
结构。这是:
[snip]
//如果不是,则在单词中插入一个单词在集合中。
//传递的字符串预计已经分配了
//一个内存分配函数,它应该被认为是
//传递给此函数后丢失。
//
static void
InsertWord(char * aWord)
{
struct SetNode * node;
for(node = gSet; node; node = node-> next){
if(strcmp(node-> word,aWord)== 0){
free(aWord);
return;
}
}
您将单词组表示为链表。您将每个
新单词与集合中已有的每个单词进行比较。 C ++解决方案使用了一个
std :: set,如果我没记错的话,可以在O(n log n)中进行搜索和插入
。
如果你重写它以使用平衡二叉树,例如AVL
树,你应该得到类似于C ++版本的性能。
node =(struct SetNode *)malloc(sizeof(struct SetNode));
不正确,但
node = malloc(sizeof * node);
会更好。
if(!node){
free(aWord);
return;
}
如果malloc失败,你悄悄地返回而不做任何事情
处理错误或向用户报告。
[...]
-
Keith Thompson(The_Other_Keith) ks *** @ mib.org < http://www.ghoti.net/~kst>
诺基亚
我们必须做点什么。这是事情。因此,我们必须这样做。
- Antony Jay和Jonathan Lynn,是部长
The task: Write a program that reads a set of words from standard
input and prints the number of distinct words.
I came across a website that listed a few programs to accomplish this
task: http://unthought.net/c++/c_vs_c++.html (ignore all the language
flaming :-), and thought that all of them did unnecessary operations,
so I wrote my own. But for some reason, my version turned out slower
that ALL of the versions in the website, even though it seems to
perform less operations (yes, I benchmarked them on my own computer).
According to the website, the slowest version is:
#include <set>
#include <string>
#include <iostream>
int main(int argc, char **argv)
{
// Declare and Initialize some variables
std::string word;
std::set<std::stringwordcount;
// Read words and insert in rb-tree
while (std::cin >word) wordcount.insert(word);
// Print the result
std::cout << "Words: " << wordcount.size() << std::endl;
return 0;
}
My version is about 12 times slower than that. It uses lower-level
constructs. Here it is:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>
struct SetNode
{
char *word;
struct SetNode *next;
};
// An unorderd set of words
//
static struct SetNode *gSet = 0;
static int gSetSize = 0;
#define kInitWordSize 32
// Returns a word read from stdin. The returned pointer must be
// deallocated with free().
//
static char *
ReadOneWord(void)
{
int ch = getchar();
while (ch != EOF && isspace(ch))
ch = getchar();
if (ch == EOF)
return 0;
char *word = (char *) malloc(kInitWordSize);
if (!word)
return 0;
int size = kInitWordSize;
int i = 0;
while (ch != EOF && !isspace(ch)) {
if (i >= size) {
size *= 2;
char *newWord = (char *) realloc(word, size);
if (!newWord) {
free(word);
return 0;
}
word = newWord;
}
word[i++] = ch;
ch = getchar();
}
if (i >= size) {
size *= 2;
char *newWord = (char *) realloc(word, size);
if (!newWord) {
free(word);
return 0;
}
word = newWord;
}
word[i] = ''\0'';
return word;
}
// Inserts a word into the set if it isn''t in the set.
// The passed string is expected to have been allocated with
// a memory allocation function, and it should be considered
// lost after passed to this function.
//
static void
InsertWord(char *aWord)
{
struct SetNode *node;
for (node = gSet; node; node = node->next) {
if (strcmp(node->word, aWord) == 0) {
free(aWord);
return;
}
}
node = (struct SetNode *) malloc(sizeof(struct SetNode));
if (!node) {
free(aWord);
return;
}
node->word = aWord;
node->next = gSet;
gSet = node;
++gSetSize;
}
static void
DeleteSet(void)
{
struct SetNode *node = gSet;
struct SetNode *temp;
while (node) {
temp = node;
node = node->next;
free(temp->word);
free(temp);
}
gSet = 0;
gSetSize = 0;
}
int
main(void)
{
char *word;
while ((word = ReadOneWord()))
InsertWord(word);
printf("Words: %d\n", gSetSize);
// Skip cleanup for now...
//DeleteSet();
}
Any ideas as to what causes the big slowdown?
Sebastian
s0****@gmail.com wrote:
The task: Write a program that reads a set of words from standard
input and prints the number of distinct words.
I came across a website that listed a few programs to accomplish this
task: http://unthought.net/c++/c_vs_c++.html (ignore all the language
flaming :-), and thought that all of them did unnecessary operations,
so I wrote my own. But for some reason, my version turned out slower
that ALL of the versions in the website, even though it seems to
perform less operations (yes, I benchmarked them on my own computer).
According to the website, the slowest version is:
#include <set>
#include <string>
#include <iostream>
int main(int argc, char **argv)
{
// Declare and Initialize some variables
std::string word;
std::set<std::stringwordcount;
// Read words and insert in rb-tree
while (std::cin >word) wordcount.insert(word);
// Print the result
std::cout << "Words: " << wordcount.size() << std::endl;
return 0;
}
My version is about 12 times slower than that. It uses lower-level
constructs. Here it is:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>
struct SetNode
{
char *word;
struct SetNode *next;
};This is a linear list.
// An unorderd set of words
//
static struct SetNode *gSet = 0;
static int gSetSize = 0;
#define kInitWordSize 32
// Returns a word read from stdin. The returned pointer must be
// deallocated with free().
//
static char *
ReadOneWord(void)
{
int ch = getchar();
while (ch != EOF && isspace(ch))
ch = getchar();
if (ch == EOF)
return 0;
char *word = (char *) malloc(kInitWordSize);
if (!word)
return 0;
int size = kInitWordSize;
int i = 0;
while (ch != EOF && !isspace(ch)) {
if (i >= size) {
size *= 2;
char *newWord = (char *) realloc(word, size);
if (!newWord) {
free(word);
return 0;
}
word = newWord;
}
word[i++] = ch;
ch = getchar();
}
if (i >= size) {
size *= 2;
char *newWord = (char *) realloc(word, size);
if (!newWord) {
free(word);
return 0;
}
word = newWord;
}
word[i] = ''\0'';
return word;
}
// Inserts a word into the set if it isn''t in the set.
// The passed string is expected to have been allocated with
// a memory allocation function, and it should be considered
// lost after passed to this function.
//
static void
InsertWord(char *aWord)
{
struct SetNode *node;
for (node = gSet; node; node = node->next) {
if (strcmp(node->word, aWord) == 0) {
free(aWord);
return;
}
}Here, you do a linear search.
std::set<maintains a (balanced) tree internally and therefore does fewer
comparisons per word (logarithmic vs. linear).
node = (struct SetNode *) malloc(sizeof(struct SetNode));
if (!node) {
free(aWord);
return;
}
node->word = aWord;
node->next = gSet;
gSet = node;
++gSetSize;
}
static void
DeleteSet(void)
{
struct SetNode *node = gSet;
struct SetNode *temp;
while (node) {
temp = node;
node = node->next;
free(temp->word);
free(temp);
}
gSet = 0;
gSetSize = 0;
}
int
main(void)
{
char *word;
while ((word = ReadOneWord()))
InsertWord(word);
printf("Words: %d\n", gSetSize);
// Skip cleanup for now...
//DeleteSet();
}
Any ideas as to what causes the big slowdown?Choice of a sub-optimal data structure.
Best
Kai-Uwe Bux
On Thu, 06 Nov 2008 07:53:18 -0800, s0suk3 wrote:
The task: Write a program that reads a set of words from standard input
and prints the number of distinct words.
I came across a website that listed a few programs to accomplish this
task: http://unthought.net/c++/c_vs_c++.html (ignore all the language
flaming :-), and thought that all of them did unnecessary operations, so
I wrote my own. But for some reason, my version turned out slower that
ALL of the versions in the website, even though it seems to perform less
operations (yes, I benchmarked them on my own computer).
According to the website, the slowest version is:
#include <set>
#include <string>
#include <iostream>
int main(int argc, char **argv)
{
// Declare and Initialize some variables std::string word;
std::set<std::stringwordcount;
// Read words and insert in rb-tree
while (std::cin >word) wordcount.insert(word); // Print the
result
std::cout << "Words: " << wordcount.size() << std::endl; return
0;
}
My version is about 12 times slower than that. It uses lower-level
constructs. Here it is:
[snip]
So, since your version uses "lower-level constructs" you assume
it would be automatically faster?
--
OU
Remember 18th of June 2008, Democracy died that afternoon.
http://frapedia.se/wiki/Information_in_English
s0****@gmail.com writes:The task: Write a program that reads a set of words from standard
input and prints the number of distinct words.
I came across a website that listed a few programs to accomplish this
task: http://unthought.net/c++/c_vs_c++.html (ignore all the language
flaming :-), and thought that all of them did unnecessary operations,
so I wrote my own. But for some reason, my version turned out slower
that ALL of the versions in the website, even though it seems to
perform less operations (yes, I benchmarked them on my own computer).
According to the website, the slowest version is:
#include <set>
#include <string>
#include <iostream>
int main(int argc, char **argv)
{
// Declare and Initialize some variables
std::string word;
std::set<std::stringwordcount;
// Read words and insert in rb-tree
while (std::cin >word) wordcount.insert(word);
// Print the result
std::cout << "Words: " << wordcount.size() << std::endl;
return 0;
}
My version is about 12 times slower than that. It uses lower-level
constructs. Here it is:
[snip]
// Inserts a word into the set if it isn''t in the set.
// The passed string is expected to have been allocated with
// a memory allocation function, and it should be considered
// lost after passed to this function.
//
static void
InsertWord(char *aWord)
{
struct SetNode *node;
for (node = gSet; node; node = node->next) {
if (strcmp(node->word, aWord) == 0) {
free(aWord);
return;
}
}You represent your set of words as a linked list. You compare each
new word to every word already in the set. The C++ solution uses a
std::set which, if I recall correctly, can do searches and insertions
in O(n log n).
If you re-write this to use a balanced binary tree, such as an AVL
tree, you should get performance similar to the C++ version.
node = (struct SetNode *) malloc(sizeof(struct SetNode));Not incorrect, but
node = malloc(sizeof *node);
would be better.
if (!node) {
free(aWord);
return;
}And if malloc fails, you quietly return without doing anything to
handle the error or report it to the user.
[...]
--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
这篇关于基准的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!