需要B-Tree建议 [英] B-Tree advice needed

查看:53
本文介绍了需要B-Tree建议的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

你好

我正在为T-SQL制作一个语法高亮显示器,为了速度,我要将

字硬编码到它里面(我会当SQL的下一个版本问世时,可能已经为新版本提供了足够的新功能。

我的主要算法将遍历所有的单词RICHEDIT

控制,并将它找到的每个有效单词传递给单词检查功能,

我想尽可能快地操作,因为它会被频繁调用。对于

这些目的,我打算尝试使用基于b树的算法。我的

想的是,如果说,它必须找到的词(假设)得到,

设置,并选择,那么它可能是
int getwordtype(word)

{

if((word [0] ==''g'')&&(word [ 1] ==''e'')&&(word [2] ==''t''))

返回WORDTYPE_GET;

if((单词[0] ==''s'')&&(单词[1] ==''e''))

{

if((单词[2] ==''t''))返回WORDTYPE_SET;

if((word [3] ==''l'')&&(word [4] == ''e'')&&(word [5] ==''c'')&&

(word [6] ==''t''))return WORDTYPE_SELECT;

}

返回WORDTYPE_NONE;

}


这是最快的方法吗?我确实想实现一个b树。


显然我不会用SQL中的所有单词

来输出整个算法,我是要用C#编写一个程序来生成.cpp文件,

会保存它。


任何提示?


谢谢!

解决方案

Bonj写道:

你好
我正在制作一个用于T-SQL的语法高亮显示器,为了速度,我将把这些单词硬编码到它中(我可能会为下一个新版本的新版本提供足够的新功能) SQL的版本出来了。
我的主要算法将遍历RICHEDIT
控件中的所有单词,并将它找到的每个有效单词传递给单词检查
函数我想尽可能快地操作,因为它会被频繁调用。出于这些目的,我打算尝试使用基于b树的算法。我的想法是,如果说,它有找到的词(假设)得到,设置和选择,那么它可能是
int getwordtype(word)
{
if((word [0] ==''g'')&&(word [1] ==''e'')&&(word [2] = =''t''))
返回WORDTYPE_GET;
if((word [0] =='s'')&&(word [1] ==''e''' ))
{
if((word [2] ==''t''))返回WORDTYPE_SET;
if((word [3] ==''l'')& ;&(word [4] ==''e'')&&(word [5] ==''c'')&&
(word [6] =='' t''))返回WORDTYPE_SELECT;
}
返回WORDTYPE_NONE;
}
这是最快的方法吗?我确实想要实现一个b树。

显然我不会用SQL中的所有单词输出整个算法,我将用C#编写一个程序来生成将保留它的
.cpp文件。




你想要的是TST - 三元搜索树 - 不是b树。


做一些谷歌搜索,你会发现很多参考。这是一个

描述Java中的TST算法。

http://www.javaworld.com/javaworld/j...6-ternary.html


至于生成内联C ++代码来实现算法,您可能需要

直接写入它并将其计时。有可能

全内联版本不会更快,甚至可能会因为

大大增加代码大小和分支速率而变慢。


-cd


用于符号表查找的传统结构是一个哈希表。

您可以直接使用std :: hash_map。


Keith MacDonald


" Bonj" <博** @ discussions.microsoft.com>在消息中写道

news:9C ********************************** @ microsof t.com ...

你好
我正在为T-SQL制作语法荧光笔,为了速度,我要将
字硬编码到它里面(当下一个SQL版本问世时,我可能已经想到了新版本的新功能。)
我的主要算法将遍历RICHEDIT中的所有单词
控制,并将它找到的每个有效单词传递给单词检查功能,
我想尽快操作,因为它会被频繁调用。为了这些目的,我打算尝试使用基于b树的算法。我的想法是,如果,例如,它必须找到的词(假设)
得到,
设置,并选择,那么它可能是形式
int getwordtype (字)
{
if((word [0] ==''g'')&&(word [1] ==''e'')&&(word [2] ==''t''))
返回WORDTYPE_GET;
if((word [0] =='s'')&&(word [1] ==' 'e''))
{
if((word [2] ==''t''))返回WORDTYPE_SET;
if((word [3] ==''l '')&&(word [4] ==''e'')&&(word [5] ==''c'')&&
(word [6] ==''t''))返回WORDTYPE_SELECT;
}
返回WORDTYPE_NONE;
}
这是最快的方法吗?我确实想要实现一个b树。

显然我不会用SQL中的所有
字来输出整个算法,我要写一个程序在C#中生成.cpp文件
将保留它。

任何提示?

谢谢!


Keith MacDonald写道:

用于符号表查找的传统结构是一个哈希表。您可以直接使用std :: hash_map。




对于一组固定的键(例如语言中的关键字),TST通常会给出

比哈希表好得多的性能。然而,哈希表通常是用于动态符号表的



-cd


Hello
I am making a syntax highlighter for T-SQL, and I am going to hardcode the
words into it for speed''s sake (I will probably have thought up enough new
features for a new version when the next version of SQL comes out).
My main algorithm is going to go through all the words in the RICHEDIT
control, and pass each valid word it finds to a word-checking function, which
I want to operate as fast as possible as it will be called frequently. For
these purposes, I was going to try and have a b-tree based algorithm. My
thinking is that if, say, the words it has to find are (hypothetically) get,
set, and select, then it could be of the form
int getwordtype(word)
{
if((word[0] == ''g'') && (word[1] == ''e'') && (word[2] == ''t''))
return WORDTYPE_GET;
if((word[0] == ''s'') && (word[1] == ''e''))
{
if((word[2] == ''t'')) return WORDTYPE_SET;
if((word[3] == ''l'') && (word[4] == ''e'') && (word[5] == ''c'') &&
(word[6] == ''t'')) return WORDTYPE_SELECT;
}
return WORDTYPE_NONE;
}

would this the fastest method? I do want to implement a b-tree.

Obviously I am not going to type out the whole algorithm with all the words
in SQL, I am going to write a program in C# to generate the .cpp file which
will hold it.

Any tips?

Thanks!

解决方案

Bonj wrote:

Hello
I am making a syntax highlighter for T-SQL, and I am going to
hardcode the words into it for speed''s sake (I will probably have
thought up enough new features for a new version when the next
version of SQL comes out).
My main algorithm is going to go through all the words in the RICHEDIT
control, and pass each valid word it finds to a word-checking
function, which I want to operate as fast as possible as it will be
called frequently. For these purposes, I was going to try and have a
b-tree based algorithm. My thinking is that if, say, the words it has
to find are (hypothetically) get, set, and select, then it could be
of the form
int getwordtype(word)
{
if((word[0] == ''g'') && (word[1] == ''e'') && (word[2] == ''t''))
return WORDTYPE_GET;
if((word[0] == ''s'') && (word[1] == ''e''))
{
if((word[2] == ''t'')) return WORDTYPE_SET;
if((word[3] == ''l'') && (word[4] == ''e'') && (word[5] == ''c'') &&
(word[6] == ''t'')) return WORDTYPE_SELECT;
}
return WORDTYPE_NONE;
}

would this the fastest method? I do want to implement a b-tree.

Obviously I am not going to type out the whole algorithm with all the
words in SQL, I am going to write a program in C# to generate the
.cpp file which will hold it.



What you want is a TST - Ternary Search Tree - not a b-tree.

Do some Googling and you''ll find numerous references. Here''s one that
describes TST algorithms in Java.

http://www.javaworld.com/javaworld/j...6-ternary.html

As for generating inline C++ code to implement the algorithm, you might want
to write it straightforwardly firsrt and time it. It''s possible that the
all-inline version won''t be faster, and could even be slower due to the
greatly increased code size and rate of branching.

-cd


The conventional structure used for symbol table lookup, is a hash table.
You could use std::hash_map straight out of the box.

Keith MacDonald

"Bonj" <Bo**@discussions.microsoft.com> wrote in message
news:9C**********************************@microsof t.com...

Hello
I am making a syntax highlighter for T-SQL, and I am going to hardcode the
words into it for speed''s sake (I will probably have thought up enough new
features for a new version when the next version of SQL comes out).
My main algorithm is going to go through all the words in the RICHEDIT
control, and pass each valid word it finds to a word-checking function,
which
I want to operate as fast as possible as it will be called frequently. For
these purposes, I was going to try and have a b-tree based algorithm. My
thinking is that if, say, the words it has to find are (hypothetically)
get,
set, and select, then it could be of the form
int getwordtype(word)
{
if((word[0] == ''g'') && (word[1] == ''e'') && (word[2] == ''t''))
return WORDTYPE_GET;
if((word[0] == ''s'') && (word[1] == ''e''))
{
if((word[2] == ''t'')) return WORDTYPE_SET;
if((word[3] == ''l'') && (word[4] == ''e'') && (word[5] == ''c'') &&
(word[6] == ''t'')) return WORDTYPE_SELECT;
}
return WORDTYPE_NONE;
}

would this the fastest method? I do want to implement a b-tree.

Obviously I am not going to type out the whole algorithm with all the
words
in SQL, I am going to write a program in C# to generate the .cpp file
which
will hold it.

Any tips?

Thanks!



Keith MacDonald wrote:

The conventional structure used for symbol table lookup, is a hash
table. You could use std::hash_map straight out of the box.



For a fixed set of keys (e.g. keywords in a language), a TST typically gives
far better performance than a hash table. Hash tables are very commonly
used for dynamic symbol tables, however.

-cd


这篇关于需要B-Tree建议的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆