新的面向对象的并行模式匹配算法 [英] New object-oriented parallel pattern matching algorithm

查看:91
本文介绍了新的面向对象的并行模式匹配算法的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

GES算法


一种令人惊讶的简单并行模式匹配算法


部分是因为文献中提出的最佳算法<很难理解和实施,快速和实用算法的知识并不常见。


休谟和周日,Fast String Searching,软件 - 实践

和Experience,Vol。 21#11,pp 1221-48

在其他学科中,例如数学,解决复杂的问题通常会让人感到惊讶。在

软件的世界中,通常情况并非如此。最坏的情况是,软件

解决方案可能只是程序员混淆的路线图,因为他试图解决问题。在最好的情况下,这通常是b / b
揭示了程序员能够重现并且b $ b增加复杂性的自豪感,而不是直到

问题'天生的简洁性一目了然。


1985年,我被要求在我的第一个

编程中完成的任务之一工作,是创建一个用户可配置的终端

仿真程序。事实上,Dobb博士的期刊刚刚发表了一篇名为Parallel Pattern Matching and Fgrep的文章,由一位名叫Ian E. Ashdown的b $ b绅士发表了一篇名为Parallel Pattern Matching and Fgrep的文章。 。文章介绍了Aho-Corasick算法的C实现。我阅读了

文章,虽然很难修改,但是他们发布的代码改编为

创建我的终端模拟器。


几年前,我花了几个月的时间搬家,而不是编程,我正准备找工作。我决定打磨我的

C ++技能。之前使用过Aho-Corasick算法,我的目标是使用C ++以干净有效的方式实现它。

看过C中的几个实现,我他在Aho-Corasick系统的C ++中实现了一本受人尊敬的书,由Bryan Flamig编写的实用算法C ++,(1999 John Wiley&

Sons)。在阅读了这篇文章之后,我得出结论,该算法并不适合在C ++中实现干净或高效的实现。 Flaming在他的书中做了以下

评论:ConstructDFA()函数背后的逻辑

很难解释(作者无法理解它

他自己),所以我们把解释留给原来的Aho-Corasick

纸。看到这个,我就读了他们的论文。


从来没有学过FORTRAN,只有当我读完原始的

论文,并发现他们的算法是思考和说话

FORTRAN,这一切都对我有意义。我立刻想起了一个古老的编程笑话(我从未理解过的幽默),

确定的程序员可以在任何一个人写一个FORTRAN程序/>
language。我不想这样做,我问自己,只是如何

复杂可以创建一个有限状态自动机并行

模式匹配?


正如我发现的,问题本身很简单。虽然

Aho-Corasick系统已经被认为是创建并行模式匹配机器的唯一方式,但它的复杂性远远超过

超越了问题的复杂性,并且绝不是面向对象的



面向对象编程语言的引入一直是

是软件开发领域的重大进步,它允许我们以优雅的方式为我们解决问题的简单解决方案建模

。虽然

我们的系统适用于大多数当前编程语言中简单有效的实现b / b实现,但它特别适合面向对象的b $ b语言,例如C ++和

Java。 (事实上​​,我们的解决方案很简单,它可以使用三乘五索引卡实现
)!


我立即形成一家公司,Great East Software,并为我们的GES算法申请了
专利。该算法现已正在申请专利。


我们算法的简单性允许基于数组的状态转换,这意味着各种有效的实现<可以很容易地创建
。并且,在标准C ++中实现,我们创建的

原型大大优于Aho-Corasick算法的任何实现

,或者我们所拥有的其他任何东西
能够测试。


我的问题是这个 - 有没有人知道一个大型软件公司

可能对我的算法感兴趣?

The GES Algorithm

A Surprisingly Simple Algorithm for Parallel Pattern Matching

"Partially because the best algorithms presented in the literature
are difficult to understand and to implement, knowledge of fast and
practical algorithms is not commonplace."

Hume and Sunday, "Fast String Searching", Software - Practice
and Experience, Vol. 21 # 11, pp 1221-48

In other disciplines, such as mathematics, the solution to a complex
problem is frequently startling in its simplicity. In the world of
software, this is usually not the case. At its worst, a software
solution may simply be a road map of the programmer''s confusion as he
or she attempts to grasp the problem. At its best, this usually
reveals the programmer''s pride in being able to reproduce and
proliferate complexity, rather than to think things through until the
problem''s innate simplicity shines through.

In 1985, one of the tasks that I was asked to accomplish at my first
programming job, was that of creating a user configurable terminal
emulation program. As it happened, Dr. Dobb''s Journal had just
published an article called "Parallel Pattern Matching and Fgrep", by a
gentleman named Ian E. Ashdown. The article presented an
implementation, in C, of the Aho-Corasick algorithm. I read the
article and, difficult as it was, adapted the code they published to
create my terminal emulator.

A few years ago, I was about to look for a job, after having spent
several months relocating, and not programming. I decided to polish my
C++ skills. Having used the Aho-Corasick Algorithm previously, my goal
was to implement it, in a clean and efficient manner, using C++.
Having seen several implementations in C, I consulted a respected
book, Practical Algorithms in C++ by Bryan Flamig, (1999 John Wiley &
Sons) for his implementation in C++ of the Aho-Corasick system. After
reading this, I concluded that the algorithm did not lend itself to a
clean or efficient implementation in C++. Flaming makes the following
comment in his book: "The logic behind the ConstructDFA() function
is rather difficult to explain (the author has trouble understanding it
himself), so we leave the explanation to the original Aho-Corasick
paper." Upon seeing this, I read their paper.

Never having learned FORTRAN, it was only when I read their original
paper, and discovered that their algorithm was thinking and speaking
FORTRAN, that it all made sense to me. I was immediately reminded of
an old programming joke (the humor of which I have never understood),
"The determined programmer can write a FORTRAN program in any
language". Having no desire to do this, I asked myself, "Just how
complicated can the creation of a finite state automaton for parallel
pattern matching be?"

As I discovered, the problem itself is quite simple. While the
Aho-Corasick system has been, for thirty years, regarded as the only
way to create parallel pattern matching machines, its complexity far
surpasses the complexity of the problem, and it is in no way
object-oriented.

The introduction of object-oriented programming languages has been a
major step forward for the software development world, and it allows us
to model our simple solution to the problem with some elegance. While
our system lends itself to a straightforward and efficient
implementation in most current programming languages, it is
particularly well suited to object-oriented languages, such as C++ and
Java. (As a matter of fact, our solution is simple enough that it can
be implemented using three-by-five index cards)!

I immediately formed a company, Great East Software, and applied for a
patent for our GES algorithm. The algorithm is now patent-pending.

The simplicity of our algorithm allows for array-based state
transitions, which means that a variety of efficient implementations
may easily be created. And, implemented in standard C++, the
prototypes which we have created greatly outperform any implementations
of the Aho-Corasick algorithm, or anything else against which we have
been able to test.

My question is this - does anyone know of a large software company who
might be interested in my algorithm?

推荐答案

bp *** ***@greateastsoftware.com 写道:

[snip]
bp******@greateastsoftware.com wrote:
[snip]
我的问题是这个 - 有谁知道一家大型软件公司
可能对我的算法感兴趣?
My question is this - does anyone know of a large software company who
might be interested in my algorithm?




如果那是你的问题,那么你的问题就是偏离主题了。请参阅

常见问题解答,了解该小组的主题是什么和什么不是主题。

最好


Kai-Uwe Bux



If that is your question, then your question is off-topic. Please consult
the FAQ about what is and what is not off-topic in this group.
Best

Kai-Uwe Bux


bp ****** @ greateastsoftware。 com 写道:
bp******@greateastsoftware.com wrote:
(事实上,我们的解决方案非常简单,可以使用三乘五索引卡实现)!
(As a matter of fact, our solution is simple enough that it can
be implemented using three-by-five index cards)!




如果是这种情况,那么是什么让你认为你可以或应该为b $ b b专利呢?


-

Mike Smith



If that''s the case, then what makes you think that you can or should
patent it?

--
Mike Smith


Mike Smith写道:
Mike Smith wrote:
bp ****** @ greateastsoftware.com 写道:
bp******@greateastsoftware.com wrote:
(作为一个问题事实上,我们的解决方案非常简单,可以使用三乘五索引卡实现!
(As a matter of fact, our solution is simple enough that it can
be implemented using three-by-five index cards)!



如果是这种情况,那么是什么让你THI你可以或者应该申请专利吗?


If that''s the case, then what makes you think that you can or should
patent it?




三个字:一键购物。



Three Words: "One Click Shopping".

这篇关于新的面向对象的并行模式匹配算法的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆