将字符串分割成字 [英] Split string into words

查看：124 发布时间：2015/11/30 16:08:36 algorithm string dictionary word substring

本文介绍了将字符串分割成字的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

我在寻找最有效的算法来形成文字的所有可能的组合从一个字符串。例如：

 输入字符串：forevercarrot

输出：

永远的红萝卜
永远的车腐
永远胡萝卜
永远的车腐

（所有话应该是从字典）。

我能想到的蛮力的方法。（查找所有可能的子字符串和匹配），但会有什么更好的办法？

解决方案

使用 preFIX树您已知单词的列表。也许库如中的myspell 已经这样做。尝试使用现成的。

一旦你找到了一个匹配（例如'车'），分割你的计算：一个分支开始寻找一个新字（'腐'），另一个继续探索当前开头（胡萝卜）的变种

你有效地保持对队列（start_position，current_position）偏移到你的字符串每次分割计算。多个线程可以从此队列并行流行，并尝试继续从 start_position 启动和已经知道可达 current_position ，但还没有结束。当找到一个单词，据报道与另一对从该队列弹出。如果这是不可能的，没有结果的产生。当一个分裂发生时，一个新的对被添加到队列的末尾。最初，队列中包含（0,0）。

I am looking for the most efficient algorithm to form all possible combinations of words from a string. For example:

Input String: forevercarrot

Output:

forever carrot
forever car rot
for ever carrot
for ever car rot

(All words should be from a dictionary).



I can think of a brute force approach. (find all possible substrings and match) but what would be better ways?
 解决方案 
Use a prefix tree for your list of known words. Probably libs like myspell already do so. Try using a ready-made one.

Once you found a match (e.g. 'car'), split your computation: one branch starts to look for a new word ('rot'), another continues to explore variants of current beginning ('carrot').

Effectively you maintain a queue of pairs (start_position, current_position) of offsets into your string every time you split the computation. Several threads can pop from this queue in parallel and try to continue a word that starts from start_position and is already known up to current_position of the pair, but does not end there. When a word is found, it is reported and another pair is popped from the queue. When it's impossible, no result is generated. When a split occurs, a new pair is added to the end of the queue. Initially the queue contains a (0,0).

                        这篇关于将字符串分割成字的文章就介绍到这了，希望我们推荐的答案对大家有所帮助，也希望大家多多支持IT屋！


                    
                        查看全文

将字符串分割成字 [英] Split string into words

问题描述

相关文章

C/C++最新文章

热门教程

热门工具

登录关闭

将字符串分割成字 [英] Split string into words

问题描述

相关文章

C/C++最新文章

热门教程

热门工具

登录 关闭

登录关闭