How to do a character search in any order (12 letters from which 6 should form a word) with PHP?

Problem description

I have been thinking about this all day and can't figure out a memory-efficient and fast way to do it. The problem is:

For example, I have these letters: e f j l n r r t t u w x (12 letters).

I am looking for this word: TURTLE (6 letters).

How do I find all the possible words in the full set (12 letters) with PHP? (Or with Python, if that would be a lot easier?)

Things I have tried:

  • Using permutations: I generated all possible strings with a permutation algorithm, put them in an array (only the ones 6 characters long), and used in_array to check whether each one matches a word in my array of valid words (in this case containing TURTLE, but sometimes two or three words). This computation costs a lot of memory and time, especially with 6 or more characters to permute.

  • Creating a regex (I am bad at this): I wanted to create a regex that checks whether 6 of the 12 input characters make up a word from the "valid array". The problem is that we don't know which of the 12 letters will be the starting position, or the positions of the other letters.
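To put a rough number on the cost of the permutation approach, the small sketch below counts the ordered ways to pick 6 positions out of 12 letters. The function name countArrangements is made up for this illustration. The result is an upper bound on the number of distinct strings, because the repeated letters (r and t) produce duplicates.

```php
<?php
// Number of ordered selections of $k positions out of $n letters: n! / (n - k)!.
// Upper bound only: repeated letters generate some identical strings.
function countArrangements(int $n, int $k): int
{
    $total = 1;
    for ($i = 0; $i < $k; $i++) {
        $total *= ($n - $i);
    }
    return $total;
}

echo countArrangements(12, 6), PHP_EOL; // 665280
```

Each of those candidates then needs an in_array lookup, which is itself a linear scan of the word list, so the total work grows quickly.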

An example is: http://drawsomethingwords.net/

I hope you can help me with this problem, as I would really like to solve it. Thanks for your time :)

Recommended answer

I've encountered similar problems when writing a crossword editor (e.g., find all words of length 5 with a 'B' in the second position). Basically it comes down to:

  • Process a word list and organize the words by length (i.e., a list of all words of length 2, length 3, length 4, etc.). The reason is that you often know the length of the word(s) that you wish to search for. If you want to search for words of unknown length, you can repeat the search for each of the other word lists.
  • Insert each separate word list into a ternary search tree, which makes searching for words a lot faster. Each node in the tree contains a character and you can descend the tree to search for words. There are also specialized data structures such as a trie, but I have not (yet) explored them.
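The two steps above might look roughly like the following in PHP. This is a minimal sketch, assuming PHP 7.4 or later and lowercase single-byte words; the names groupByLength, TstNode, tstInsert and tstContains are invented for this example and are not taken from any existing implementation.

```php
<?php
// Group a word list by word length: [length => [words...]].
function groupByLength(array $words): array
{
    $byLength = [];
    foreach ($words as $word) {
        $word = strtolower(trim($word));
        if ($word !== '') {
            $byLength[strlen($word)][] = $word;
        }
    }
    return $byLength;
}

// One node of a ternary search tree: a character, three children,
// and a flag marking the end of a stored word.
final class TstNode
{
    public string $char;
    public ?TstNode $lo = null;   // subtree for smaller characters
    public ?TstNode $eq = null;   // subtree for the next character
    public ?TstNode $hi = null;   // subtree for larger characters
    public bool $isWord = false;

    public function __construct(string $char)
    {
        $this->char = $char;
    }
}

// Insert a non-empty word and return the (possibly new) subtree root.
function tstInsert(?TstNode $node, string $word, int $i = 0): TstNode
{
    if ($node === null) {
        $node = new TstNode($word[$i]);
    }
    $cmp = strcmp($word[$i], $node->char);
    if ($cmp < 0) {
        $node->lo = tstInsert($node->lo, $word, $i);
    } elseif ($cmp > 0) {
        $node->hi = tstInsert($node->hi, $word, $i);
    } elseif ($i + 1 < strlen($word)) {
        $node->eq = tstInsert($node->eq, $word, $i + 1);
    } else {
        $node->isWord = true;
    }
    return $node;
}

// Exact lookup: descend one character at a time.
function tstContains(?TstNode $node, string $word): bool
{
    $i = 0;
    $n = strlen($word);
    while ($node !== null && $i < $n) {
        $cmp = strcmp($word[$i], $node->char);
        if ($cmp < 0) {
            $node = $node->lo;
        } elseif ($cmp > 0) {
            $node = $node->hi;
        } elseif ($i === $n - 1) {
            return $node->isWord;
        } else {
            $node = $node->eq;
            $i++;
        }
    }
    return false;
}

// Build one tree per word length.
$trees = [];
foreach (groupByLength(['turtle', 'letter', 'flute', 'rule']) as $length => $words) {
    $root = null;
    foreach ($words as $w) {
        $root = tstInsert($root, $w);
    }
    $trees[$length] = $root;
}

var_dump(tstContains($trees[6], 'turtle')); // bool(true)
var_dump(tstContains($trees[6], 'turtel')); // bool(false)
```

Keeping one tree per length means a search for 6-letter words never touches words of any other length.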

Now for your problem, you could use the search tree to write a search function such as

function findWords($tree, $letters) {
   // ...
}

where $tree is the search tree containing the words of the length that you wish to search for and $letters is a list of valid characters. In your example, $letters would be the string efjlnrrttuwx.

The search tree allows you to search for words one character at a time, and you can keep track of the characters that you have encountered so far. As long as these characters are in the list of valid letters, you keep searching. Once you've encountered a leaf node in the search tree, you have found an existing word which you can add to the result. If you encounter a character which is not in $letters (or which has already been used up), you can skip that word and continue the search elsewhere in the search tree.
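One possible way to fill in findWords along these lines is sketched below. It is not the Palabra implementation. It assumes PHP 7.4 or later and lowercase single-byte letters, and it bundles a small ternary search tree (TstNode, tstInsert, both hypothetical names) so that the snippet runs on its own. Instead of testing for leaf nodes it uses an end-of-word flag, which amounts to the same thing when all words in the tree have equal length.

```php
<?php
final class TstNode
{
    public string $char;
    public ?TstNode $lo = null;
    public ?TstNode $eq = null;
    public ?TstNode $hi = null;
    public bool $isWord = false;

    public function __construct(string $char)
    {
        $this->char = $char;
    }
}

// Insert a non-empty word and return the (possibly new) subtree root.
function tstInsert(?TstNode $node, string $word, int $i = 0): TstNode
{
    if ($node === null) {
        $node = new TstNode($word[$i]);
    }
    $cmp = strcmp($word[$i], $node->char);
    if ($cmp < 0) {
        $node->lo = tstInsert($node->lo, $word, $i);
    } elseif ($cmp > 0) {
        $node->hi = tstInsert($node->hi, $word, $i);
    } elseif ($i + 1 < strlen($word)) {
        $node->eq = tstInsert($node->eq, $word, $i + 1);
    } else {
        $node->isWord = true;
    }
    return $node;
}

// Return every stored word that can be spelled from $letters,
// using each letter at most as often as it occurs in $letters.
function findWords(?TstNode $tree, string $letters): array
{
    $available = [];
    foreach (str_split($letters) as $c) {
        $available[$c] = ($available[$c] ?? 0) + 1;
    }
    $found = [];
    collectWords($tree, $available, '', $found);
    return $found;
}

function collectWords(?TstNode $node, array &$available, string $prefix, array &$found): void
{
    if ($node === null) {
        return;
    }
    // lo and hi hold alternative characters for the same position.
    collectWords($node->lo, $available, $prefix, $found);
    if (($available[$node->char] ?? 0) > 0) {
        $available[$node->char]--;            // use this letter
        $word = $prefix . $node->char;
        if ($node->isWord) {
            $found[] = $word;
        }
        collectWords($node->eq, $available, $word, $found);
        $available[$node->char]++;            // backtrack: give it back
    }
    collectWords($node->hi, $available, $prefix, $found);
}

// A tiny list of 6-letter words stands in for a real dictionary.
$tree = null;
foreach (['turtle', 'letter', 'turret', 'flower', 'winter', 'turner'] as $w) {
    $tree = tstInsert($tree, $w);
}

print_r(findWords($tree, 'efjlnrrttuwx')); // turner, turret, turtle
```

Here "letter" is rejected because the input has only one e, while "flower" and "winter" are dropped as soon as the search reaches a character that is not among the 12 letters. Whole subtrees are skipped that way, which is what makes this cheaper than generating permutations.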

My crossword editor Palabra contains an implementation of the above steps (a part is done in Python, but most of it is in C). It works fast enough for Ubuntu's default word list, which contains roughly 70K words.
