检查是否字可以做出来给信​​件快 [英] Check if word can be made out of given letters fast

查看:94
本文介绍了检查是否字可以做出来给信​​件快的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一些信件和频率计数。我有一个很长的单词列表(1M说)。

I have some letters and frequency counts. And I have a very long list of words (1M say).

假设我有 A-1,B-1,D-1 (最多一个 A ,最多一个 B ,最多一个 D ),那么我可以让BAD ,而不是RAD

Suppose I have A-1, B-1, D-1 ("at most one A, at most one B, at most one D"), then I can make "BAD", but not "RAD"

我可以知道哪些话可以造出来的信件,在对数的时间,或者类似的东西,而不是通过所有单词迭代,看着每个字母的计数的单词?

Can I know in which words can be made out of those letters, in logarithmic time, or something like that, instead of iterating through all words and looking at the counts of each letter in the word?

可用于这些话是什么数据结构?一个线索可能?我不知道他们。这也将是巨大的,如果我可以存储所需的每个单词与它的字母。请帮帮忙!

What data structure can be used for these words? A trie maybe? I'm unaware of them. It would also be great if I can store letters required for each word with it. Please help!

推荐答案

下面是一个数据结构(字面)草图。

Here's a (literal) sketch of a data structure.

             [root]
         ----- | -----
       A1      A2     B1 ...
  ----/-    ---|---    -\----
 B1 C1 [a]  B1 B2 C1  C1 C2 D2 ...

这是一棵树,那里的叶节点都是在单词表中的单词。在叶节点的字字母组成的从根到该节点的路径的袋子正好组成。非叶节点都标有一个字母和一个计数。一个节点的子节点必须是叶(字)或有一封严格后来在字母表。因此,要以猫,你走的路径 A1,C1,T1 (和的行为)将是T1的一个孩子。在每个节点,您遍历具有数≤您输入计数(所以袋孩子 A3,C1,T2 ,你会遍历标记A1,A2,A3的任何节点,C1,T1或T2)。

It's a tree, where the leaf nodes are the words in the word list. The words at a leaf node are composed exactly of the bag of letters consisting of the path from the root to that node. Non-leaf nodes are labelled with a letter and a count. A child of a node must either be a leaf (a word) or have a letter strictly later in the alphabet. So, to get to "cat", you go down the path A1,C1,T1, and cat (and act) will be a child of T1. At each node, you traverse the children which have count ≤ your input count (so for the bag A3, C1, T2, you would traverse any node labelled A1,A2,A3, C1, T1 or T2).

遍历需要在最坏情况下O(n)时间(每个字匹配),但平均花费少得多。对于一个小的输入包,它将只遍历几个节点。对于一个大的输入包,它遍历多个节点,但它也可以找到很多的话。

The traversal takes O(n) time in the worst case (every word matches), but on average takes substantially less. For a small input bag, it will only traverse a few nodes. For a large input bag, it traverses many nodes, but it will also find many words.

的树包含的至多的在单词一览每个字母一个节点,所以它将具有大小至多成正比的单词表的长度。

The tree contains at most one node per letter in the wordlist, so it will have size at most proportional to the length of the wordlist.

这是可以计算并比较容易地存储在时间和节省空间的结构 - 它不会采取更多的空间比你的词库,并查询pretty的快

This is a time- and space- efficient structure which can be computed and stored relatively easily -- it won't take much more space than your wordlist, and queries pretty fast.

这篇关于检查是否字可以做出来给信​​件快的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
相关文章
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆