将电话号码编码到字和搜索字典 [英] Encoding phone number to word and searching dictionary

查看:148
本文介绍了将电话号码编码到字和搜索字典的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一个作业。我已经尝试通过堆栈溢出搜索。



我需要阅读一个充满电话号码的文本文件。这部分我知道如何做。



接下来,我需要根据以下编码将电话号码转换为所有可能的字排列:

  E | J N Q | R W X | D S Y | F T | A M | C I V | B K U | L O P | G H Z 

e | j n q | r w x | d s y | f t |一个m | c i v | b k u | l o p | g h z

0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

并检查它们与我被给予阅读的字典文件。 (阅读部分我已经知道如何做)



有人可以建议一个很好的算法来编码电话号码和检查字典吗?



迄今为止,我尝试过的一切都抛出了一个OutOfMemory异常,并且赋值指定不一定是这样。

解决方案

这里是:

  static char [] [] letters = { Ee.toCharArray(),// 0 
JNQjnq.toCharArray(),// 1
RWXrwx.toCharArray()// 2
// ....添加剩余
};

static Set< String> words = new HashSet(Arrays.asList(
Stack,
Overflow,
foo,
bar
//。添加其他词
));

// main方法
public void generateAndCheckAll(String number){
int [] num = new int [number.length()]; //存储号码为数字
int [] ind = new int [number.length()]; // store letter indices
int i = 0;
for(char c:number.toCharArray()){//将String转换为int数组
num [i ++] = c - '0';
}
do {
String word = generateWord(num,ind);
System.out.println(word +(words.contains(word)?包含:不包含));
} while(next(num,ind));
}


//通过数组和索引数组生成单词
public String generateWord(int [] num,int [] ind){
StringBuilder sb = new StringBuilder(); (int i = 0; i< ind.length; i ++)
sb.append(letters [num [i]] [ind [i]]);

return sb.toString();
}


//将索引数组切换到下一个
public boolean next(int [] num,int [] ind){
int i = ind.length - 1;
while(i> = 0&& ind [i] == letters [num [i]]。length - 1)//搜索可以增加的索引
i-- ;
if(i< 0)
return false;
ind [i] ++; //增加这样的索引
for(int j = i + 1; j< num.length; j ++)//将剩余索引更改为0
ind [j] = 0;
返回true;
}

主要思想是在每次迭代中存储字母索引数组,并生成一个基于它的新词。例如,对于号码 201 和索引数组 [0,0,5] 单词将生成REq :数字的第0个字母 2 ,数字的第0个字母 0 和第五个字母的数字 1



然后,这个词在字典中被查找,索引数组被切换到下一个。从最后扫描我们搜索可以增加的索引(这样不会在给定位置的数字溢出字母数):我们不能增加5(因为数字只有6个字母 1 ),但是我们可以增加0。所以索引数组更改为 [0,1,0] ,一切都重复一遍,直到所有索引都被允许为最大值。



要查看它是如何工作的,请逐步调试此代码。对于许多其他类似的任务来说,真的很容易理解并应用相同的想法。






顺便说一句,这不是自从字典大小(我认为10-100K)以来,最有效的方法很可能会小于组合计数(10位数的约60M)。如果您不需要生成所有组合,则可以执行字典的准备:


  1. 为每个单词生成一个数字在字典中,


  2. 将每个生成的数字映射到相应的单词(单词),并收集哈希集合中的映射。


然后,对于您收到的号码,您只需查看给定号码是否存在单词(单词)。这种方法需要较长的准备阶段和更多的空间,但查找速度显着增加。


I have an assignment. I have tried searching through stack overflow.

I need to read a text file full of phone numbers. This part I know how to do.

Next I need to convert the phone numbers to all possible permutations of words based on the following encoding:

E | J N Q | R W X | D S Y | F T | A M | C I V | B K U | L O P | G H Z

e | j n q | r w x | d s y | f t | a m | c i v | b k u | l o p | g h z

0 |   1   |   2   |   3   |  4  |  5  |   6   |   7   |   8   |   9

and check them against a dictionary file which i am given to read. (the reading part I already know how to do)

Can someone please suggest a good algorithm for encoding the phone numbers and checking against a dictionary?

Everything I have tried so far threw a OutOfMemory exception and the assignment specifies that this must not be the case.

解决方案

Here it is:

static char[][] letters = {"Ee".toCharArray(),      // 0
                           "JNQjnq".toCharArray(),  // 1
                           "RWXrwx".toCharArray()   // 2
                           //.... add remaining
                          };

static Set<String> words = new HashSet<>(Arrays.asList(
                           "Stack", 
                           "Overflow", 
                           "foo", 
                           "bar" 
                           //... add other words
                           ));

// main method
public void generateAndCheckAll(String number) {
    int[] num = new int[number.length()];         // store number as digits
    int[] ind = new int[number.length()];         // store letter indices
    int i = 0;
    for (char c : number.toCharArray()) {         // convert String to int array
        num[i++] = c - '0';
    }
    do {
        String word = generateWord(num, ind);
        System.out.println(word + (words.contains(word) ? " is contained" : " is not contained"));
    } while (next(num, ind));
}


// generate word by number array and index array
public String generateWord(int[] num, int[] ind) {
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < ind.length; i++) 
        sb.append(letters[num[i]][ind[i]]);
    return sb.toString();
}


// switch index array to next
public boolean next(int[] num, int[] ind) {
    int i = ind.length - 1;
    while (i >= 0 && ind[i] == letters[num[i]].length - 1) // search for the index that can be incremented
        i--;
    if (i < 0)
        return false;
    ind[i]++;                                              // increment such index
    for (int j = i + 1; j < num.length; j++)               // change remaining indices to 0
        ind[j] = 0;
    return true;
}

Main idea here is to store array of letter indices on every iteration and generate a new word based on it. For example, for number 201 and index array [0, 0, 5] the word REq will be generated: 0-th letter for digit 2, 0-th letter for digit 0 and 5-th letter for digit 1.

Then this word is looked-up in dictionary and the index array is switched to next. Scanning from the end we search index that can be incremented (so that will not overflow number of letters for the digit at the given position): we cannot increment "5" (because there are only 6 letters for digit 1), but we can increment "0". So index array is changed to [0, 1, 0] and everything is repeated again and again until all indices are maximum allowed.

To see how it works in action, debug this code in step-by-step mode. It is really easy to understand and apply the same idea to many other similar tasks.


By the way this is not the most effective approach since dictionary size (10-100K I suppose) is very likely to be smaller than combination count (~60M for 10-digit number). If you don't really need to generate all combinations, you may perform preparation of dictionary:

  1. Generate a number for each word in dictionary,

  2. map each generated number to corresponding word (words) and collect those mapping in hash collection.

Then, for number you receive, you just look if there is a word (words) exists for a given number. This method requires long preparation stage and more space, but look-up speed is increased significantly.

这篇关于将电话号码编码到字和搜索字典的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆