解析莫尔斯电码 [英] Parsing morse code

查看:101
本文介绍了解析莫尔斯电码的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在尝试解决此问题. 目标是在给定单词词典的情况下,确定莫尔斯弦的解释方式. 我所做的是,我首先将字典中的单词翻译"为莫尔斯语.然后,我使用了一个朴素的算法,寻找可以递归解释它的所有方式.

I am trying to solve this problem. The goal is to determine the number of ways a morse string can be interpreted, given a dictionary of word. What I did is that I first "translated" words from my dictionary into morse. Then, I used a naive algorithm, searching for all the ways it can be interpreted recursively.

#include <iostream>
#include <vector>
#include <map>
#include <string>
#include <iterator>
using namespace std;

string morse_string;
int morse_string_size;
map<char, string> morse_table;
unsigned int sol;

void matches(int i, int factor, vector<string> &dictionary) {
    int suffix_length = morse_string_size-i;
    if (suffix_length <= 0) {
        sol += factor;
        return;
    }
    map<int, int> c;
    for (vector<string>::iterator it = dictionary.begin() ; it != dictionary.end() ; it++) {
        if (((*it).size() <= suffix_length) && (morse_string.substr(i, (*it).size()) == *it)) {
            if (c.find((*it).size()) == c.end())
                c[(*it).size()] = 0;
            else
                c[(*it).size()]++;
        }
    }

    for (map<int, int>::iterator it = c.begin() ; it != c.end() ; it++) {
        matches(i+it->first, factor*(it->second), dictionary);
    }
}

string encode_morse(string s) {
    string ret = "";
    for (unsigned int i = 0 ; i < s.length() ; ++i) {
        ret += morse_table[s[i]];
    }
    return ret;
}

int main() {
    morse_table['A'] = ".-"; morse_table['B'] = "-..."; morse_table['C'] = "-.-."; morse_table['D'] = "-.."; morse_table['E'] = "."; morse_table['F'] = "..-."; morse_table['G'] = "--."; morse_table['H'] = "...."; morse_table['I'] = ".."; morse_table['J'] = ".---"; morse_table['K'] = "-.-"; morse_table['L'] = ".-.."; morse_table['M'] = "--"; morse_table['N'] = "-."; morse_table['O'] = "---"; morse_table['P'] = ".--."; morse_table['Q'] = "--.-"; morse_table['R'] = ".-."; morse_table['S'] = "..."; morse_table['T'] = "-"; morse_table['U'] = "..-"; morse_table['V'] = "...-"; morse_table['W'] = ".--"; morse_table['X'] = "-..-"; morse_table['Y'] = "-.--"; morse_table['Z'] = "--..";
    int T, N;
    string tmp;
    vector<string> dictionary;
    cin >> T;

    while (T--) {
        morse_string = "";
        cin >> morse_string;
        morse_string_size = morse_string.size();
        cin >> N;
        for (int j = 0 ; j < N ; j++) {
            cin >> tmp;
            dictionary.push_back(encode_morse(tmp));
        }

        sol = 0;
        matches(0, 1, dictionary);
        cout << sol;

        if (T)
            cout << endl << endl;
    }

    return 0;
}

现在的事情是,我只允许3秒的执行时间,而我的算法在此时间限制内将无法工作.

Now the thing is that I only have 3 seconds of execution time allowed, and my algorithm won't work under this limit of time.

这是执行此操作的好方法吗?如果是,我想念的是什么?否则,您能否给出什么是好的策略的提示?

Is this the good way to do this and if so, what am I missing ? Otherwise, can you give some hints about what is a good strategy ?

字典中最多可以有10000个单词,而摩尔斯字符串中最多可以有1000个字符.

EDIT : There can be at most 10 000 words in the dictionary and at most 1000 characters in the morse string.

推荐答案

结合了动态编程和的解决方案滚动哈希应该可以解决此问题.

A solution that combines dynamic programming with a rolling hash should work for this problem.

让我们从一个简单的动态编程解决方案开始.我们分配一个向量,该向量将用于存储morse_string前缀的已知计数.然后,我们遍历morse_string,并在每个位置处遍历所有单词,然后回头查看它们是否适合morse_string.如果它们适合,那么我们将使用动态编程向量来确定构建morse_stringi-dictionaryWord.size()

Let's start with a simple dynamic programming solution. We allocate an vector which we will use to store known counts for prefixes of morse_string. We then iterate through morse_string and at each position we iterate through all words and we look back to see if they can fit into morse_string. If they can fit then we use the dynamic programming vector to determine how many ways we could have build the prefix of morse_string up to i-dictionaryWord.size()

vector<long>dp;
dp.push_back(1);
for (int i=0;i<morse_string.size();i++) {
   long count = 0;
   for (int j=1;j<dictionary.size();j++) {
       if (dictionary[j].size() > i) continue;
       if (dictionary[j] == morse_string.substring(i-dictionary[j].size(),i)) {
           count += dp[i-dictionary[j].size()];
       }
   }
   dp.push_back(count);
}
result = dp[morse_code.size()]

此解决方案的问题是速度太慢.假设Nmorse_string的长度,而Mdictionary的大小,而K是字典中最大单词的大小.它将执行O(N*M*K)操作.如果我们假设K=1000这是关于10^10的操作,那么在大多数计算机上这太慢了.

The problem with this solution is that it is too slow. Let's say that N is the length of morse_string and M is the size of the dictionary and K is the size of the largest word in the dictionary. It will do O(N*M*K) operations. If we assume K=1000 this is about 10^10 operations which is too slow on most machines.

K成本来自dictionary[j] == morse_string.substring(i-dictionary[j].size(),i)

如果我们可以将此字符串匹配速度提高到恒定或对数复杂度,那就可以了.这就是滚动哈希的来源.如果您构建morse_string的滚动哈希数组,则可以在O(1)中计算morse_string的任何子字符串的哈希.这样您就可以执行hash(dictionary[j]) == hash(morse_string.substring(i-dictionary[j].size(),i))

If we could speed up this string matching to constant or log complexity we would be okay. This is where rolling hashing comes in. If you build a rolling hash array of morse_string then the idea is that you can compute the hash of any substring of morse_string in O(1). So you could then do hash(dictionary[j]) == hash(morse_string.substring(i-dictionary[j].size(),i))

这很好,但是在散列不完善的情况下,您可能会从字典中获得多个具有相同散列的单词.这意味着在获得哈希匹配之后,您仍然需要匹配字符串和哈希.在编程竞赛中,人们通常会假设哈希是完美的,而跳过字符串匹配.这通常是一个安全的选择,尤其是在小词典上.如果它不能产生完美的哈希(您可以在代码中检入),则可以随时稍微调整哈希函数,也许调整后的哈希函数将产生完美的哈希.

This is good but in the presence of imperfect hashing you could have multiple words from the dictionary with the same hash. That would mean that after getting a hash match you would still need to match the strings as well as the hashes. In programming contests, people often assume perfect hashing and skip the string matching. This is often a safe bet especially on a small dictionary. In case it doesn't produce a perfect hashing (which you can check in code) you can always adjust your hash function slightly and maybe the adjusted hash function will produce a perfect hashing.

这篇关于解析莫尔斯电码的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆