Determine if one string is a prefix of another


Question

I have written a simple function that determines whether str1 is a prefix of str2. It looks like this (in JS):

function isPrefix(str1, str2) // determine if str1 is a prefix of the candidate string str2
{
    if (str2.length < str1.length) // candidate string can't be shorter than the prefix string
        return false;

    var i = 0;
    // check the bound first so the comparison never runs past the end of str1
    while (i < str1.length && str1.charAt(i) === str2.charAt(i))
        i++;
    // the loop stops before the end of str1 only when a character mismatched
    return i === str1.length;
}

As you can see, it loops over the entire length of the prefix string to check whether it is a prefix of the candidate string. This means its complexity is O(N), where N is the length of the prefix string. That isn't bad on its own, but it becomes a problem when I have a huge data set to loop through to determine which strings start with the prefix string. The complexity then becomes O(M*N), where M is the total number of strings in the data set. Not good.

I explored the Internet a bit and concluded that the best answer would be a Patricia/Radix trie, where strings are stored by their shared prefixes. Even then, when I attempt to insert or look up a string, there will be considerable overhead in string matching if I use the aforementioned prefix-checking function.

Say I had a prefix string 'rom' and a set of candidate words:

var dataset = ["random", "rapid", "romance", "romania", "rome", "rose"];

In a radix trie, I would want it to look like this:

         r
       /    \
     a       o
    / \     / \
ndom pid  se  m
             / \
           an   e
          /  \
        ia   ce

This means that at every node I will be using the prefix-matching function to determine which child's value matches the prefix string at the current index. Somehow this solution still seems arduous and does not sit well with me. Is there something better, or any way I can improve the core prefix-matching function?

Recommended answer

It looks like you've got two different problems.

One is to determine whether a string is contained as a prefix in another string. For this I would suggest using a function already implemented in the language's string library. In JavaScript you could do this:

if (str2.indexOf(str1) === 0) {
    // string str1 is a prefix of str2
}

See the documentation for String.indexOf here: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/indexOf
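
One caveat worth hedging: when str1 is not at the start, indexOf keeps scanning the rest of str2 before giving up, so a check anchored at position 0 may avoid that extra work. A minimal sketch of two such checks, assuming an ES2015+ runtime for startsWith; the helper names isPrefixStartsWith and isPrefixLastIndexOf are made up for illustration and are not part of any library:

```javascript
// Hypothetical helper names; both checks only look at the start of str2.
function isPrefixStartsWith(str1, str2) {
    // String.prototype.startsWith is available in ES2015 and later
    return str2.startsWith(str1);
}

function isPrefixLastIndexOf(str1, str2) {
    // lastIndexOf searches backwards from index 0, so only position 0 can match
    return str2.lastIndexOf(str1, 0) === 0;
}
```

Either helper could stand in for the indexOf test above; which one is faster in practice depends on the engine, so it is worth measuring on your own data.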

The other problem is, given a bunch of strings, to find out which ones have a given string as a prefix. If you want fast look-ups, building a data structure like a trie, or the one you mention, seems like the way to go.
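
To make the trie suggestion concrete, here is a minimal sketch of a plain character trie (one character per edge, not the compressed Patricia/radix form from the question). All names (createNode, insertWord, wordsWithPrefix) are illustrative assumptions. Walking down to the prefix node takes a number of steps proportional to the prefix length, regardless of how many strings are stored; collecting the matches then costs time proportional to the size of the subtree below that node.

```javascript
// A plain character trie, not a compressed radix trie; all names are illustrative.
function createNode() {
    return { children: new Map(), isWord: false };
}

function insertWord(root, word) {
    var node = root;
    for (var ch of word) {
        if (!node.children.has(ch)) node.children.set(ch, createNode());
        node = node.children.get(ch);
    }
    node.isWord = true; // mark the end of a stored word
}

// Return every stored word that starts with the given prefix.
function wordsWithPrefix(root, prefix) {
    var node = root;
    for (var ch of prefix) {
        node = node.children.get(ch);
        if (!node) return []; // no stored word has this prefix
    }
    var results = [];
    // depth-first walk of the subtree below the prefix node
    (function collect(current, path) {
        if (current.isWord) results.push(path);
        current.children.forEach(function (child, key) {
            collect(child, path + key);
        });
    })(node, prefix);
    return results;
}

var root = createNode();
["random", "rapid", "romance", "romania", "rome", "rose"].forEach(function (w) {
    insertWord(root, w);
});
console.log(wordsWithPrefix(root, "rom")); // the three words starting with "rom"
```

A radix trie would merge chains of single-child nodes into one edge to save memory, but the look-up idea is the same: no per-string prefix comparison over the whole data set is needed.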
