Finding Sub-Strings of a String Containing All the Words in an Array


Question

I have a string and an array of words, and I need to write code that finds all substrings of the string that contain every word in the array, in any order. The string contains no special characters or digits, and each word is separated by a space.

For example:

Given string:

aaaa aaaa aaaa aaaa cccc bbbb bbbb bbbb bbbb aaaa bbbb cccc

Words in the array:

aaaa
bbbb
cccc

Example output:

aaaa aaaa aaaa aaaa cccc bbbb bbbb bbbb bbbb    

aaaa aaaa aaaa aaaa cccc bbbb    

aaaa cccc bbbb bbbb bbbb bbbb    

cccc bbbb bbbb bbbb bbbb aaaa  

aaaa cccc bbbb

I have implemented this using for loops, but it is very inefficient.

How can I do this more efficiently?

My code:

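// str_arr, words, t, h, num_words, has_word(), all_complete() and reset_all()
// are defined elsewhere in the program and are not shown here.
for (int i = 0; i < str_arr.length; i++)
{
    // Only try a start index if enough words remain to hold every array word.
    if ((str_arr.length - i) >= words.length)
    {
        String res = check(i);
        if (!res.equals(""))
        {
            System.out.println(res);
            System.out.println("");
        }
        reset_all();
    }
    else
    {
        break;
    }
}

// Returns the shortest substring starting at word i that contains every
// array word, or "" if no such substring exists.
public static String check(int i)
{
    String res = "";
    num_words = 0;

    for (int j = i; j < str_arr.length; j++)
    {
        if (has_word(str_arr[j]))
        {
            // Mark this array word as seen.
            t.put(str_arr[j].toLowerCase(), 1);
            h.put(str_arr[j].toLowerCase(), 1);

            res = res + str_arr[j];

            if (all_complete())
            {
                return res;
            }

            res = res + " ";
        }
        else
        {
            res = res + str_arr[j] + " ";
        }
    }
    return "";
}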

Recommended answer

My first approach would be something like the following pseudo-code:

  for word : string {
    if word in array {
      for each stored potential substring {
        if word wasn't already found {
          remove word from notAlreadyFoundList
          if notAlreadyFoundList is empty {
            use starting pos and ending pos to save our substring
          }
        }
      }
      store position and array-word as potential substring
    }
  }

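This should perform reasonably well, since the string is traversed only once. Each word that appears in the array is still checked against every stored potential substring, so the total work also grows with the number of candidates held at that point.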

Here is an implementation of my pseudo-code; try it out and see whether it performs better or worse. It assumes a matching substring is complete as soon as its last missing word is found. If you truly want all matches, change the lines marked //ALLMATCHES:

import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

class SubStringFinder {
    String textString = "aaaa aaaa aaaa aaaa cccc bbbb bbbb bbbb bbbb aaaa bbbb cccc";
    Set<String> words = new HashSet<String>(Arrays.asList("aaaa", "bbbb", "cccc"));

    public static void main(String[] args) {
        new SubStringFinder();
    }

    public SubStringFinder() {
        // Candidate substrings collected so far.
        List<PotentialMatch> matches = new ArrayList<PotentialMatch>();
        for (String textPart : textString.split(" ")) {
            // Append the current word to every candidate, whether or not it is in
            // the array, so each candidate stays a contiguous substring of the text.
            for (Iterator<PotentialMatch> matchIterator = matches.iterator(); matchIterator.hasNext();) {
                PotentialMatch match = matchIterator.next();
                String result = match.tryMatch(textPart);
                if (result != null) {
                    System.out.println("Match found: \"" + result + "\"");
                    matchIterator.remove(); //ALLMATCHES - remove this line
                }
            }
            if (words.contains(textPart)) {
                // Start a new candidate at this word; it still needs all the other words.
                Set<String> unfound = new HashSet<String>(words);
                unfound.remove(textPart);
                matches.add(new PotentialMatch(unfound, textPart));
            }// ALLMATCHES add these lines
             // else {
             // matches.add(new PotentialMatch(new HashSet<String>(words), textPart));
             // }
        }
    }

    class PotentialMatch {
        Set<String> unfoundWords;   // array words this candidate has not seen yet
        StringBuilder stringPart;   // text of the candidate so far

        public PotentialMatch(Set<String> unfoundWords, String part) {
            this.unfoundWords = unfoundWords;
            this.stringPart = new StringBuilder(part);
        }

        // Extends the candidate by one word; returns its text once every array
        // word has been seen, otherwise null.
        public String tryMatch(String part) {
            this.stringPart.append(' ').append(part);
            unfoundWords.remove(part);
            if (unfoundWords.isEmpty()) {
                return this.stringPart.toString();
            }
            return null;
        }
    }
}
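
For comparison, here is a minimal sketch of a different technique, a two-pointer sliding window. It is not part of the answer above. It assumes the goal is the one the question's `check(i)` loop pursues: for each starting word, report the shortest substring beginning there that contains every array word. The class name `ShortestWindows` and the method `shortestFromEachStart` are hypothetical names chosen for illustration, and matching is case-sensitive, unlike the `toLowerCase()` calls in the question's code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class ShortestWindows {
    // Hypothetical helper: for each start index, returns the shortest run of
    // words beginning there that contains every entry of `required`.
    static List<String> shortestFromEachStart(String text, String[] required) {
        List<String> results = new ArrayList<>();
        Set<String> needed = new HashSet<>(Arrays.asList(required));
        if (needed.isEmpty()) {
            return results; // nothing to look for
        }
        String[] tokens = text.split(" ");
        Map<String, Integer> counts = new HashMap<>(); // required words inside the window
        int covered = 0; // distinct required words currently inside the window
        int end = 0;     // the window is tokens[start .. end - 1]

        for (int start = 0; start < tokens.length; start++) {
            // Grow the window until it covers every required word or the text ends.
            while (covered < needed.size() && end < tokens.length) {
                String entering = tokens[end++];
                if (needed.contains(entering) && counts.merge(entering, 1, Integer::sum) == 1) {
                    covered++;
                }
            }
            if (covered < needed.size()) {
                break; // the text ran out, so no later start can succeed either
            }
            results.add(String.join(" ", Arrays.copyOfRange(tokens, start, end)));

            // Drop tokens[start] from the window before trying the next start.
            String leaving = tokens[start];
            if (needed.contains(leaving) && counts.merge(leaving, -1, Integer::sum) == 0) {
                covered--;
            }
        }
        return results;
    }

    public static void main(String[] args) {
        String text = "aaaa aaaa aaaa aaaa cccc bbbb bbbb bbbb bbbb aaaa bbbb cccc";
        for (String s : shortestFromEachStart(text, new String[] {"aaaa", "bbbb", "cccc"})) {
            System.out.println(s);
        }
    }
}
```

Because the window's end never moves backwards, each word enters and leaves the window at most once; the remaining cost is building the output strings themselves.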
