最小的文字片段,其中k关键词 [英] Smallest text fragment with k keywords

查看:134
本文介绍了最小的文字片段,其中k关键词的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我的朋友问的这个问题采访。 给定一个文本文档(比如一个新闻文章)和一组关键词(为前谷歌搜索词),找到最小的段,其中包含这些关键字(按任何顺序)的文本文档中

My friend was asked this question in a interview. Given a text document ( say an news article ) and a set of keywords (for ex google search terms), find the smallest segment in the text document which contains these keywords (in any order).

我能想到的方法是使用含有关键字和文本关键词的位置的地图。然后,我们用这个地图选择文本片段。不过,我一直在寻找更好的方法。

The method I could think of is to use a map containing the keyword and the position of the keyword in the text. We then use this map to select the text fragment. But I was looking for better methods.

推荐答案

我可能会提出这样的事情:

I would probably have proposed something like that:

1. create a new map of <String, Integer> of size k to associate each keyword 
     with its number of occurences in the text
2. go through the text and each time you encounter a keyword, increment its 
     number of occurences in the map
3. check that each keyword has been encountered at least once, otherwise 
     return a proper statement
4. start from the beginning of the text and each time you encounter a keyword, 
     decrement its number of occurences:
    4.1. if the number reaches 0, stop iterating and set 
           int startIndex = currentIndex
    4.2. otherwise keep iterating
5. start from the end of the text and each time you encounter a keyword, 
     decrement its number of occurences:
    5.1. if the number reaches 0, stop iterating and set 
           int endIndex = currentIndex
    5.2 otherwise keep reverse iterating
6. return [startIndex, endIndex] 

整体复杂度为O(n)。

The overall complexity is O(n).

这篇关于最小的文字片段,其中k关键词的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆