Finding matching phrases between two pieces of text?

Problem description

My objective is to find similar phrases from two pieces of text.

I know that common words will be a problem, for example "and", "the", "we", and "are". In that case, I think a filter will be necessary.

I want to know whether this is a good approach. It uses recursion: if it finds a match, it checks whether the next word also matches, and continues until there is no match.

  1. the cat is on the roof
  2. a man is on the stage

  A1 = [the, cat, is, on, the, roof]
  A2 = [a, man, is, on, the, stage]

  [the]: no match
  [cat]: no match
  [is]: match
  [is, on]: match
  [is, on, the]: match
  [is, on, the, roof]: no match
  [on]: match
  [on, the]: match
  [on, the, roof]: no match
  [the]: match
  [the, roof]: no match
  [roof]: no match
  -end-
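
The extend-while-matching idea in the trace above could be sketched roughly as follows. This is only an illustration under stated assumptions: the function name `common_phrases`, the `stop_words` parameter, and the example stop-word set are hypothetical, a loop is used in place of recursion, and each word is compared against every position in the second text (so, unlike the trace, the leading "the" also counts as a match before filtering).

```python
def common_phrases(a, b, stop_words=frozenset()):
    """Return maximal contiguous word runs that occur in both token lists.

    Runs made up entirely of stop words are dropped (the "filter" idea).
    """
    phrases = []
    for i in range(len(a)):
        for j in range(len(b)):
            if a[i] != b[j]:
                continue
            # Skip positions that merely continue a run already reported.
            if i > 0 and j > 0 and a[i - 1] == b[j - 1]:
                continue
            # Extend the match word by word until the texts diverge.
            k = 0
            while i + k < len(a) and j + k < len(b) and a[i + k] == b[j + k]:
                k += 1
            phrase = a[i:i + k]
            if not all(word in stop_words for word in phrase):
                phrases.append(phrase)
    return phrases


a1 = "the cat is on the roof".split()
a2 = "a man is on the stage".split()

# Hypothetical stop-word list, chosen only for this example.
common = frozenset({"a", "and", "the", "we", "are"})
print(common_phrases(a1, a2, stop_words=common))
```

Reporting only maximal runs avoids listing "is", "is on", and "is on the" separately; whether that is desirable depends on what the matches are used for.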

Recommended answer

A quick search on Google showed me this website containing the solution to your problem:

It works by finding the longest sequence of words common to both strings, and recursively finding the longest sequences of the remainders of the string until the substrings have no words in common. At this point it adds the remaining new words as an insertion and the remaining old words as a deletion.
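
A minimal sketch of the procedure described in that quote, assuming both texts are already split into word lists. The names `longest_common_run` and `diff` and the `"="`, `"-"`, `"+"` tags are hypothetical choices for this illustration, not taken from the linked site.

```python
def longest_common_run(old, new):
    """Return (start_old, start_new, length) of the longest contiguous
    word sequence shared by both lists (length 0 if none)."""
    best = (0, 0, 0)
    # prev[j] holds the length of the common run ending at old[i-2], new[j-1].
    prev = [0] * (len(new) + 1)
    for i in range(1, len(old) + 1):
        cur = [0] * (len(new) + 1)
        for j in range(1, len(new) + 1):
            if old[i - 1] == new[j - 1]:
                cur[j] = prev[j - 1] + 1
                if cur[j] > best[2]:
                    best = (i - cur[j], j - cur[j], cur[j])
        prev = cur
    return best


def diff(old, new):
    """Recursively split around the longest common run.

    Returns a list of (tag, words): "=" common, "-" deleted, "+" inserted.
    """
    start_old, start_new, length = longest_common_run(old, new)
    if length == 0:
        # No words in common: leftover old words are a deletion,
        # leftover new words are an insertion.
        ops = []
        if old:
            ops.append(("-", old))
        if new:
            ops.append(("+", new))
        return ops
    return (
        diff(old[:start_old], new[:start_new])
        + [("=", old[start_old:start_old + length])]
        + diff(old[start_old + length:], new[start_new + length:])
    )


old_words = "the cat is on the roof".split()
new_words = "a man is on the stage".split()
for tag, words in diff(old_words, new_words):
    print(tag, " ".join(words))
```

The entries tagged `"="` are the matching phrases; the other entries can be ignored if only the shared phrases matter.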
