如何找到在PHP中两个字符串之间的最大公共子串? [英] How can I find the Largest Common Substring between two strings in PHP?

查看:206
本文介绍了如何找到在PHP中两个字符串之间的最大公共子串?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

有一个快速算法在两个字符串的最大公共子串或者是它的一个NPComplete问题?

Is there a fast algorithm for finding the Largest Common Substring in two strings or is it an NPComplete problem?

在PHP中,我可以找到一个大海捞针:

In PHP, I can find a needle in a haystack:

<?php

if (strstr("there is a needle in a haystack", "needle")) {
    echo "found<br>\n";
}
?>

我想我能做到这一点的一个遍历字符串中的一个,但是这将是非常昂贵的!特别是因为我的这个应用程序来搜索电子邮件的数据库,并寻找垃圾邮件(由同一人发送,即类似于电子邮件)。

I guess I could do this in a loop over one of the strings but that would be very expensive! Especially since my application of this is to search a database of email and look for spam (i.e. similar emails sent by the same person).

有没有人有任何PHP code,他们可以扔在那里?

Does anyone have any PHP code they can throw out there?

推荐答案

因为我已经发现的有关维基百科文章。这不是一个NP完全问题,它可以在O(MN)时间来完成使用动态规划算法。

I have since found a relevant wikipedia article. It is not a NP complete problem, it can be done in O(mn) time using a dynamic programming algorithm.

在PHP中,我发现了 similar_text 功能非常有用。这里的一个code样品以检索一系列通过它们文本电子邮件和循环并发现那些是90%彼此相似。 注意:这样的事情是不可扩展

In PHP I found the similar_text function very useful. Here's a code sample to retrieve a series of text emails and loop through them and find ones that are 90% similar to each other. Note: Something like this is NOT scalable:

<?php
// Gather all messages by a user into two identical associative arrays
$getMsgsRes = mysql_query(SELECT * FROM email_messages WHERE from = '$someUserID');
while($msgInfo = mysql_fetch_assoc($getMsgsRes))
{
    $msgsInfo1[] = $msgInfo;
    $msgsInfo2[] = $msgInfo;
}

// Loop over msgs and compare each one to every other
foreach ($msgsInfo1 as $msg1)
    foreach ($msgsInfo2 as $msg2)
    	similar_text($msg1['msgTxt'],$msg2['msgTxt'],$similarity_pst);
    	if ($similarity_pst > 90)
    		echo "{$msg1['msgID']} is ${similarity_pst}% to {$msg2['msgID']}\n";
?>

这篇关于如何找到在PHP中两个字符串之间的最大公共子串?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆