PHP: Find repeated words with and without space in text


Problem description




I can find repeated words in text with this function:

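$str = 'bob is a good person. mary is a good person. who is the best? are you a good person? bob is the best?';
    function repeated($str)
    {
        $str = trim($str);
        // ereg_replace() was removed in PHP 7; preg_replace() does the same job here.
        $str = preg_replace('/[[:space:]]+/', ' ', $str);
        $words = explode(' ', $str);
        $wordstats = array();
        foreach ($words as $w)
        {
            // Initialise the counter on first sight to avoid an undefined-index notice.
            $wordstats[$w] = isset($wordstats[$w]) ? $wordstats[$w] + 1 : 1;
        }
        foreach ($wordstats as $k => $v)
        {
            if ($v >= 2)
            {
                print "$k" . " , ";
            }
        }
    }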

That gives me a result like:

bob , good , person , is , a , the , best?

Q: How can I get both the repeated single words and the repeated multi-word phrases (words separated by spaces), so the result looks like this:

bob , good , person , is , a , the , best? , good person , is a , a good , is the , bob is

Solution

<?php
$str = 'bob is a good person. mary is a good person. who is the best? are you a good person? bob is the best?';

// All words, lower-cased:
$found = str_word_count(strtolower($str),1);
// Get all words that occur more than once:
$counts = array_count_values($found);
$repeated = array_keys(array_filter($counts,function($a){return $a > 1;}));
// Begin the results with the groups of 1 word.
$results = $repeated;
// Walk the word list; each repeated word is a candidate start of a phrase.
while($word = array_shift($found)){
    if(!in_array($word,$repeated)) continue;
    $additions = array();
    // Extend the phrase one word at a time while it still occurs more than once.
    while($add = array_shift($found)){
        if(!in_array($add,$repeated)) break;
        $additions[] = $add;
        $count = preg_match_all('/'.preg_quote($word).'\W+'.implode('\W+',$additions).'/si',$str,$matches);
        if($count > 1){
            $newmatch = $word.' '.implode(' ',$additions);
            if(!in_array($newmatch,$results)) $results[] = $newmatch;
        } else {
            break;
        }
    }
    // Put the consumed words back so each can start a phrase of its own.
    if(!empty($additions)) array_splice($found,0,0,$additions);
}
var_dump($results);

Yields:

array(17) {
  [0]=>
  string(3) "bob"
  [1]=>
  string(2) "is"
  [2]=>
  string(1) "a"
  [3]=>
  string(4) "good"
  [4]=>
  string(6) "person"
  [5]=>
  string(3) "the"
  [6]=>
  string(4) "best"
  [7]=>
  string(6) "bob is"
  [8]=>
  string(4) "is a"
  [9]=>
  string(9) "is a good"
  [10]=>
  string(16) "is a good person"
  [11]=>
  string(6) "a good"
  [12]=>
  string(13) "a good person"
  [13]=>
  string(11) "good person"
  [14]=>
  string(6) "is the"
  [15]=>
  string(11) "is the best"
  [16]=>
  string(8) "the best"
}
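
The same kind of result can also be reached without building regular expressions, by counting every run of consecutive words directly. The following is only a sketch of that alternative, not the method used in the answer above: `repeated_phrases()` and its `$maxLen` parameter are hypothetical names introduced for illustration, and it assumes PHP 7 or later and that a phrase should never cross a sentence boundary (`.`, `?` or `!`).

```php
<?php
// Hypothetical helper (not part of the answer above): count every run of
// 1..$maxLen consecutive words and keep the runs seen more than once.
function repeated_phrases(string $text, int $maxLen = 4): array
{
    $counts = array();
    // Assumption: split into sentences so a phrase never spans '.', '?' or '!'.
    $sentences = preg_split('/[.?!]+/', strtolower($text), -1, PREG_SPLIT_NO_EMPTY);
    foreach ($sentences as $sentence) {
        $words = str_word_count($sentence, 1);
        $n = count($words);
        for ($start = 0; $start < $n; $start++) {
            for ($len = 1; $len <= $maxLen && $start + $len <= $n; $len++) {
                // Build the phrase of $len words beginning at $start.
                $phrase = implode(' ', array_slice($words, $start, $len));
                $counts[$phrase] = ($counts[$phrase] ?? 0) + 1;
            }
        }
    }
    // Keep only the phrases that occur more than once.
    return array_keys(array_filter($counts, function ($c) { return $c > 1; }));
}

$str = 'bob is a good person. mary is a good person. who is the best? are you a good person? bob is the best?';
print implode(' , ', repeated_phrases($str));
```

For the sample text this should return the same 17 words and phrases as the answer above, though in a different order, since entries appear in the order they are first counted. Because whole words are compared rather than regex matches, a phrase such as "is a" cannot accidentally match inside a longer word like "this a".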
