Get Popular Words in PHP + MySQL


Question

How do I go about getting the most popular words from multiple content tables in PHP/MySQL?

For example, I have a table forum_post containing forum posts; each post has a subject and content. Besides this, I have multiple other tables with different fields, which could also contain content to be analysed.

I would probably fetch all the content myself, strip any HTML, explode the string on spaces, remove quotes, commas and so on, and then count the words that are not common by building up an array while running through all the words.
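The manual approach described above could look roughly like the following. This is only a sketch: the function name `countUncommonWords` and the short `$commonWords` list are made up for illustration, and a real list of common words would be much longer.

```php
<?php
// Hypothetical helper: counts the words in one piece of content, skipping common ones.
function countUncommonWords(string $content, array $commonWords): array
{
    // Strip (possible) HTML and lowercase so "Cat" and "cat" are counted together
    $plain = strtolower(strip_tags($content));

    $counts = [];
    // Split on whitespace; explode(' ', ...) would miss newlines and tabs
    foreach (preg_split('/\s+/', $plain, -1, PREG_SPLIT_NO_EMPTY) as $piece) {
        // Remove quotes, commas and similar punctuation from both ends
        $word = trim($piece, "\"'.,;:!?()");
        if ($word === '' || in_array($word, $commonWords, true)) {
            continue;
        }
        $counts[$word] = ($counts[$word] ?? 0) + 1;
    }
    return $counts;
}

// Illustrative stop list, not a complete one
$commonWords = ['the', 'a', 'and', 'is', 'of'];
$counts = countUncommonWords('<p>The cat and the "cat", a dog.</p>', $commonWords);
print_r($counts);
```

For the sample string this leaves `cat` with a count of 2 and `dog` with a count of 1.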

My main question is whether someone knows of a method that might be easier or faster.

I couldn't find any helpful answers about this; I may have been using the wrong search terms.

Recommended answer

Someone has already done this.

The magic you're looking for is a PHP function named str_word_count().
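For reference, `str_word_count()` takes the string and an optional format argument: with the default format `0` it returns the number of words, and with format `1` it returns an array of the words themselves. A small illustration with a made-up sample string:

```php
<?php
$sample = "Hello World again, hello world";

// Format 0 (default): the number of words found
$total = str_word_count($sample);

// Format 1: an array of the words found (the comma is not part of any word)
$words = str_word_count($sample, 1);

// array_count_values() tallies how often each value occurs (case-sensitive)
$tally = array_count_values($words);

echo $total, "\n";
print_r($tally);
```

Because the tally is case-sensitive, `Hello` and `hello` are counted separately here; lowercasing the text first merges them.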

In my example code below, if you get a lot of extraneous words, you'll need to write custom stripping to remove them. You'll also want to strip all HTML tags and other unwanted characters from the text.
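One possible shape for that custom stripping is sketched below. The helper name `extractWords`, the stop list, and the minimum-length cutoff are assumptions for illustration, not part of the original answer.

```php
<?php
// Hypothetical helper: returns the lowercase words of $html that survive filtering.
function extractWords(string $html, array $stopWords, int $minLength = 3): array
{
    // Remove HTML tags, then turn entities such as &amp; back into characters
    $plain = html_entity_decode(strip_tags($html), ENT_QUOTES, 'UTF-8');

    // Lowercase so "Word" and "word" are counted together
    $words = str_word_count(strtolower($plain), 1);

    // Drop extraneous words: anything in the stop list or shorter than $minLength
    $kept = array_filter($words, function ($word) use ($stopWords, $minLength) {
        return strlen($word) >= $minLength && !in_array($word, $stopWords, true);
    });

    // Reindex, since array_filter() preserves the original keys
    return array_values($kept);
}

$words = extractWords('<b>PHP</b> &amp; MySQL are the tools; php is fun', ['the', 'are']);
print_r($words);
```

Note that `strip_tags()` removes tags without inserting whitespace, so text from adjacent block elements can run together unless the tags are replaced with spaces first.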

I use something similar to this for keyword generation (obviously that code is proprietary). In short, we take the provided text, count how often each word occurs, and sort the words in an array by frequency, so the most frequent words come first in the output. Words that occur only once are not counted.

<?php
$text = "your text.";

// Set up the array for storing word counts
$freqData = array();
// str_word_count() with format 1 returns an array of the words found in $text
foreach (str_word_count($text, 1) as $word) {
    // Increment the count for a word already seen, otherwise start it at one
    if (array_key_exists($word, $freqData)) {
        $freqData[$word]++;
    } else {
        $freqData[$word] = 1;
    }
}

$list = '';
// Sort by count, highest first, keeping the words as keys
arsort($freqData);
foreach ($freqData as $word => $count) {
    // Skip words that occur only once
    if ($count > 1) {
        $list .= "$word ";
    }
}
if (empty($list)) {
    $list = "Not enough duplicate words for popularity contest.";
}
echo $list;
?>
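
Since the question involves several tables, the same counting can be applied to text gathered from all of them. The sketch below assumes the rows have already been fetched (for example with PDO) into arrays; the function name `popularWords`, the sample rows, and the `body` field are hypothetical.

```php
<?php
// Hypothetical helper: returns the $limit most frequent words that occur more than once.
function popularWords(array $texts, int $limit = 10): array
{
    // Join all pieces of text with spaces so words do not run together
    $words = str_word_count(strtolower(implode(' ', $texts)), 1);

    // Count occurrences, then sort by count, highest first
    $counts = array_count_values($words);
    arsort($counts);

    // Keep only words that occur more than once
    $counts = array_filter($counts, function ($count) {
        return $count > 1;
    });

    return array_slice($counts, 0, $limit, true);
}

// Sample rows standing in for results from different tables
$forumPosts = [['subject' => 'PHP arrays', 'content' => 'Sorting arrays in PHP']];
$otherRows  = [['body' => 'MySQL and PHP']];

$texts = [];
foreach ($forumPosts as $row) {
    $texts[] = $row['subject'];
    $texts[] = $row['content'];
}
foreach ($otherRows as $row) {
    $texts[] = $row['body'];
}

print_r(popularWords($texts));
```

Collecting plain strings first keeps the counting step independent of how many tables or fields feed into it.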
