Counting word occurrences in a table column


Question

I have a table with a varchar(255) field. I want to get (via a query, function, or SP) the number of occurrences of each word in a group of rows from this table.

If there are 2 rows with these fields:


"I like to eat bananas"
"I don't like to eat like a monkey"

I would like to get:


    word | count()
    ---------------
    like | 3
    eat  | 2
    to   | 2
    i    | 2
    a    | 1

Any ideas? I am using MySQL 5.2.

Recommended answer

@Elad Meidar, I like your question and I found a solution:

-- Outer query: merge the counts of tokens that differ only by '?', '.' or '!'
SELECT SUM(total_count) AS total, value
FROM (
    -- Count each raw token, then strip '?', '.' and '!' from it
    SELECT COUNT(*) AS total_count,
           REPLACE(REPLACE(REPLACE(x.value, '?', ''), '.', ''), '!', '') AS value
    FROM (
        -- Extract the n-th space-separated word of every sentence
        SELECT SUBSTRING_INDEX(SUBSTRING_INDEX(t.sentence, ' ', n.n), ' ', -1) AS value
        FROM table_name t
        CROSS JOIN (
            -- Numbers 1..100, built from two digit tables (units and tens)
            SELECT a.N + b.N * 10 + 1 AS n
            FROM (SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) a,
                 (SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) b
        ) n
        -- Keep only n up to the word count of the sentence (number of spaces + 1)
        WHERE n.n <= 1 + (LENGTH(t.sentence) - LENGTH(REPLACE(t.sentence, ' ', '')))
    ) AS x
    GROUP BY x.value
) AS y
GROUP BY value

Here is the full working fiddle: http://sqlfiddle.com/#!2/17481a/1
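The core of the query above is the nested SUBSTRING_INDEX call, which picks the n-th space-separated word out of a sentence. As a rough illustration only (this is not part of the original answer), the Python sketch below imitates that behaviour. The helper names `substring_index`, `word_count` and `nth_word` are hypothetical, and `substring_index` is a simplified stand-in for the MySQL function that assumes a non-empty delimiter.

```python
def substring_index(s: str, delim: str, count: int) -> str:
    """Simplified stand-in for MySQL's SUBSTRING_INDEX (non-empty delim assumed)."""
    if count == 0:
        return ""
    parts = s.split(delim)
    if count > 0:
        # Everything to the left of the count-th delimiter from the left
        return delim.join(parts[:count])
    # Everything to the right of the |count|-th delimiter from the right
    return delim.join(parts[count:])


def word_count(s: str) -> int:
    # Same idea as 1 + LENGTH(s) - LENGTH(REPLACE(s, ' ', '')): spaces + 1
    return 1 + len(s) - len(s.replace(" ", ""))


def nth_word(s: str, n: int) -> str:
    # Inner call keeps the first n words, outer call keeps the last of those
    return substring_index(substring_index(s, " ", n), " ", -1)


sentence = "I like to eat bananas"
words = [nth_word(sentence, n) for n in range(1, word_count(sentence) + 1)]
print(words)
```

In the SQL version, the role of `range(1, word_count(sentence) + 1)` is played by the CROSS JOIN with the numbers subquery together with the WHERE condition on n.n.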

First, we run a query that extracts every word, as explained by @peterm (follow his instructions if you want to customize the total number of words processed; as written, the numbers subquery generates 1 to 100, so at most 100 words per row are handled). We then turn that into a subquery and COUNT and GROUP BY the value of each word. Finally, we put another query on top that strips punctuation with REPLACE and groups again, so that words which differ only by an attached sign are counted together, e.g. hello = hello!
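To sanity-check what the query should return for the two sample rows from the question, the same steps (split on spaces, strip '?', '.' and '!', then count) can be sketched in Python. This is only a cross-check under stated assumptions, not the answer's method: `count_words` is a hypothetical helper, and the lowercasing step assumes the column uses a case-insensitive collation, so that "I" and "i" fall into the same group.

```python
from collections import Counter


def count_words(sentences):
    """Count space-separated words after removing '?', '.' and '!'."""
    counts = Counter()
    for sentence in sentences:
        for token in sentence.split(" "):
            # Same punctuation stripping as the nested REPLACE calls
            word = token.replace("?", "").replace(".", "").replace("!", "")
            # Assumption: mimic a case-insensitive collation
            counts[word.lower()] += 1
    return counts


rows = [
    "I like to eat bananas",
    "I don't like to eat like a monkey",
]
counts = count_words(rows)
for word, total in counts.most_common():
    print(word, total)
```

Note that words occurring once (bananas, don't, monkey) are counted as well, even though the expected output in the question lists only some of them.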
