MySQL: Finding the most frequently used words in a comma-delimited field


Question

I have a keywords field for each of my records called "RES_Tags". The table is "Resources".

The "RES_Tags" field contains a comma-delimited list of keywords for that record.

Example:

labor, work, unions, organized labor, strike, picket, boycott

What SQL query can I use to find out the 30 most frequently used tags?

I saw the related thread "Count popular tags with comma delimited field on MySQL", but I'm hoping that someone has found a way since that question was originally asked.

Alternatively, and this is the reason this question isn't a duplicate: if it is impossible to do what I'm asking with a SQL query and the only way is to normalize, what would be the best way to convert the existing comma-delimited lists into a Tags table and a Tags-to-Resources table?

Solution

You can actually extract individual "terms" from a comma-separated list of terms in MySQL. It's incredibly nasty, and it requires knowing the maximum number of terms that will appear in any row. The SUBSTRING_INDEX() function is the key to it.
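
To make the nested SUBSTRING_INDEX() calls less opaque, here is a small sketch on a literal string (the sample value and the column aliases are illustrative, not from the original table). The inner call keeps everything to the left of the n-th comma, and the outer call with -1 keeps what follows the last comma of that prefix. The comma appended to the end of the list is what makes a position past the end come back as an empty string.

```sql
SELECT
    -- Everything to the left of the 2nd comma: 'labor, work'
    SUBSTRING_INDEX('labor, work, unions,', ',', 2) AS first_two,
    -- The piece after the last comma of that prefix, trimmed: 'work'
    TRIM(SUBSTRING_INDEX(SUBSTRING_INDEX('labor, work, unions,', ',', 2), ',', -1)) AS second_term,
    -- Position 5 does not exist: the inner call returns the whole string,
    -- so the outer call returns what follows the trailing comma: ''
    TRIM(SUBSTRING_INDEX(SUBSTRING_INDEX('labor, work, unions,', ',', 5), ',', -1)) AS past_the_end;
```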

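Let's say you never have more than five terms in a field. Then this query gets all your terms. (It has six branches, one more than strictly needed; a position past the end of a list yields an empty string, which the WHERE clause discards.)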

SELECT term FROM(
SELECT TRIM(SUBSTRING_INDEX(SUBSTRING_INDEX(CONCAT(RES_Tags,','), ',',1), ',', -1)) term FROM Resources
UNION ALL
SELECT TRIM(SUBSTRING_INDEX(SUBSTRING_INDEX(CONCAT(RES_Tags,','), ',',2), ',', -1)) term FROM Resources
UNION ALL
SELECT TRIM(SUBSTRING_INDEX(SUBSTRING_INDEX(CONCAT(RES_Tags,','), ',',3), ',', -1)) term FROM Resources
UNION ALL
SELECT TRIM(SUBSTRING_INDEX(SUBSTRING_INDEX(CONCAT(RES_Tags,','), ',',4), ',', -1)) term FROM Resources
UNION ALL
SELECT TRIM(SUBSTRING_INDEX(SUBSTRING_INDEX(CONCAT(RES_Tags,','), ',',5), ',', -1)) term FROM Resources
UNION ALL
SELECT TRIM(SUBSTRING_INDEX(SUBSTRING_INDEX(CONCAT(RES_Tags,','), ',',6), ',', -1)) term FROM Resources
) terms
WHERE LENGTH(term) > 0

If some rows can hold more than five terms, just add more branches to the union.
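
To get from this list of terms to the 30 most frequently used tags that the question asks for, the derived table can be grouped and counted. The following is a minimal sketch rather than part of the original answer: it replaces the repeated UNION ALL branches with a small inline table of positions, the aliases r, n, and uses are illustrative, and it still assumes that no row holds more than six tags.

```sql
SELECT term, COUNT(*) AS uses
FROM (
    -- One output row per (resource, position) pair; positions past the
    -- end of a list come back as empty strings.
    SELECT TRIM(SUBSTRING_INDEX(
               SUBSTRING_INDEX(CONCAT(r.RES_Tags, ','), ',', n.n),
               ',', -1)) AS term
    FROM Resources r
    CROSS JOIN (
        SELECT 1 AS n UNION ALL SELECT 2 UNION ALL SELECT 3
        UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6
    ) n
) terms
WHERE LENGTH(term) > 0          -- drops empty and NULL terms
GROUP BY term
ORDER BY uses DESC, term
LIMIT 30;
```

Rows with more tags than there are positions silently lose the extra tags, so the position list has to cover the longest list in the table.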

Edit: Should you normalize? Yes, you should normalize. Can you use this kind of query to create a normalized version of your table? Yes. Here are some hints about how.

Figure out how many tags are in the longest record you have now. Add two. Write this sort of query to support that number. Use it as part of a CREATE TABLE tags AS SELECT... query. Don't look back.
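
One possible shape for that conversion is sketched below. Several things here are assumptions that do not come from the question: a primary key column on Resources called RES_ID (hypothetical, assumed to be an INT), the table names ResourceTerms, Tags, and ResourceTags, the column names TAG_ID and TAG_Name, and a limit of six tags per row. It also assumes the new tables end up with the same character set and collation as Resources.

```sql
-- Step 1: split the lists into one row per (resource, term).
-- ASSUMPTION: RES_ID is a hypothetical primary key of Resources.
CREATE TABLE ResourceTerms AS
SELECT RES_ID, term
FROM (
    SELECT r.RES_ID,
           TRIM(SUBSTRING_INDEX(
               SUBSTRING_INDEX(CONCAT(r.RES_Tags, ','), ',', n.n),
               ',', -1)) AS term
    FROM Resources r
    CROSS JOIN (
        SELECT 1 AS n UNION ALL SELECT 2 UNION ALL SELECT 3
        UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6
    ) n
) split
WHERE LENGTH(term) > 0;

-- Step 2: one row per distinct tag.
CREATE TABLE Tags (
    TAG_ID   INT AUTO_INCREMENT PRIMARY KEY,
    TAG_Name VARCHAR(255) NOT NULL
);
INSERT INTO Tags (TAG_Name)
SELECT DISTINCT term FROM ResourceTerms;

-- Step 3: the tags-to-resources link table.
CREATE TABLE ResourceTags (
    RES_ID INT NOT NULL,
    TAG_ID INT NOT NULL,
    PRIMARY KEY (RES_ID, TAG_ID)
);
INSERT INTO ResourceTags (RES_ID, TAG_ID)
SELECT DISTINCT rt.RES_ID, t.TAG_ID
FROM ResourceTerms rt
JOIN Tags t ON t.TAG_Name = rt.term;
```

Once the link table exists, counting tag usage is a plain GROUP BY on ResourceTags joined to Tags, with no string splitting involved.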
