Get the most repeated similar fields in MySQL database

Views: 407 | Published: 2020/8/11 7:03:06 | Tags: php mysql sql levenshtein-distance


Problem description

Let's assume we have a database like this:

Actions_tbl:

-------------------------------------------------------
id | Action_name                              | user_id
-------------------------------------------------------
1  | John reads one book                      | 1
2  | reading the book by john                 | 1
3  | Joe is jumping over fire                 | 2
4  | reading another book                     | 2
5  | John reads the book in library           | 1
6  | Joe read a book                          | 2
7  | read a book                              | 3
8  | jumping with no reason is Ronald's habit | 3
-------------------------------------------------------

Users_tbl:


-----------------------
user_id |    user_name |
-----------------------
1       |     John
2       |     Joe
3       |     Ronald
4       |     Araz
-----------------------

I'm wondering whether I can pick the most repeated similar action, regardless of which user it belongs to, and replace the user name in it with my own user_name.

"reads one book", "reading the book", "reading another book", "reads the book in library", "read a book" and "read a book" are the ones that have the most WORDS in common, so the action related to reading a book is repeated 6 times. My system should show one of those six sentences at random, with the user name in it replaced by Araz.

Like: Araz reads the book
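The name-substitution step on its own could be sketched as follows. This is only an illustration in Python (the question itself is limited to PHP and MySQL, so it would need porting). It assumes the list of known names has already been loaded from Users_tbl, and the helper name `replace_user_name` is hypothetical.

```python
import re

# Assumption: these values were read from Users_tbl.user_name beforehand.
USER_NAMES = ["John", "Joe", "Ronald", "Araz"]


def replace_user_name(action, new_name, known_names=USER_NAMES):
    """Replace every known user name in `action` with `new_name`.

    Names are matched as whole words and case-insensitively, so both
    "John" and "john" are replaced.
    """
    pattern = r"\b(?:" + "|".join(re.escape(name) for name in known_names) + r")\b"
    # A function is used as the replacement so that `new_name` is inserted
    # literally, even if it contains backslashes.
    return re.sub(pattern, lambda match: new_name, action, flags=re.IGNORECASE)


if __name__ == "__main__":
    print(replace_user_name("John reads the book in library", "Araz"))
    print(replace_user_name("reading the book by john", "Araz"))
```

Sentences that contain no user name, such as "read a book", are returned unchanged, so placing the name in them would need a separate rule.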

My idea was to run

select replace(a.action_name , b.user_name) from actions_tbl a, user_tble b where a.user_id = b.user_id group_by

and then use

levenshtein()

but this performs very poorly!

Assume that I want to do the same thing on a big database and across a few different tables. This will destroy my server!

Is there a better idea?

At http://www.artfulsoftware.com/infotree/queries.php#552 the levenshtein() function is implemented as a MySQL function. But first, do you think it performs well enough? And second, how would I use it in my case? Maybe a self-join can fix this issue, but I'm not that good with SQL!
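To make the performance concern concrete, here is a minimal sketch of the textbook dynamic-programming Levenshtein distance, written in Python purely for illustration. It is not the implementation from the linked page. Each call fills roughly len(a) × len(b) cells, and comparing every pair among N rows needs N(N-1)/2 such calls, which is why long paragraphs in a large table become expensive.

```python
def levenshtein(a, b):
    """Edit distance between strings a and b (insert, delete, substitute)."""
    # Keep the shorter string in the inner loop so the row buffers stay small.
    if len(a) < len(b):
        a, b = b, a
    # previous[j] is the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,                      # delete char_a
                current[j - 1] + 1,                   # insert char_b
                previous[j - 1] + substitution_cost,  # substitute or match
            ))
        previous = current
    return previous[-1]


if __name__ == "__main__":
    print(levenshtein("read a book", "Joe read a book"))
```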

* Similar actions are actions that have more than X% of their words in common.
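One possible reading of "more than X% common words" is the share of distinct words two actions have in common (the Jaccard index of their word sets). The sketch below assumes that reading. It is written in Python only for illustration, and the function names are hypothetical. It still compares every pair of rows, so on its own it does not remove the performance concern. It also does no stemming, so "read", "reads" and "reading" count as different words.

```python
import re


def words(text):
    """Lower-cased set of words in text; whitespace and punctuation are ignored."""
    return set(re.findall(r"[a-z']+", text.lower()))


def similarity(a, b):
    """Shared distinct words divided by all distinct words (Jaccard index)."""
    words_a, words_b = words(a), words(b)
    if not words_a or not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)


def largest_similar_group(actions, threshold):
    """Return the largest group of actions whose similarity to one common
    'center' action is strictly greater than threshold (a value in 0..1)."""
    best = []
    for center in actions:
        group = [other for other in actions if similarity(center, other) > threshold]
        if len(group) > len(best):
            best = group
    return best


if __name__ == "__main__":
    sample = ["read a book", "Joe read a book", "read the book",
              "Joe is jumping over fire"]
    print(largest_similar_group(sample, 0.45))
```

A sentence to display could then be chosen from the returned group with `random.choice`.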

**More information and notes:**

I'm limited to PHP and MySQL.

This is just an example; in my real project the actions are long paragraphs, which is why performance matters. The real scenario is: a user has entered project descriptions for several projects, and those descriptions may be very similar (users tend to work in the same area). I want to fill in the description of the next project automatically, based on the previous entries, to save time.

I would appreciate any pragmatic solution. I looked at NLP-related solutions, and although they are interesting, I don't think many of them would be accurate enough or could be implemented in PHP.

The output should make sense and be a proper paragraph, like those of all the other projects. That's why I was thinking of choosing from the previous ones.

Thanks for your wise answers; I would appreciate it if you could shed some light on this.
