MySQL SELECT DISTINCT rows (not columns) to filter $_POST for duplicates

Problem description

I'm trying to filter rows from the MySQL table where all the $_POST data from an online form is stored. Sometimes the user's internet connection stalls or the browser screws up, and the new page after form submission is not displayed (though the INSERT worked and the table row was created). They then hit refresh and end up submitting the form twice, creating a duplicate row (identical except for the timestamp and auto-increment id columns).

I'd like to select unique form submissions. This has to be a really common task, but I can't seem to find a succinct way to apply DISTINCT to every column except the timestamp and id (sort of like SELECT id, timestamp, DISTINCT everything_else FROM table;). At the moment, I can do:

-- Pseudocode: "everything, except, id, and, timestamp" stands for every real
-- column other than the id and timestamp columns.
CREATE TEMPORARY TABLE IF NOT EXISTS temp1 AS (
  SELECT DISTINCT everything, except, id, and, timestamp
  FROM table1
);
-- Join the de-duplicated rows back to table1 to recover their id and timestamp.
SELECT * FROM table1 LEFT OUTER JOIN temp1
  ON table1.everything = temp1.everything
  ...
;
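
For a rough tally of how many submissions are duplicated, a grouped count over the same columns works; col_a, col_b and col_c below are hypothetical placeholders for the real form columns, not names from the actual table:

-- Count submissions that appear more than once, ignoring id and timestamp.
SELECT col_a, col_b, col_c, COUNT(*) AS copies
FROM table1
GROUP BY col_a, col_b, col_c
HAVING COUNT(*) > 1;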

My table has 20k rows and about 25 columns (classification features for a machine learning exercise). This query takes forever (I presume because it traverses the 20k rows 20k times?); I've never even let it run to completion. What's the standard-practice way to do this?

Note: This question suggests adding an index to the relevant columns, but an index can have at most 16 key parts. Should I just choose the columns most likely to be unique? I can find about 700 duplicates in 2 seconds this way, but I can't be sure I'm not throwing away a unique row, because I also have to leave some columns out when specifying the index.
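
If indexing a subset is the route, the statement would look roughly like this (hypothetical column names again; MySQL allows at most 16 key parts per index, and long string columns may need a prefix length):

-- Hypothetical: index the columns most likely to distinguish submissions.
-- col_c is assumed to be a string column, indexed by its first 100 characters.
ALTER TABLE table1
  ADD INDEX idx_submission (col_a, col_b, col_c(100));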

Recommended answer

If you have a UNIQUE key (other than an AUTO_INCREMENT), simply use INSERT IGNORE ... to silently avoid duplicate rows. If you don't have a UNIQUE key, do you never need to find a row again?
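
A minimal sketch of that approach, assuming col_a, col_b and col_c are the form columns that together identify a submission (the real column names will differ):

-- Declare the identifying combination of form columns UNIQUE
-- (an index can have at most 16 key parts, so choose accordingly).
ALTER TABLE table1
  ADD UNIQUE KEY uniq_submission (col_a, col_b, col_c);

-- A re-submitted form is now silently skipped instead of creating a duplicate row.
INSERT IGNORE INTO table1 (col_a, col_b, col_c)
VALUES ('answer one', 'answer two', 'answer three');

Note that INSERT IGNORE also downgrades other errors (such as data truncation) to warnings, so check that this is acceptable for the rest of the insert.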

If you have already allowed duplicates and you need to get rid of them, that is a different question.
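
For completeness, one common clean-up pattern for duplicates that already exist, sketched with the same hypothetical column names, is a self-join that deletes every row with a higher id than an otherwise identical row (back up the table, or run the equivalent SELECT, before trying it):

-- Keep the earliest copy (lowest id) of each duplicated submission.
-- The null-safe comparison <=> treats two NULLs as equal.
DELETE t2
FROM table1 AS t1
JOIN table1 AS t2
  ON  t2.id > t1.id
  AND t1.col_a <=> t2.col_a
  AND t1.col_b <=> t2.col_b
  AND t1.col_c <=> t2.col_c;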
