Postgres: Distinct but only for one column


Question

I have a PostgreSQL table of names with more than 1 million rows, many of which are duplicates. I select three fields: id, name, and metadata.

I want to select rows randomly with ORDER BY RANDOM() and LIMIT 1000, and I do this in several steps to save memory in my PHP script.

But how can I do that so that the result contains no duplicate names?

For example, [1,"Michael Fox","2003-03-03,34,M,4545"] should be returned, but not [2,"Michael Fox","1989-02-23,M,5633"]. The name field is the most important one: it must be unique in the list every time I run the select, and the selection must be random.

I tried GROUP BY name, but then PostgreSQL expects id and metadata to appear in the GROUP BY as well or in an aggregate function, and I don't want them to be filtered in any way.

Does anyone know how to fetch several columns but apply DISTINCT to only one of them?

Recommended answer

To do a distinct on only one (or n) column(s):

select distinct on (name)
    name, col1, col2
from names



This returns an arbitrary row for each name. If you want to control which row is returned for each name, you need to add an ORDER BY:

select distinct on (name)
    name, col1, col2
from names
order by name, col1
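
To make the effect of the ORDER BY concrete, here is a minimal sketch that uses the two example rows from the question plus one hypothetical extra row (Jane Doe is invented for illustration). The data is supplied inline through a CTE, so no real table is needed. Ordering by name, id means the row with the lowest id is kept for each name.

```sql
-- Inline sample data: rows 1 and 2 come from the question,
-- row 3 is a hypothetical addition for illustration only.
WITH names (id, name, metadata) AS (
    VALUES
        (1, 'Michael Fox', '2003-03-03,34,M,4545'),
        (2, 'Michael Fox', '1989-02-23,M,5633'),
        (3, 'Jane Doe',    '1975-07-01,F,1200')
)
SELECT DISTINCT ON (name)
    id, name, metadata
FROM names
-- name must come first to match DISTINCT ON (name);
-- id then decides which duplicate is kept (the lowest id).
ORDER BY name, id;
```

With this data the query keeps row 3 for Jane Doe and row 1 for Michael Fox; row 2 is dropped because it sorts after row 1 within the same name.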

From the PostgreSQL documentation on the DISTINCT clause:

SELECT DISTINCT ON ( expression [, ...] ) keeps only the first row of each set of rows where the given expressions evaluate to equal. The DISTINCT ON expressions are interpreted using the same rules as for ORDER BY (see above). Note that the "first row" of each set is unpredictable unless ORDER BY is used to ensure that the desired row appears first.

The DISTINCT ON expression(s) must match the leftmost ORDER BY expression(s). The ORDER BY clause will normally contain additional expression(s) that determine the desired precedence of rows within each DISTINCT ON group.
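
Because the DISTINCT ON expression has to lead the ORDER BY, the outer result cannot be shuffled in the same query level. One possible way to get the random 1000 unique names the question asks for is to deduplicate in a subquery and randomize outside it. This is a sketch, not part of the original answer: it assumes the table is called names (as in the answer) with the columns id, name, and metadata (as in the question), and the alias one_per_name is made up for illustration.

```sql
SELECT id, name, metadata
FROM (
    -- Keep one row per name; random() as the tie-breaker
    -- picks an arbitrary duplicate on each run.
    SELECT DISTINCT ON (name)
        id, name, metadata
    FROM names
    ORDER BY name, random()
) AS one_per_name
-- Shuffle the deduplicated rows and take 1000 of them.
ORDER BY random()
LIMIT 1000;
```

Note that this sorts the whole table on every call, which may be slow with more than a million rows, so it is worth checking the plan with EXPLAIN ANALYZE before relying on it.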
