How to get unique values from each column based on a condition?

Question

I have been trying to find an optimal way to select the unique values from each column. My problem is that I don't know the column names in advance, since different tables have different numbers of columns. So first I have to find the column names, which I can do with the query below:

select column_name from information_schema.columns
where table_name='m0301010000_ds' and column_name like 'c%' 

Sample output column names:

c1, c2a, c2b, c2c, c2d, c2e, c2f, c2g, c2h, c2i, c2j, c2k, ...

Then I would use the returned column names to get the unique/distinct values in each column, not just the distinct rows.

I know the simplest, lousy way is to write select distinct column_name from table where column_name = 'something' for every single column (around 20-50 times), which is very time consuming too. Since I can't use more than one distinct per column_name, I am stuck with this old-school solution.

I am sure there is a faster and more elegant way to achieve this, but I couldn't figure out how. I would really appreciate any help on this.

Recommended answer

If you need this in "real time", you won't be able to achieve it with SQL that needs to do a full table scan.

I would advise you to create a separate table containing the distinct values for each column (initialized with the SQL from @Erwin Brandstetter ;) and maintain it using a trigger on the original table.

Your new table will have one column per field. The number of rows will equal the maximum number of distinct values for any one field.
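As a rough sketch of the initialization step, the block below fills a side table once from the existing data. It is not the SQL referred to above, and it deliberately swaps in a different layout: a long-format table with one row per (column, value) pair instead of one column per field, because that avoids padding shorter lists with nulls. PostgreSQL is assumed, the side-table name m0301010000_ds_distinct is hypothetical, and the column lookup may need an extra table_schema filter if the table name exists in more than one schema.

```sql
-- Hypothetical side table (assumption): one row per (column, distinct value).
CREATE TABLE IF NOT EXISTS m0301010000_ds_distinct (
    column_name text NOT NULL,
    value       text NOT NULL,
    PRIMARY KEY (column_name, value)
);

-- One-off initialization: loop over the matching columns and collect
-- their distinct non-null values with dynamic SQL.
DO $$
DECLARE
    col text;
BEGIN
    FOR col IN
        SELECT column_name
        FROM information_schema.columns
        WHERE table_name = 'm0301010000_ds'
          AND column_name LIKE 'c%'
    LOOP
        -- %L quotes the column name as a literal, %I as an identifier.
        EXECUTE format(
            'INSERT INTO m0301010000_ds_distinct (column_name, value)
             SELECT DISTINCT %L::text, %I::text
             FROM m0301010000_ds
             WHERE %I IS NOT NULL
             ON CONFLICT DO NOTHING',
            col, col, col);
    END LOOP;
END
$$;
```

This still scans the source table once per column, so it only makes sense as a one-time load; keeping the list current afterwards is the trigger's job.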

On insert: for each field to maintain, check whether the value is already there. If not, add it.

On update: for each field to maintain whose old value differs from its new value, check whether the new value is already there. If not, add it. As for the old value, check whether any other row still has it, and if not, remove it from the list (set the field to null).

On delete: for each field to maintain, check whether any other row has that value, and if not, remove it from the list (set the value to null).
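The insert, update and delete rules above could be sketched as a single row-level trigger along these lines. Several things here are assumptions rather than part of the answer: PostgreSQL 11 or later, a hypothetical long-format side table m0301010000_ds_distinct (one row per column/value pair, so a value is removed by deleting its row instead of setting a field to null), tracked columns that are text-typed and start with c, and the function and trigger names. It also ignores races between concurrent transactions.

```sql
-- Hypothetical long-format side table (assumption, not the wide layout).
CREATE TABLE IF NOT EXISTS m0301010000_ds_distinct (
    column_name text NOT NULL,
    value       text NOT NULL,
    PRIMARY KEY (column_name, value)
);

CREATE OR REPLACE FUNCTION maintain_distinct_values() RETURNS trigger
LANGUAGE plpgsql AS $$
DECLARE
    col        text;
    old_val    text;
    still_used boolean;
BEGIN
    -- Insert / update: add every non-null value that is not listed yet.
    IF TG_OP IN ('INSERT', 'UPDATE') THEN
        INSERT INTO m0301010000_ds_distinct (column_name, value)
        SELECT key, value
        FROM jsonb_each_text(to_jsonb(NEW))
        WHERE key LIKE 'c%' AND value IS NOT NULL
        ON CONFLICT DO NOTHING;
    END IF;

    -- Update / delete: remove old values that no remaining row uses.
    IF TG_OP IN ('UPDATE', 'DELETE') THEN
        FOR col, old_val IN
            SELECT key, value
            FROM jsonb_each_text(to_jsonb(OLD))
            WHERE key LIKE 'c%' AND value IS NOT NULL
        LOOP
            -- On update, skip fields whose value did not change.
            IF TG_OP = 'UPDATE' THEN
                CONTINUE WHEN (to_jsonb(NEW) ->> col) IS NOT DISTINCT FROM old_val;
            END IF;

            -- Existence check only; it stops at the first matching row.
            EXECUTE format(
                'SELECT EXISTS (SELECT 1 FROM %I.%I WHERE %I = $1)',
                TG_TABLE_SCHEMA, TG_TABLE_NAME, col)
            INTO still_used
            USING old_val;

            IF NOT still_used THEN
                DELETE FROM m0301010000_ds_distinct
                WHERE column_name = col AND value = old_val;
            END IF;
        END LOOP;
    END IF;

    RETURN NULL;  -- the return value is ignored for AFTER row triggers
END
$$;

-- AFTER trigger, so the checks see the table with the change applied.
CREATE TRIGGER m0301010000_ds_distinct_trg
AFTER INSERT OR UPDATE OR DELETE ON m0301010000_ds
FOR EACH ROW EXECUTE FUNCTION maintain_distinct_values();
```

With something like this in place, reading the distinct values of one column becomes a lookup on the side table, for example select value from m0301010000_ds_distinct where column_name = 'c1'.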

This way the load is mainly moved to the trigger, and the SQL on the value list table will be super fast.

P.S.: Make sure to run every SQL statement from the trigger through EXPLAIN to check that it uses the best possible index and execution plan. For update/delete, just check whether the old value exists (limit 1).
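For instance, the existence check for one column could be inspected roughly like this. PostgreSQL is assumed, c1 and 'something' are taken from the question, and the index name is hypothetical; each column the trigger checks would need its own index.

```sql
-- Hypothetical index (assumption) so the existence check can avoid a full scan.
CREATE INDEX IF NOT EXISTS m0301010000_ds_c1_idx ON m0301010000_ds (c1);

-- Show the plan for the "does any row still have the old value?" check.
-- LIMIT 1 lets the scan stop at the first match.
-- Note: ANALYZE actually executes the query to report real timings.
EXPLAIN (ANALYZE, BUFFERS)
SELECT 1
FROM m0301010000_ds
WHERE c1 = 'something'
LIMIT 1;
```

If the plan shows a sequential scan on a large table where an index scan was expected, the index is probably missing or not usable for that predicate.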
