How to find almost similar records in SQL?


Question

This is the search record:

A = {
    field1: value1,
    field2: value2,
    ...
    fieldN: valueN
}

I have many such records in the database.

Another record (B) almost matches record A if at least N-M of their fields are equal. For example, with M = 2:

B = {
    field1: OTHER_value1,
    field2: OTHER_value2,
    field3: value3,
    ...
    fieldN: valueN
}

The differing fields can be any of the fields, not only the first ones.

I could write a very large combinatorial SQL query, but maybe there is a more elegant solution.

P.S.: My database is PostgreSQL.

Recommended answer

I would use IS DISTINCT FROM to handle NULL values.

You can also use Postgres shorthand to simplify the logic: casting a boolean to int yields 1 or 0, so the comparisons can simply be summed. One way is:

where ( (a.field1 is not distinct from b.field1)::int +
        (a.field2 is not distinct from b.field2)::int +
        . . .
        (a.fieldn is not distinct from b.fieldn)::int
      ) >= N - M

I think this is easier to express in terms of M alone. So, only count the fields that are different:

where ( (a.field1 is distinct from b.field1)::int +
        (a.field2 is distinct from b.field2)::int +
        . . .
        (a.fieldn is distinct from b.fieldn)::int
      ) <= M
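
As a minimal sketch of how such a WHERE clause might sit in a complete query, the example below compares one search record against the rest of a table. The table name records, the id column, the four text fields, and the sample rows are all hypothetical and chosen only for illustration; M is fixed at 2.

```sql
-- Hypothetical table and data, for illustration only.
create temp table records (
    id     int primary key,
    field1 text,
    field2 text,
    field3 text,
    field4 text
);

insert into records values
    (1, 'a', 'b',  'c', 'd'),   -- the search record A
    (2, 'x', 'y',  'c', 'd'),   -- differs from A in 2 fields
    (3, 'x', 'y',  'z', 'd'),   -- differs from A in 3 fields
    (4, 'a', null, 'c', 'd');   -- differs from A in 1 field (NULL vs 'b')

-- Rows that differ from record 1 in at most M = 2 fields.
select b.*
from records a
join records b on b.id <> a.id
where a.id = 1
  and ( (a.field1 is distinct from b.field1)::int +
        (a.field2 is distinct from b.field2)::int +
        (a.field3 is distinct from b.field3)::int +
        (a.field4 is distinct from b.field4)::int
      ) <= 2;
```

With this sample data the query should return the rows with id 2 and 4. Because the search record is a single row, this amounts to one pass over the table; comparing every record with every other one is a different and much heavier query.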

Doing this with your data requires a cross join, which is quite expensive.
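
To make the cost concrete, here is a hedged sketch of the all-pairs version, in which every record is compared with every other one. The table records, its id column, and the sample rows are hypothetical. The join condition a.id < b.id is an addition of this sketch, not part of the answer: it tests each pair only once, which halves the work, but the number of comparisons still grows quadratically with the row count.

```sql
-- Hypothetical table and data, for illustration only.
create temp table records (
    id     int primary key,
    field1 text,
    field2 text,
    field3 text,
    field4 text
);

insert into records values
    (1, 'a', 'b', 'c', 'd'),
    (2, 'x', 'y', 'c', 'd'),
    (3, 'x', 'y', 'z', 'd');

-- All pairs of records that differ in at most M = 2 fields.
-- a.id < b.id avoids self-pairs and mirrored duplicates.
select a.id as a_id, b.id as b_id
from records a
join records b on a.id < b.id
where ( (a.field1 is distinct from b.field1)::int +
        (a.field2 is distinct from b.field2)::int +
        (a.field3 is distinct from b.field3)::int +
        (a.field4 is distinct from b.field4)::int
      ) <= 2
order by a_id, b_id;
```

For these three rows the pairs (1, 2) and (2, 3) qualify, while (1, 3) differs in three fields and is excluded.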
