Joining two tables in a complex query (not uniform data)

Views: 102 · Posted: 2019/9/19 16:54:29 · Tags: postgresql join insert data-migration


Problem description

I need to connect two tables in a query that I will use to insert data into a third table (which will later be used to join the two). I will mention only the relevant columns of these tables.

PostgreSQL version 9.0.5

Table 1: data_table

Migrated data, ca. 10k rows, relevant columns:

id (primary key)

address (beginning of an address, a string that I need to match against the second table. This address has varying length.)

Table 2: dictionary

Dictionary, ca. 9 million rows, relevant columns:

id (primary key)

address (full address, a string that I need to match against the first table, varying length as well.)

What I need exactly

I need to correctly connect these tables in a SELECT statement and then insert the result into a third table. All I need is a way to successfully connect these tables.

The way I want to do it is to take each address from data_table and join it with the first address (edit: ordered by address asc) from dictionary that begins with data_table.address, without multiplying records, since many addresses in dictionary begin with each data_table.address.
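One way to express "the first matching dictionary row for each data_table row" without multiplying records is a correlated subquery with ORDER BY and LIMIT 1. The following is only a sketch using the column names given above. It ignores the irregular spaces, assumes the addresses contain no LIKE wildcards (% or _), and avoids LATERAL, which PostgreSQL 9.0 does not support.

```sql
-- Sketch: for each data_table row, pick the alphabetically first
-- dictionary row whose address starts with data_table.address.
SELECT dt.id AS data_table_id,
       (SELECT d.id
          FROM dictionary d
         WHERE d.address LIKE dt.address || '%'
         ORDER BY d.address ASC
         LIMIT 1) AS dictionary_id
  FROM data_table dt;
```

Each data_table row appears exactly once in the result. Rows without any matching dictionary address get NULL in dictionary_id.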

Also, addresses in both tables contain a lot of irregular spaces, so we probably need to apply

replace(address, ' ', '')

to both of them (any alternative ideas welcome). There might also be some performance issues, since dictionary has 9 million rows and the server is rather slow.
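With 9 million rows, a prefix match on the space-stripped address will likely need index support. A possible approach, sketched here under assumptions, is an expression index on the same replace() call with the text_pattern_ops operator class. The index name and the sample pattern 'MainStreet1%' are made up for illustration, and whether the planner actually uses the index depends on the data and on the pattern being a constant at plan time.

```sql
-- Assumed index name; not part of the original schema.
-- text_pattern_ops lets LIKE 'prefix%' use a b-tree index even when
-- the database locale is not C.
CREATE INDEX dictionary_address_nospace_idx
    ON dictionary (replace(address, ' ', '') text_pattern_ops);

-- The query must use the same expression for the index to be considered.
-- 'MainStreet1%' is a hypothetical, already space-stripped prefix.
SELECT id
  FROM dictionary
 WHERE replace(address, ' ', '') LIKE 'MainStreet1%'
 LIMIT 1;
```

Note that replace(address, ' ', '') removes only the plain space character, so other whitespace such as tabs would still be left in the compared strings.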

I see the result as some variation of the following query:

select
  data_table.id, dictionary.id
from
  data_table, dictionary
where
  -conditions-

Recommended answer

The solution that our architect came up with was to write a function that finds the first match.

Function:

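CREATE OR REPLACE FUNCTION pick_one_address(text)
  RETURNS text AS
$BODY$
DECLARE
  address_query text;
  toFind text;
  found_id text;  -- renamed from "found" to avoid shadowing PL/pgSQL's FOUND
BEGIN
  -- Strip spaces from the input and turn it into a prefix pattern.
  toFind := replace($1, ' ', '') || '%';
  -- quote_literal() escapes any quotes contained in the pattern.
  -- There is no ORDER BY, so LIMIT 1 returns an arbitrary matching row.
  address_query := 'select al.id from dictionary al'
    || ' where replace(al.address, '' '', '''') like ' || quote_literal(toFind)
    || ' limit 1';
  EXECUTE address_query INTO found_id;
  RETURN found_id;
END $BODY$
  LANGUAGE plpgsql VOLATILE
  COST 100;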
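The function can then be called once per data_table row to fill the third table. The sketch below assumes a hypothetical link table named address_link and integer id columns, since the original post does not describe the real third table.

```sql
-- Hypothetical link table; the real third table was not described.
CREATE TABLE address_link (
    data_table_id integer NOT NULL,
    dictionary_id integer          -- NULL when no dictionary row matched
);

-- pick_one_address() returns text, so the result is cast back to integer
-- (assuming dictionary.id is an integer column).
INSERT INTO address_link (data_table_id, dictionary_id)
SELECT dt.id,
       pick_one_address(dt.address)::integer
  FROM data_table dt;
```

Because the function is evaluated once per row of data_table, the roughly 10k migrated rows trigger roughly 10k lookups in dictionary, so the cost of a single lookup dominates the total run time.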

The code might seem strange, since I changed the table names to protect my company's privacy and, to simplify the question, did not mention the third table I used, but I guess it should be enough to understand the mechanism.

Thanks for your input @ErwinBrandstetter, @CraigRinger
