Create a unique primary key (hash) from database columns


Problem description


I have this table which doesn't have a primary key.

I'm going to insert some records into a new table to analyze them, and I'm thinking of creating a new primary key from the values of all the available columns.

If this were a programming language like Java I would:

 int hash = column1 * 31 + column2 * 31 + column3*31 

Or something like that. But this is SQL.

How can I create a primary key from the values of the available columns? Simply marking all the columns as the PK won't work for me, because what I need to do is compare them with data from another DB table.

My table has 3 numbers and a date.

EDIT: What my problem is

I think a bit more background is needed. I'm sorry for not providing it before.

I have a database (dm) that is updated every day from another database (the original source). It holds records from the past two years.

Last month (July) the update process broke, and for a month no data was loaded into dm.

I manually created a table with the same structure in my Oracle XE and copied the records from the original source into my database (myxe). I copied only the July records, to create a report needed by the end of the month.

Finally, on Aug 8 the update process was fixed, and the records that had been waiting to be migrated by this automatic process were copied into the database (from the original source to dm).

This process deletes the data from the original source once it has been copied (into dm).

Everything looked fine, but we have just realized that a number of the records were lost (about 25% of July).

So what I want to do is use my backup (myxe) and insert all the missing records into the database (dm).

The problems here are:

  • They don't have a well defined PK.
  • They are in separate databases.

So I thought that if I could create a unique PK in both tables that gave the same number for the same record, I could tell which records were missing and insert them.

EDIT 2

So I did the following in my local environment:

select a.* from the_table@PRODUCTION a , the_table b where
a.idle = b.idle and 
a.activity = b.activity and 
a.finishdate = b.finishdate

This returns all the rows that are present in both databases (the... union?). I got 2,000 records.

What I'm going to do next is delete them all from the target db and then insert all the rows from my db into the target table.

I hope I don't end up with something worse :-S

Solution

Just create a surrogate key:

ALTER TABLE mytable ADD pk_col INT;

UPDATE  mytable
SET     pk_col = rownum;

ALTER TABLE mytable MODIFY pk_col INT NOT NULL;

ALTER TABLE mytable ADD CONSTRAINT pk_mytable_pk_col PRIMARY KEY (pk_col);

or this:

ALTER TABLE mytable ADD pk_col RAW(16);

UPDATE  mytable
SET     pk_col = SYS_GUID();

ALTER TABLE mytable MODIFY pk_col RAW(16) NOT NULL;

ALTER TABLE mytable ADD CONSTRAINT pk_mytable_pk_col PRIMARY KEY (pk_col);

The latter uses GUIDs, which are unique across databases but consume more space and are much slower to generate (your INSERTs will be slow).

Update:

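If you need to create the same PRIMARY KEYs on two tables with identical data, use this:

MERGE
INTO    mytable v
USING   (
        -- ROW_NUMBER() numbers the rows in sorted order;
        -- ROWNUM would be assigned before the ORDER BY is applied
        SELECT  rowid AS rid,
                ROW_NUMBER() OVER (ORDER BY col1, col2, col3) AS rn
        FROM    mytable
        ) s
ON      (v.rowid = s.rid)
WHEN MATCHED THEN
UPDATE
SET     v.pk_col = s.rn;

Note that the two tables must be identical row for row (i.e. have the same number of rows with the same data in them); otherwise the generated keys will not line up.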
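
Before numbering the rows on both sides, it may be worth checking that the two tables really do hold the same data. The following is only a sketch, run on the dm side: it assumes the database link myxe from the question, and col1, col2, col3 are the placeholder column names from the MERGE above, not the real ones.

```sql
-- Row counts on both sides; the two numbers should match.
SELECT 'local'  AS side, COUNT(*) AS row_count FROM mytable
UNION ALL
SELECT 'remote' AS side, COUNT(*) AS row_count FROM mytable@myxe;

-- Rows that exist locally but not in the remote table;
-- this should return no rows.
SELECT col1, col2, col3 FROM mytable
MINUS
SELECT col1, col2, col3 FROM mytable@myxe;

-- Rows that exist remotely but not locally;
-- this should return no rows either.
SELECT col1, col2, col3 FROM mytable@myxe
MINUS
SELECT col1, col2, col3 FROM mytable;
```

The columns are listed explicitly so that the newly added pk_col does not take part in the comparison. Because MINUS ignores how many times a row occurs, the count check is needed alongside the two difference queries.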

Update 2:

For your specific problem, you don't need a PK at all.

If you just want to select the records that are missing from dm, use this query (on the dm side):

SELECT  *
FROM    mytable@myxe
MINUS
SELECT  *
FROM    mytable;

This will return all records that exist in mytable@myxe but not in mytable@dm.

Note that MINUS collapses duplicate rows, if there are any.
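
To actually restore the missing rows, the same difference query can feed an INSERT. This is a sketch under assumptions rather than a tested procedure: it is run on the dm side, both tables are assumed to have exactly the same columns in the same order, and col1, col2, col3 are placeholder column names.

```sql
-- Optional first step: see whether the backup holds duplicate rows,
-- since MINUS would collapse each group into a single row.
SELECT   col1, col2, col3, COUNT(*) AS copies
FROM     mytable@myxe
GROUP BY col1, col2, col3
HAVING   COUNT(*) > 1;

-- Insert every row that exists in the backup but not in dm.
INSERT INTO mytable
SELECT  *
FROM    mytable@myxe
MINUS
SELECT  *
FROM    mytable;

-- No COMMIT here on purpose: check the reported row count first,
-- then COMMIT or ROLLBACK by hand.
```

If the first query returns any rows, those records would have to be handled separately, because only one copy of each would be inserted.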
