Create a unique primary key (hash) from database columns
I have this table which doesn't have a primary key.
I'm going to insert some records into a new table to analyze them, and I'm thinking of creating a new primary key from the values of all the available columns.
If this were a programming language like Java, I would write:
int hash = column1 * 31 + column2 * 31 + column3 * 31;
Or something like that. But this is SQL.
How can I create a primary key from the values of the available columns? Simply marking all the columns as the PK won't work for me, because what I need to do is compare them with data from another DB table.
My table has 3 numbers and a date.
EDIT: What my problem is
I think a bit more background is needed. I'm sorry for not providing it before.
I have a database (dm) that is updated every day from another db (original source). It has records from the past two years.
Last month (July) the update process broke, and for a month no data was loaded into dm.
I manually created a table with the same structure in my Oracle XE and copied the records from the original source into my db (myxe). I copied only the records from July, to create a report needed by the end of the month.
Finally, on Aug 8, the update process was fixed, and the records that had been waiting to be migrated by this automatic process were copied into the database (from original source to dm).
This process removes the data from the original source once it has been copied (into dm).
Everything looked fine, but we have just realized that a number of the records got lost (about 25% of July).
So what I want to do is use my backup (myxe) and insert all the missing records into the database (dm).
The problems here are:
- They don't have a well-defined PK.
- They are in separate databases.
So I thought that if I could create a unique PK in both tables that gave the same number for the same row, I could tell which rows were missing and insert them.
EDIT 2
So I did the following in my local environment:
select a.* from the_table@PRODUCTION a , the_table b where
a.idle = b.idle and
a.activity = b.activity and
a.finishdate = b.finishdate
This returns all the rows that are present in both databases (the .. union?). I got 2,000 records.
What I'm going to do next is delete them all from the target db and then insert all the rows from my db into the target table.
I hope I don't get into something worse :-S :-S
Just create a surrogate key:
ALTER TABLE mytable ADD pk_col INT;
UPDATE mytable
SET pk_col = rownum;
ALTER TABLE mytable MODIFY pk_col INT NOT NULL;
ALTER TABLE mytable ADD CONSTRAINT pk_mytable_pk_col PRIMARY KEY (pk_col);
or this:
ALTER TABLE mytable ADD pk_col RAW(16);
UPDATE mytable
SET pk_col = SYS_GUID();
ALTER TABLE mytable MODIFY pk_col RAW(16) NOT NULL;
ALTER TABLE mytable ADD CONSTRAINT pk_mytable_pk_col PRIMARY KEY (pk_col);
The latter uses GUIDs, which are unique across databases but consume more space and are much slower to generate (your INSERTs will be slow).
Update:
If you need to create the same PRIMARY KEYs on two tables with identical data, use this:
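MERGE
INTO mytable v
USING (
-- ROW_NUMBER() numbers the rows in sorted order;
-- ROWNUM would be assigned before the ORDER BY is applied
SELECT rowid AS rid,
ROW_NUMBER() OVER (ORDER BY col1, col2, col3) AS rn
FROM mytable
)
ON (v.rowid = rid)
WHEN MATCHED THEN
UPDATE
SET pk_col = rn;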
Note that the tables must be identical down to the last row (i.e. have the same number of rows with the same data in them).
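To see why the tables have to match exactly, here is a minimal sketch in plain Python (not Oracle) of numbering rows after sorting them by all columns. The function name `assign_keys` and the sample rows are made up for illustration. Two copies of the same data get the same keys regardless of physical row order, but a single missing row shifts every key after it.

```python
def assign_keys(rows):
    """Number rows 1..n after sorting by all columns.

    Hypothetical helper: mimics numbering rows in (col1, col2, col3) order.
    """
    return [(n, row) for n, row in enumerate(sorted(rows), start=1)]


# Sample data (made up): same rows, stored in a different physical order.
table_a = [(3, 30, "2020-07-03"), (1, 10, "2020-07-01"), (2, 20, "2020-07-02")]
table_b = [(2, 20, "2020-07-02"), (3, 30, "2020-07-03"), (1, 10, "2020-07-01")]

keys_a = assign_keys(table_a)
keys_b = assign_keys(table_b)

# A copy that lost one row: every later row now gets a different key.
table_short = [row for row in table_b if row != (1, 10, "2020-07-01")]
keys_short = assign_keys(table_short)

print(keys_a == keys_b)   # identical data gives identical keys
print(keys_short)         # (2, 20, ...) is now key 1 instead of key 2
```

This is why generated row numbers are not a reliable way to find rows that are missing from one of the tables.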
Update 2:
For your actual problem, you don't need a PK at all.
If you just want to select the records missing in dm, use this (on the dm side):
SELECT *
FROM mytable@myxe
MINUS
SELECT *
FROM mytable
This will return all records that exist in mytable@myxe but not in mytable@dm.
Note that MINUS returns distinct rows, so any duplicates will be collapsed into one.
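The set-difference approach can be tried out locally without Oracle. The following is a rough sketch using Python's built-in sqlite3 module, where SQLite's EXCEPT stands in for Oracle's MINUS. The table names `myxe_table` and `dm_table` and the sample rows are assumptions made up for this illustration; the column names are taken from the question's query. It also shows the duplicate caveat: a row that appears twice in the backup is inserted only once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical stand-ins: myxe_table plays mytable@myxe, dm_table plays mytable on dm.
for name in ("myxe_table", "dm_table"):
    cur.execute(
        f"CREATE TABLE {name} (idle INTEGER, activity INTEGER, finishdate TEXT)"
    )

# Made-up sample data; the last backup row is an exact duplicate.
backup_rows = [
    (1, 10, "2020-07-01"),
    (2, 20, "2020-07-02"),
    (3, 30, "2020-07-03"),
    (3, 30, "2020-07-03"),
]
dm_rows = [(1, 10, "2020-07-01")]
cur.executemany("INSERT INTO myxe_table VALUES (?, ?, ?)", backup_rows)
cur.executemany("INSERT INTO dm_table VALUES (?, ?, ?)", dm_rows)

# Rows present in the backup but absent from dm (EXCEPT ~ Oracle's MINUS).
diff_sql = "SELECT * FROM myxe_table EXCEPT SELECT * FROM dm_table"
missing = cur.execute(diff_sql).fetchall()

# Insert exactly those rows into dm.
cur.execute("INSERT INTO dm_table " + diff_sql)
conn.commit()

dm_count = cur.execute("SELECT COUNT(*) FROM dm_table").fetchone()[0]
print(sorted(missing))
print(dm_count)
```

If the duplicates in the backup are meaningful, the set difference alone will under-count them, so they would need to be handled separately.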