Understanding the ORA_ROWSCN behavior in Oracle

Question

So this is essentially a follow-up question on Finding duplicate records.

We perform data imports from text files every day, and we ended up importing 10163 records spread across 182 files twice. Running the query mentioned above to find duplicates returned a total of 10174 records, which is 11 more than the files contain. I assumed that some pairs of identical records might be valid ones that the query was also counting. So I thought it would be best to use a timestamp and simply find all the records that were loaded today (and hence ended up as duplicate rows). I used ORA_ROWSCN in the following query:

select count(*) from my_table
where TRUNC(SCN_TO_TIMESTAMP(ORA_ROWSCN)) = '01-MAR-2012'
;

However, the count is still too high: 10168. I am fairly sure that the total number of lines in the files is 10163, because that is what wc -l *.txt reports when run in the folder that contains all the files.

Is it possible to find out which rows are actually inserted twice?

Solution

By default, ORA_ROWSCN is stored at the block level, not at the row level. It is only stored at the row level if the table was originally built with ROWDEPENDENCIES enabled. Assuming that you can fit many rows of your table in a single block and that you're not using the APPEND hint to insert the new data above the existing high water mark of the table, you are likely inserting new data into blocks that already have some existing data in them. By default, that changes the ORA_ROWSCN of every row in the block, which causes your query to count more rows than were actually inserted.
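
One way to check which case applies is sketched below. It assumes the table from the question is named my_table and lives in the current schema; my_table_rd and its columns are hypothetical names used only for illustration.

```sql
-- DEPENDENCIES is ENABLED only if the table was created with ROWDEPENDENCIES.
SELECT table_name, dependencies
FROM   user_tables
WHERE  table_name = 'MY_TABLE';

-- Row-level SCN tracking is chosen when the table is created.
-- (Hypothetical table and columns.)
CREATE TABLE my_table_rd (
  id      NUMBER PRIMARY KEY,
  payload VARCHAR2(100)
) ROWDEPENDENCIES;

-- Show how many rows share each SCN within a block. With block-level
-- tracking, every row in a block reports the same ORA_ROWSCN.
SELECT file_no, block_no, scn, COUNT(*) AS rows_sharing_scn
FROM  (SELECT DBMS_ROWID.ROWID_RELATIVE_FNO(ROWID)  AS file_no,
              DBMS_ROWID.ROWID_BLOCK_NUMBER(ROWID)  AS block_no,
              ORA_ROWSCN                            AS scn
       FROM   my_table)
GROUP  BY file_no, block_no, scn
ORDER  BY file_no, block_no;
```

If the last query shows blocks holding both old and newly loaded rows under a single SCN, that would be consistent with the extra rows counted in the question.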

Since ORA_ROWSCN is only guaranteed to be an upper bound on the last time there was DML on a row, a much more common way to determine how many rows were inserted today is to add a CREATE_DATE column to the table that defaults to SYSDATE, or to rely on SQL%ROWCOUNT after your INSERT runs (assuming, of course, that you are using a single INSERT statement to insert all the rows).
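
A minimal sketch of both suggestions follows. The column list (id, payload) and the staging_table name are assumptions, since the question does not show the table definition or how the load is performed.

```sql
-- Add the column first, then set the default in a second step, so that
-- existing rows stay NULL instead of being stamped with the time of the ALTER.
ALTER TABLE my_table ADD (create_date DATE);
ALTER TABLE my_table MODIFY (create_date DEFAULT SYSDATE);

-- Count rows loaded on a given day, using a half-open date range
-- so the time-of-day part of CREATE_DATE is handled correctly.
SELECT COUNT(*)
FROM   my_table
WHERE  create_date >= DATE '2012-03-01'
AND    create_date <  DATE '2012-03-02';

-- Alternative: capture the row count right after a single INSERT.
-- staging_table, id and payload are hypothetical names.
BEGIN
  INSERT INTO my_table (id, payload)
  SELECT id, payload
  FROM   staging_table;

  DBMS_OUTPUT.PUT_LINE('Rows inserted: ' || SQL%ROWCOUNT);
  COMMIT;
END;
/
```

The CREATE_DATE approach only helps for rows inserted after the column exists; it cannot recover when earlier rows were loaded.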

Generally, using ORA_ROWSCN and the SCN_TO_TIMESTAMP function is going to be a problematic way to identify when a row was inserted, even if the table is built with ROWDEPENDENCIES. ORA_ROWSCN returns an Oracle SCN, which is a System Change Number. This is a unique identifier for a particular change (i.e. a transaction). As such, there is no direct link between an SCN and a time: my database might be generating SCNs a million times more quickly than yours, and my SCN 1 may be years apart from your SCN 1. The Oracle background process SMON maintains a table that maps SCN values to approximate timestamps, but it only keeps that data for a limited period of time. Otherwise, your database would end up with a multi-billion row table that was just storing SCN to timestamp mappings. If the row was inserted more than, say, a week ago (the exact limit depends on the database and database version), SCN_TO_TIMESTAMP won't be able to convert the SCN to a timestamp and will raise an error.
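
The mapping and its limits can be explored with something like the sketch below. The function name safe_scn_to_ts is hypothetical, and the error number is an assumption: ORA-08181 is what SCN_TO_TIMESTAMP typically raises for an SCN outside the retained range, but that is worth confirming on your own version. Reading V$DATABASE also requires the appropriate privileges.

```sql
-- The current SCN and its approximate timestamp.
SELECT current_scn, SCN_TO_TIMESTAMP(current_scn) AS approx_time
FROM   v$database;

-- The reverse mapping is approximate as well.
SELECT TIMESTAMP_TO_SCN(SYSTIMESTAMP - INTERVAL '1' HOUR) AS scn_an_hour_ago
FROM   dual;

-- Hypothetical helper: return NULL instead of failing when the SCN is
-- too old to be mapped (assumed here to surface as ORA-08181).
CREATE OR REPLACE FUNCTION safe_scn_to_ts (p_scn IN NUMBER)
  RETURN TIMESTAMP
IS
  e_invalid_scn EXCEPTION;
  PRAGMA EXCEPTION_INIT(e_invalid_scn, -8181);
  l_ts TIMESTAMP;
BEGIN
  SELECT SCN_TO_TIMESTAMP(p_scn) INTO l_ts FROM dual;
  RETURN l_ts;
EXCEPTION
  WHEN e_invalid_scn THEN
    RETURN NULL;
END safe_scn_to_ts;
/

-- Usage: rows whose SCN can no longer be mapped show a NULL timestamp.
SELECT ORA_ROWSCN, safe_scn_to_ts(ORA_ROWSCN) AS approx_change_time
FROM   my_table;
```

Even when the conversion succeeds, the result is only an approximate time for the last change to the block (or to the row, with ROWDEPENDENCIES), not an insert time.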
