如何确定数据库完好无损? [英] How to determine a database is intact?

查看:63
本文介绍了如何确定数据库完好无损?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

每晚,我们关闭数据库并进行文件系统备份。


我们数据库问题的简短年表:


8/21 - 用户表的计数(*)成功(每周一次完成获得

统计数据)


8/23 - 具体搜索特定价值(超过200万中的一个)

导致邮政局长到SEGV。我删除了有问题的索引并重建了它。

一切都好了。


8/28 - 计数(*)失败 - postmaster在SEGV上中止。我尝试了一个

pg_dumpall。 Postmaster用SEGV中止大约240 MB进入转储(

数据库大约50 GB)。


我重新加载了8/25数据库并尝试了pg_dumpall。同样的失败。


我从8/21文件系统转储重新加载。 Pg_dumpall参与了这个工作。

我花了接下来的差不多48小时倾倒并重新装载了2.4亿行,并且b / b
重新处理了自8月21日以来的数百万次添加。我不敢使用

8/21数据库而不重新加载因为我不知道它是好还是

没有。

问题:


我们如何判断数据库是否完整?在上面的例子中,pg_dumpall

在8/21数据库上工作。它在8月21日和8月23日之间变得腐败了吗?

还是它已经腐败而变得更糟? Pg_dumpall没有告诉你

索引的条件。一个腐败的索引会破坏数据块吗?


我正在寻找每周做一次pg_dumpall以便我们有一个点

及时我们知道我们有一个可恢复的数据库。当数据库

达到几百GB且超过十亿行时,这不是一个很好的解决方案,而且并不能解决整个数据库的完整性问题。 br />

回到原来的问题...如何验证

a数据库的完整完整性 - 尤其是重载或完整索引的非常大的数据库/>
重建可能需要几天?


Wes

-------------- -------------(广播结束)---------------------------

提示9:如果您的

加入列的数据类型不匹配,计划员将忽略您选择索引扫描的愿望

On a nightly basis, we shut the database down and do a file system backup.

A short chronology of our database problem:

8/21 - count(*) of user tables succeeded (done once a week to get
statistics)

8/23 - A specific search on a specific value (one out of over 2 million)
caused postmaster to SEGV. I dropped the index in question and rebuilt it.
All appeared ok.

8/28 - count(*) failed - postmaster aborted on a SEGV. I attempted a
pg_dumpall. Postmaster aborted with SEGV about 240 MB into the dump (the
database is about 50 GB).

I reloaded 8/25 database and attempted a pg_dumpall. Same failure.

I reloaded from the 8/21 file system dump. Pg_dumpall worked on this one.
I spent the next almost 48 hours dumping and reloading 240 million rows and
reprocessing the several million additions since 8/21. I didn''t dare use
the 8/21 database without reloading because I didn''t know if it was good or
not.
Question:

How can we tell that a database is intact? In the above example, pg_dumpall
worked on the 8/21 database. Did it become corrupt between 8/21 and 8/23,
or was it already corrupt and got worse? Pg_dumpall tells you nothing about
the condition of indexes. Could a corrupt index corrupt data blocks?

I''m looking at doing a pg_dumpall on a weekly basis so that we have a point
in time where we know we have a recoverable database. When the database
reaches several hundred GB and over over a billion rows, this isn''t a great
solution, and doesn''t address the overall database integrity.

Back to the original question... How can I verify the complete integrity of
a database - especially a very large one where a reload or full index
rebuild could take on the order of days?

Wes
---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
joining column''s datatypes do not match

推荐答案

嗯。我每晚都会转储我们的生产数据库,在场外存档一份副本,并通过运行一点

验证脚本来验证转储的质量。恢复。


但如果这需要你花一天多的时间才能做到这一点,我不确定。


-tfo


2004年9月2日下午3:30,Wes写道:
Hmm. I do a nightly dump of our production database, archive a copy
offsite, and verify the quality of the dump by running a little
verification script that is little more than a restore.

But if it would take you more than a day to do that, I''m not sure.

-tfo

On Sep 2, 2004, at 3:30 PM, Wes wrote:
返回原始问题...如何验证完整的
数据库的完整性 - 特别是一个非常大的数据库,重新加载或完整索引重建可能需要几天?

Wes
Back to the original question... How can I verify the complete
integrity of
a database - especially a very large one where a reload or full index
rebuild could take on the order of days?

Wes



---------------------------(广播结束)-------- -------------------

提示9:如果您的
$ b $,计划者将忽略您选择索引扫描的愿望b加入列的数据类型不匹配


---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
joining column''s datatypes do not match


Wes写道:
每晚,我们关闭数据库关闭并进行文件系统备份。

我们的数据库问题的简短年表:
[snip]问题:

我们如何判断数据库是否完好无损?在上面的例子中,pg_dumpall
在8/21数据库上工作。它是否在8/21和8/23之间腐败,
还是已经腐败并变得更糟? Pg_dumpall没有告诉你
索引的条件。一个腐败的索引会破坏数据块吗?

我正在考虑每周做一次pg_dumpall,以便我们及时知道我们有一个可恢复的数据库。当数据库达到几百GB并超过十亿行时,这不是一个很好的解决方案,并没有解决整个数据库的完整性问题。

返回原始问题......如何验证数据库的完整完整性 - 尤其是重载或完整索引重建可能需要几天的非常大的数据库?
On a nightly basis, we shut the database down and do a file system backup.

A short chronology of our database problem: [snip] Question:

How can we tell that a database is intact? In the above example, pg_dumpall
worked on the 8/21 database. Did it become corrupt between 8/21 and 8/23,
or was it already corrupt and got worse? Pg_dumpall tells you nothing about
the condition of indexes. Could a corrupt index corrupt data blocks?

I''m looking at doing a pg_dumpall on a weekly basis so that we have a point
in time where we know we have a recoverable database. When the database
reaches several hundred GB and over over a billion rows, this isn''t a great
solution, and doesn''t address the overall database integrity.

Back to the original question... How can I verify the complete integrity of
a database - especially a very large one where a reload or full index
rebuild could take on the order of days?




你不需要验证任何东西。 PG的工作是永远不会破坏你的
数据,提供你的硬件应该是好的。如果你几乎每天都会遇到问题,这会产生RAM /磁盘问题,这对我来说是个问题(第11页通常意味着RAM)。不能保证它不是PG而是它的可靠性记录很可靠。


步骤我会采取:

1.根据发行说明检查您的版本号,看看您是否应该升级
。你没有提到你的版本,但它总是值得

有最后一个点发布(7.2.5,7.3.7,7.4.5)

2.安排对硬件运行内存/磁盘测试的时间。查找

48小时可能并不容易,但您需要知道自己的位置。

3.设置slony或其他一些复制,以便我可以安排我的停机时间。 />

-

Richard Huxton

Archonet Ltd


------ ---------------------(广播结束)------------------------ ---

提示8:解释分析是你的朋友



You shouldn''t have to verify anything. PG''s job is to never corrupt your
data, and providing your hardware is good it should do so. If you are
getting problems almost daily that would suggest a RAM/disk problem to
me (sig 11 usually implies RAM). Can''t guarantee it''s not PG but it''s
record of reliability is pretty good.

Steps I''d take:
1. Check your version number against the release notes and see if you
should upgrade. You don''t mention your version, but it''s always worth
having the last dot-release (7.2.5, 7.3.7, 7.4.5)
2. Schedule time to run memory/disk tests against your hardware. Finding
48 hours might not be easy, but you need to know where you stand.
3. Setup slony or some other replication so I can schedule my downtime.

--
Richard Huxton
Archonet Ltd

---------------------------(end of broadcast)---------------------------
TIP 8: explain analyze is your friend


在9/3/04 3:11 AM, ;理查德胡克斯顿 < de*@archonet.com>写道:
On 9/3/04 3:11 AM, "Richard Huxton" <de*@archonet.com> wrote:
你不必验证任何东西。 PG的工作是永远不会破坏你的数据,提供你的硬件应该是好的。如果你几乎每天都会遇到问题,那就会给我提出RAM /磁盘问题(sig 11通常意味着RAM)。不能保证它不是PG,但它的可靠性记录非常好。


我认为SEGV通常只表示它取消引用了一个坏指针(即

NULL或超出范围)。问题不是每天都在发生。

数据库自去年12月开始投入使用。这只是因为症状从没有明显的症状发展到明显腐败的数据库。

我的猜测是一些轻微的腐败,直到数据库无法使用't

甚至被抛弃。

步骤我会采取:
1.根据发行说明检查你的版本号,看看你是否应该升级。你没有提到你的版本,但它总是值得
有最后一个点发布(7.2.5,7.3.7,7.4.5)
2.安排运行内存的时间/ disk测试您的硬件。查找
48小时可能并不容易,但您需要知道自己的位置。
3.设置slony或其他复制,以便我可以安排我的停机时间。
You shouldn''t have to verify anything. PG''s job is to never corrupt your
data, and providing your hardware is good it should do so. If you are
getting problems almost daily that would suggest a RAM/disk problem to
me (sig 11 usually implies RAM). Can''t guarantee it''s not PG but it''s
record of reliability is pretty good.
I believe SEGV typically just indicates it de-referenced a bad pointer (i.e.
NULL or out of range). The problem is not occurring on a daily basis. The
database has been in service since December of last year. It''s just that
the symptoms progressed from no apparent symptoms, to a clearly corrupt DB.
My guess is that some minor corruption fed upon itself until the DB couldn''t
even be dumped.
Steps I''d take:
1. Check your version number against the release notes and see if you
should upgrade. You don''t mention your version, but it''s always worth
having the last dot-release (7.2.5, 7.3.7, 7.4.5)
2. Schedule time to run memory/disk tests against your hardware. Finding
48 hours might not be easy, but you need to know where you stand.
3. Setup slony or some other replication so I can schedule my downtime.




我以为我提到了原始邮件中的级别 - 7.4.1。我们计划运行一些诊断程序。


是否有PostgreSQL中的错误,或者有内存命中,或者其他什么

对原始问题并不重要。数据库可能会损坏。如何判断数据库在任何给定点完好无损

及时?如果我在已知的损坏之前从系统备份重新加载,我怎么能确定导致失败的原始损坏是否仍然存在并且将再次支持丑陋的脑袋?


Wes

-------------------------- - (广播结束)---------------------------

提示2:你可以下车所有名单使用取消注册命令

(发送取消注册YourEmailAddressHere到 ma ****** *@postgresql.org



I thought I mentioned the level in my original mail - 7.4.1. We are
planning on running some diagnostics.

Whether there is a bug in PostgreSQL, or there was a memory hit, or whatever
doesn''t really matter to the original question. The database can become
corrupt. How can I tell that a database is fully intact at any given point
in time? If I reload from a system backup before the known corruption, how
can I be sure that the original corruption that precipitated the failure is
not still there and will again rear its ugly head?

Wes
---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to ma*******@postgresql.org)


这篇关于如何确定数据库完好无损?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
相关文章
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆