为什么SQL数据库在命令日志中使用预写日志? [英] Why do SQL databases use a write-ahead log over a command log?

查看:205
本文介绍了为什么SQL数据库在命令日志中使用预写日志?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我阅读了有关Voltdb的


传统的回滚期刊
原始未更改的数据库内容转换为单独的回滚日志
文件,然后将更改直接写入数据库文件。



表示提交的特殊记录被附加到WAL的
。因此,COMMIT可以在没有写入
原始数据库的情况下发生,这允许读者从
原始未改变的数据库继续操作,而同时将
改变到WAL中。



命令日志记录ARIES样式日志记录的一个优点是
事务可以在执行开始之前记录,而不是执行
事务并等待日志数据刷新到磁盘。另一个
优点是命令日志所需的IO吞吐量是由用于中继命令的网络限定的
,并且在
Gig-E的情况下,该吞吐量可以通过廉价商品磁盘。


重要的是要记住VoltDB的性质。因此,事务处理有点棘手,性能影响显而易见。



VoltDB博客:VoltDB的新命令记录功能


命令日志VoltDB由存储过程调用和
它们的参数组成。在每个节点上创建一个日志,每个日志都是
复制的,因为所有工作都复制到多个节点。这个
产生一个复制的命令日志,可以在重放
时消除重复。由于VoltDB事务被强烈排序,命令
log也包含排序信息。因此,重播可以以原始事务运行的完全顺序发生
,而VoltDB提供的完整的
事务隔离。由于调用本身
通常小于修改的数据,并且可以在
之前被记录,他们被提交,这种方法对
性能有非常小的影响。这意味着VoltDB用户可以实现同样种类的
平流层性能数字,以及额外的耐久性
保证。



I read about Voltdb's command log. The command log records the transaction invocations instead of each row change as in a write-ahead log. By recording only the invocation, the command logs are kept to a bare minimum, limiting the impact the disk I/O will have on performance.

Can anyone explain the database theory behind why Voltdb uses a command log and why the standard SQL databases such as Postgres, MySQL, SQLServer, Oracle use a write-ahead log?

解决方案

I think it is better to rephrase:

Why does new distributed VoltDB use a command log over write-ahead log?

Let's do an experiment and imagine you are going to write your own storage/database implementation. Undoubtedly you are advanced enough to abstract a file system and use block storage along with some additional optimizations.

Some basic terminology:

  • State : stored information at a given point of time
  • Command : directive to the storage to change its state

So your database may look like the following:

Next step is to execute some command:

Please note several important aspects:

  1. A command may affect many stored entities, so many blocks will get dirty
  2. Next state is a function of the current state and the command

Some intermediate states can be skipped, because it is enough to have a chain of commands instead.

Finally, you need to guarantee data integrity.

  • Write-Ahead Logging - central concept is that State changes should be logged before any heavy update to permanent storage. Following our idea we can log incremental changes for each block.
  • Command Logging - central concept is to log only Command, which is used to produce the state.

There are Pros and Cons for both approaches. Write-Ahead log contains all changed data, Command log will require addition processing, but fast and lightweight.

VoltDB: Command Logging and Recovery

The key to command logging is that it logs the invocations, not the consequences, of the transactions. By recording only the invocation, the command logs are kept to a bare minimum, limiting the impact the disk I/O will have on performance.

Additional notes

SQLite: Write-Ahead Logging

The traditional rollback journal works by writing a copy of the original unchanged database content into a separate rollback journal file and then writing changes directly into the database file.

A COMMIT occurs when a special record indicating a commit is appended to the WAL. Thus a COMMIT can happen without ever writing to the original database, which allows readers to continue operating from the original unaltered database while changes are simultaneously being committed into the WAL.

PostgreSQL: Write-Ahead Logging (WAL)

Using WAL results in a significantly reduced number of disk writes, because only the log file needs to be flushed to disk to guarantee that a transaction is committed, rather than every data file changed by the transaction.

The log file is written sequentially, and so the cost of syncing the log is much less than the cost of flushing the data pages. This is especially true for servers handling many small transactions touching different parts of the data store. Furthermore, when the server is processing many small concurrent transactions, one fsync of the log file may suffice to commit many transactions.

Conclusion

Command Logging:

  1. is faster
  2. has lower footprint
  3. has heavier "Replay" procedure
  4. requires frequent snapshot

Write Ahead Logging is a technique to provide atomicity. Better Command Logging performance should also improve transaction processing. Databases on 1 Foot

Confirmation

VoltDB Blog: Intro to VoltDB Command Logging

One advantage of command logging over ARIES style logging is that a transaction can be logged before execution begins instead of executing the transaction and waiting for the log data to flush to disk. Another advantage is that the IO throughput necessary for a command log is bounded by the network used to relay commands and, in the case of Gig-E, this throughput can be satisfied by cheap commodity disks.

It is important to remember VoltDB is distributed by its nature. So transactions are a little bit tricky to handle and performance impact is noticeable.

VoltDB Blog: VoltDB’s New Command Logging Feature

The command log in VoltDB consists of stored procedure invocations and their parameters. A log is created at each node, and each log is replicated because all work is replicated to multiple nodes. This results in a replicated command log that can be de-duped at replay time. Because VoltDB transactions are strongly ordered, the command log contains ordering information as well. Thus the replay can occur in the exact order the original transactions ran in, with the full transaction isolation VoltDB offers. Since the invocations themselves are often smaller than the modified data, and can be logged before they are committed, this approach has a very modest effect on performance. This means VoltDB users can achieve the same kind of stratospheric performance numbers, with additional durability assurances.

这篇关于为什么SQL数据库在命令日志中使用预写日志?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆