为什么 SQLite 在特定机器上如此缓慢(~2 q/s)? [英] Why is SQLite so slow (~2 q/s) on a specific machine?

查看:60
本文介绍了为什么 SQLite 在特定机器上如此缓慢(~2 q/s)?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

在我的其中一台服务器(i7 Ivy Core、32 GB RAM、Debian 6 @ 64 位、PHP 5.4.10)上,我遇到了 SQLite 插入速度极慢的情况.以下测试程序仅报告 2.2 次插入/秒(插入 30 行需要 14 秒).

On one of my servers (i7 Ivy Core, 32 GB RAM, Debian 6 @ 64bit, PHP 5.4.10) I experience extremely slow inserts with SQLite. The following test program reports just 2.2 inserts/second (14 seconds for inserting 30 rows).

unlink("test.db");

$db = new PDO('sqlite:test.db');

$db->exec("CREATE TABLE test (dummy INT)");

$count = 30;

$t = microtime(true);
for ($i=0; $i<$count; $i++) {
  $db->exec("INSERT INTO test VALUES ($i)")
   or die("SQLite error: ".$db->errorInfo()[2]."\n");
}
$elapsed = microtime(true)-$t;
echo sprintf("%d inserts in %.3f secs (%.1f q/s)\n", 
  $count, $elapsed, $count/$elapsed);

输出:

$ time php test.php
30 inserts in 13.911 secs (2.2 q/s)

real    0m14.634s
user    0m0.004s
sys     0m0.016s

我知道我可以通过在插入语句周围包裹 BEGIN/END 来加速这个过程(这给了我 200000 q/s),但即使没有事务,这也应该是快得多.在其他(较旧的)机器(相同的 PHP 版本)上,我在没有显式事务的情况下达到了 300 多个查询/秒.

I know that I can accelerate this by wrapping BEGIN/END around the insert statements (which gives me 200000 q/s) but even without a transaction this should be much faster. On other (older) machines (same PHP version) I reach 300+ queries/sec without explicit transactions.

这可能是什么原因?我是否必须调整 Sqlite 或 O/S?

What could be the cause for this? Do I have to tune Sqlite or the O/S?

推荐答案

我在 Linux 64 位机器上使用 strace -C -tt 做了一个类似的测试,以了解 SQLite3 的发展方向时间.

I have done a similar test on a Linux 64bit machine using strace -C -tt to have an idea of where SQLite3 is taking time.

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 99.03    0.004000          32       124           fsync
  0.64    0.000026           0       222           mprotect
  0.32    0.000013           0       216           munmap

明显的延迟出现在fsync函数中,即:

The obvious delay is in the fsync function, which is:

  • 可配置
  • 依赖于一般磁盘I/O(查看iotopiostat)
  • 严重依赖于 IOSS(因此,文件系统和磁盘分配 - 您可能会在 ext3 上获得一个值,在 xfs 上获得一个不同的值,在 btrfs 上获得第三个值)
  • 当然,间接取决于底层硬件及其怪癖或调整.

通过关闭同步,我的 SQLite3 性能提高了大约三千倍:

By turning syncing off, my SQLite3 performance increases by a factor of around three thousand:

$db = new PDO('sqlite:test.db');

$db->exec('pragma synchronous = off;');

我在两台非常相似的机器上也有两个不同的值(一个是 ext4,另一个是 XFS,但我不认为这是主要原因 - 它们的负载配置文件也不同).

I too have two different values on two very similar machines (one has ext4, the other XFS, but I'm not positive this is the main reason - their load profiles are also different).

顺便说一句,使用准备好的语句在最快的水平上几乎可以使执行速度翻倍(从 45k 到 110k INSERT,以 3000 为批次,因为在这个速度下 30 个 INSERT 必然会给出虚假的计时),并将最低速度从大约 6 提高到大约 150.

By the way, using prepared statements just about doubles the execution speed at the fastest level (from 45k to 110k INSERTs, in batches of 3000 since at that speed 30 INSERTs are bound to give spurious timings), and raises the lowest speed from about 6 to about 150.

所以这(使用准备好的语句)可能是一个很好的解决方案,可以在不涉及文件同步的情况下改进重复操作,即,同时仍然明显地确定数据风险级别保持不变.之后,我会根据数据中断的风险和价值尝试事务或 fsync(甚至可能是内存日志).

So this (using prepared statements) might be a good solution to improve repeated operations without touching file synchronization, i.e., while still being demonstrably sure that data risk level remains the same. After that I'd try transactions or fsync (maybe even memory journaling) depending on the risk and worth of a data outage.

从头开始设计系统时,对不同的 FS 进行一些测试肯定是可取的.

When designing a system from the ground up, some tests on different FS's are surely advisable.

ext4 (acl,user_xattr,data=order)         5.5 queries/s
using transactions                       170 queries/s
disabling fsync                        16000 queries/s
using transactions and disabling fsync 47200 queries/s

临时文件系统上,fsync 很便宜,所以关闭它几乎没有什么好处.大部分时间都花在守卫上,所以交易是关键.

On a temporary file system, fsync is cheap, so turning it off yields little benefit. Most of the time is spent guarding, so transactions are key.

tmpfs                                  13700 queries/s
disabling fsync                        15350 queries/s
enabling transactions                  47900 queries/s
using transactions and disabling fsync 48200 queries/s

当然,必须考虑适当的数据组织和索引编制,对于大型数据集,可能会变得更加重要.

Of course, proper data organization and indexing has to be taken into account and, for large data sets, might well turn out to be more important.

UPDATE:为了提高性能,还可以使用 pragma journal_mode=MEMORY;

UPDATE: to squeeze some more performance, one can also put the SQLite journal into memory with pragma journal_mode=MEMORY;

此外,您可以告诉 ext3/4 不要打扰更新 SQLite 数据库上的时间(但这在很大程度上取决于实现).可以尝试在数据库所在的文件系统中加入noatime,如果可以,可以放入/etc/fstab(也可以使用relatime 而不是更极端的 noatime:

Also, you can tell ext3/4 not to bother updating atime on SQLite databases (this hugely depends on the implementation, though). You can try adding noatime to the file system where the database resides, and if it works, you can put it into /etc/fstab (you can also use relatime instead of the more extreme noatime:

sudo mount /var -o remount,noatime

这篇关于为什么 SQLite 在特定机器上如此缓慢(~2 q/s)?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆