connection.commit() 性能影响 [英] connection.commit() performance impact

查看:28
本文介绍了connection.commit() 性能影响的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

在将巨大的日志文件解析到 sqlite 数据库中时,如果我们在每次插入后不调用 connection.commit() 或者这没有任何区别,是否会显着提高性能.我想问题是它是否只是为了分离事务,或者它是否做任何需要时间的事情.

When parsing huge log files into a sqlite database is there any major performance improve if we don't call connection.commit() after each insert or does that not make any difference. I guess the question is if it's only for separating transactions or if it does anything that needs time.

推荐答案

是的,提交越频繁,性能就会直接下降.

Yes, performance decreases directly with more frequent commits.

这是一个使用大约 1400 行的lorem ipsum"文本文件的玩具示例:

Here's a toy example working with a "lorem ipsum" text file of about 1400 lines:

import argparse
import sqlite3
import textwrap
import time

parser = argparse.ArgumentParser()
parser.add_argument("n", help="num lines per commit", type=int)
arg = parser.parse_args()

con = sqlite3.connect('lorem.db')
cur = con.cursor()
cur.execute('drop table if exists Lorem')
cur.execute('create table Lorem (lorem STRING)')
con.commit()

with open('lorem.txt') as f: lorem=textwrap.wrap(f.read())
print('{} lines'.format(len(lorem)))

start = time.time()
for i, line in enumerate(lorem):
    cur.execute('INSERT INTO Lorem(lorem) VALUES(?)', (line,))
    if i % arg.n == 0: con.commit()
stend = time.time()

print('{} lines/commit: {:.2f}'.format(arg.n, stend-start))

保存为 sq.py...:

$ for i in `seq 1 10`; do python sq.py $i; done
1413 lines
1 lines/commit: 1.01
1413 lines
2 lines/commit: 0.53
1413 lines
3 lines/commit: 0.35
1413 lines
4 lines/commit: 0.27
1413 lines
5 lines/commit: 0.21
1413 lines
6 lines/commit: 0.19
1413 lines
7 lines/commit: 0.17
1413 lines
8 lines/commit: 0.14
1413 lines
9 lines/commit: 0.13
1413 lines
10 lines/commit: 0.11

因此,将 commit 的时间减半几乎使操作所用的时间减半,依此类推——它不是完全线性的,但几乎是线性的.

So, halving the commits nearly halves elapsed time for the operation, and so on -- it's not quite linear but almost so.

为了完整性:每次提交 100 行将运行时间减少到 0.02.

For completeness: 100 lines per commit reduce runtime to 0.02.

在此基础上,您可以轻松地进行实验和测量时间,使数据库表更接近您的实际需要.

Riffing on this, you can easily experiment, and measure times, with DB tables closer to what you actually require.

这篇关于connection.commit() 性能影响的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆