Best practices for using SQLite3 + Node.js

Question

I've got a modest Node.js script that pulls down data from Wikipedia via the API and stores it in a SQLite database. I'm using the node-sqlite3 module.

In some cases, I'm pulling down data on upward of 600,000 articles and storing some metadata about each one in a row in the db. The articles are retrieved in groups of 500 from the API.

The request that retrieves the JSON object with the data on the 500 articles passes the object to this callback:

// (db already instantiated as 'new sqlite.Database("wikipedia.sqlite");')

function callback(articles) {
    articles.forEach(function (article) {
        db.run(
            "INSERT OR IGNORE INTO articles (name, id, created) VALUES (?,?,?)", 
            [
                article["title"], 
                article["pageid"], 
                article["timestamp"]
            ]
        );
    });
}

The module operates in parallel by default, but the documentation for node-sqlite3 includes one example of serial operations, like so:

db.serialize(function () {
    db.run("CREATE TABLE lorem (info TEXT)");

    var stmt = db.prepare("INSERT INTO lorem VALUES (?)");
    for (var i = 0; i < 10; i++) {
        stmt.run("Ipsum " + i);
    }
    stmt.finalize();
});

I tried to imitate this and saw almost no performance difference. Am I doing it wrong? Right now, the data comes back from the API much faster than it writes to the DB, though it's not intolerably slow. But pummeling the DB with 600K individual INSERT commands feels clumsy.

UPDATE: Per the accepted answer, this appears to work for node-sqlite3, in lieu of a native solution. (See this Issue).

db.run("BEGIN TRANSACTION");
function callback(articles) {
    articles.forEach(function (article) {
        db.run(
            "INSERT OR IGNORE INTO articles (name, id, created) VALUES (?,?,?)",
            [
                article["title"],
                article["pageid"],
                article["timestamp"]
            ]
        );
    });
}
db.run("END");

Answer

When you are doing several insertions into a SQLite database, you need to wrap the collection of insertions into a transaction. Otherwise, each INSERT runs as its own implicit transaction, and SQLite waits for that record to be safely flushed to disk before the next insert can proceed.

At 7200 RPM, a full rotation of the disk platter takes about 1/120th of a second (roughly 8 ms), which is an eternity in computer time.
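
Putting the answer together with the question's schema, one way this might look is to wrap each 500-article batch in its own transaction and reuse a single prepared statement. This is only a rough sketch, not code from the question or the node-sqlite3 docs; the insertBatch name and its done callback are illustrative.

var sqlite3 = require("sqlite3");
var db = new sqlite3.Database("wikipedia.sqlite");

// Insert one batch of up to 500 articles inside a single transaction.
// 'done' is invoked once the COMMIT has finished.
function insertBatch(articles, done) {
    db.serialize(function () {
        db.run("BEGIN TRANSACTION");
        var stmt = db.prepare(
            "INSERT OR IGNORE INTO articles (name, id, created) VALUES (?,?,?)"
        );
        articles.forEach(function (article) {
            stmt.run([
                article["title"],
                article["pageid"],
                article["timestamp"]
            ]);
        });
        stmt.finalize();
        // Queued after all the inserts above, so the whole batch is
        // committed as one write instead of 500 separate ones.
        db.run("COMMIT", done);
    });
}

Because db.serialize() queues these statements to run one after another, the COMMIT only executes after every insert in the batch has completed, which is the transaction wrapping the answer describes.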
