数据库插入性能 [英] Database insert performance

查看:95
本文介绍了数据库插入性能的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我们正计划实施一个系统,以将频繁出现的市场报价记录到数据库中,以进行进一步的分析.为了简单地了解我们可以在不同的数据库解决方案上获得什么样的存储性能,我创建了一个用于插入基本行价格信息的小应用程序.在几个不同的数据库上运行相同的代码时,我们得到了一些有趣的结果.

We are planning to implement a system for logging a high frequency of market ticks into a DB for further analysis. To simply get a little what kind of storage performance we can get on the different DB solutions I created a little application for inserting a basic row of tick information. When running the same code on a couple of different DBs we got some interesting results.

要插入的数据非常简单,如下所示:

The data being inserted is very simple like follows :

CREATE TABLE [dbo].[price](
    [product_code] [char](15) NULL,
    [market_code] [char](10) NULL,
    [currency] [nchar](6) NULL,
    [timestamp] [datetime] NULL,
    [value] [float] NULL,
    [price_type] [char](4) NULL
) ON [PRIMARY]

Microsoft SQL Server:

总测试时间:32秒.每秒3099个价格.

Total test time : 32 seconds. 3,099 prices per second.

MySQL服务器:

总测试时间:18秒.每秒5349个价格.

Total test time : 18 seconds. 5,349 prices per second.

MongoDB服务器:

总测试时间:3秒.每秒25,555个价格.

Total test time : 3 seconds. 25,555 prices per second.

该测试的目的只是为了稍微了解一下底层系统的预期原始性能".在实际实施解决方案时,我们当然会进行缓冲,批量插入等.

The purpose of this testing is simply to get a little indication of what kind of "raw performance" can be expected of the systems in the bottom. When actually implementing a solution we would of course do buffering, bulk inserts etc.

我们只关心插入的速度,因为查询稍后会离线"进行.

We only care about the speed of the inserts, as the querying is done "offline" later.

有人对适合的其他数据库有任何建议吗?我也将在今晚晚些时候尝试使用HDF5和MonetDB.必须具有多客户端访问权限.

Does anyone have any suggestions for other databases that could fit? I will be trying with HDF5 and MonetDB later tonight too. Its required to have multi client access.

谢谢您的建议!

已更新:

对不起,但是我在放置之前对问题进行了重大修改,看来我没有考虑服务器版本和硬件的一些详细信息了.所有测试均在具有12GB RAM且运行Windows 2008 x64的8核服务器上进行.

Sorry, but i did a major edit of my question before positing, and it seems i left out the server versions and some details of the hardware. All tests were on an 8 core server with 12GB RAM running Windows 2008 x64.

Microsoft SQL Server 2008企业版x64. 作为InnoDB表运行的MySQL 5.1.44. MongoDB 1.2.4 x64

Microsoft SQL Server 2008 Enterprise x64. MySQL 5.1.44 running as InnoDB table. MongoDB 1.2.4 x64

当前测试是一个简单的向数据库中插入行的循环,将来自纳斯达克的真实历史数据编译为已导入内存的CSV文件.该代码在C#NET4 x64中.

The current test is a simple loop of row inserts into the DBs with real historic data from NASDAQ compiled in a CSV file already imported to memory. The code was in C# NET4 x64.

MS SQL和MySQL服务器已调整"为完美的设置,而MongoDB仅使用默认设置进行了设置. SQL表的设置没有索引,这是因为DB的目的很简单,只是作为转移到主分析系统之前的过渡地面.

The MS SQL and MySQL servers were "tuned" to perfect settings, while the MongoDB was just set up with defaults. The SQL tables are set up with no indices, as the purpose of the DB is simple as a staging ground before being transfered into the main analysis system.

许多建议使用批量插入,但是这样做很困难,因为我们有多个客户端独立于实时流将单个滴答声推入数据库.为了允许这种方法,我们将不得不扩展数据库前面的层,超出我们现在有机会进行测试的范围.但是,我认为对于最终的体系结构将需要做一些事情,因为我们从除MongoDB之外的所有事物获得的数量不足以处理所需的输入数量.

Many suggested Bulk inserts, however its a difficult way of doing it as we have several clients pushing single ticks into the DB independently from live streams. To allow for such methods, we would have to expand the layer in front of the DB beyond what we have a chance to test for right now. However I imagine something will have to be done for the final architecture, as the numbers we are getting from everything except the MongoDB is not enough to handle the number of inputs needed.

更新2:SSD驱动器确实非常适合此用途,我们自己使用了它.但是最终产品将安装在几个不同的客户上,这些客户都提供自己的设备..仍然很难从IT部门获得带有SSD的服务器...:(

UPDATE 2: SSD drives are indeed great for just this, and we are using this ourselves. However the final product will be installed at a few different customers which all provide their own iron.. and getting servers from the IT department with SSD is still hard... :(

更新3:

我尝试了建议的BulkCopy方法.与其他循环相同的循环的性能,但首先进入数据表,然后将BulkInsert进入SQL Server,结果如下:

I tried the BulkCopy approach suggested. Performance for the same loop as the others, but first into a DataTable and then BulkInsert into the SQL Server resulted in the following :

Microsoft SQL Server(批量):

总测试时间:2秒.每秒39401价格.

Total test time : 2 seconds. 39401 prices per second.

推荐答案

我只能在sql-server上发表评论,但是可以尝试以下操作:

I can only really comment on sql-server, but there are some things to try:

  • 命令批处理(即在一次命中数据库中多次执行INSERT)
  • 批量插入(通过 SqlBulkCopy )

两者都应在单行插入上提供显着的改进(后者最快)

either should give significant improvements on single-row inserts (the latter being fastest)

这篇关于数据库插入性能的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆