Parallel Bulk Inserting with SqlBulkCopy and Azure


Question

I have an Azure app in the cloud with a SQL Azure database. I have a worker role that needs to parse and process a file (up to ~30 million rows), so I can't use BCP or SSIS directly.

I'm currently using SqlBulkCopy, but it seems too slow: I've seen load times of up to 4-5 minutes for 400k rows.

I want to run my bulk inserts in parallel. However, the articles on importing data in parallel and controlling locking behaviour say that SqlBulkCopy requires a table with no clustered index, and that a table lock (BU lock) must be specified. But SQL Azure tables must have a clustered index...

Is it even possible to use SqlBulkCopy in parallel on the same table in SQL Azure? If not, is there another API (one I can use from code) that can do this?

Recommended answer

I don't see how you can go any faster than SqlBulkCopy. On our project we can import 250K rows in about 3 minutes, so your rate seems about right.
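To make the comparison concrete, here is a small back-of-the-envelope calculation using only the figures quoted in this thread. The function names are made up for illustration, and the last line assumes the rate stays constant across a full ~30 million row file, which may not hold in practice.

```python
def rows_per_second(rows: int, seconds: float) -> float:
    """Average throughput for a load of `rows` rows that took `seconds`."""
    return rows / seconds


def hours_for(rows: int, rate: float) -> float:
    """Hours needed to load `rows` rows at a constant `rate` (rows/second)."""
    return rows / rate / 3600


# Figures from the question: 400k rows in 4-5 minutes.
question_slow = rows_per_second(400_000, 5 * 60)
question_fast = rows_per_second(400_000, 4 * 60)

# Figure from the answer: 250K rows in about 3 minutes.
answer_rate = rows_per_second(250_000, 3 * 60)

# Assumption: the same rate holds for the whole ~30 million row file.
full_file_hours = hours_for(30_000_000, answer_rate)

print(round(question_slow), round(question_fast))  # 1333 1667
print(round(answer_rate))                          # 1389
print(round(full_file_hours, 1))                   # 6.0
```

Both rates land in the same range of roughly 1,300 to 1,700 rows per second, which is why the two numbers look consistent. At that pace a 30 million row file would take on the order of 5 to 6 hours.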

I don't think doing it in parallel would help, even if it were technically possible. We only run one import at a time; otherwise SQL Azure starts timing out our requests.

In fact, sometimes it isn't even possible to run a large group-by query at the same time as the import. SQL Azure does a lot of work to ensure quality of service, which includes timing out requests that take too long, use too many resources, and so on.

So running several large bulk inserts at the same time will probably cause one of them to time out.
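The "one import at a time" approach can be sketched roughly as follows. This is not the answerer's code: it is a language-neutral illustration in Python, where `write_batch` is a hypothetical stand-in for whatever actually sends a batch to the database (in .NET, a `SqlBulkCopy.WriteToServer` call). The batch size, attempt count, and backoff values are arbitrary assumptions, and the retry is only safe if each batch write is all-or-nothing.

```python
import time
from itertools import islice
from typing import Callable, Iterable, Iterator, List, Sequence

Row = Sequence[object]


def batches(rows: Iterable[Row], size: int) -> Iterator[List[Row]]:
    """Yield consecutive lists of at most `size` rows."""
    it = iter(rows)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def load_serially(
    rows: Iterable[Row],
    write_batch: Callable[[List[Row]], None],  # hypothetical batch writer
    batch_size: int = 10_000,                  # assumed value, tune as needed
    max_attempts: int = 3,                     # assumed value
    backoff_seconds: float = 5.0,              # assumed value
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Send batches one at a time, retrying a batch that times out."""
    written = 0
    for chunk in batches(rows, batch_size):
        for attempt in range(1, max_attempts + 1):
            try:
                write_batch(chunk)
                break
            except TimeoutError:
                if attempt == max_attempts:
                    raise
                # Back off a little longer after each failed attempt.
                sleep(backoff_seconds * attempt)
        written += len(chunk)
    return written


# Usage with a fake writer that times out once, on its second call.
calls: List[int] = []


def fake_write(chunk: List[Row]) -> None:
    calls.append(len(chunk))
    if len(calls) == 2:
        raise TimeoutError("simulated throttling")


total = load_serially(
    ([i] for i in range(25)), fake_write, batch_size=10, sleep=lambda _s: None
)
```

Because only one batch is in flight at any moment, a timeout affects a single bounded unit of work that can be retried, rather than several large concurrent loads competing for the same resources. In .NET, `SqlBulkCopy` exposes `BatchSize` and `BulkCopyTimeout` properties that control similar knobs.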
