Table Valued Parameter: sending data in small chunks
Question
I am reading from a CSV file and sending the data as a table variable to a stored procedure. From what I have tested so far, I am able to process 300k records in 3 minutes 30 seconds. The file may eventually contain millions of records. Is it a good idea to send all of these records to the stored procedure in one go, or should I send them in batches of, say, 500k? I have set the command timeout to 1800.
Recommended answer
An example of using IEnumerable<SqlDataRecord>:
Notice that I sort, and that the sort is by the clustered index. Index fragmentation will absolutely kill load speed. The first implementation used INSERT ... VALUES (unsorted), and over a 12-hour run this version is literally 100x faster. I also disable indexes other than the PK and reindex at the end of the load. Over a long run I am getting about 500 rows/second. Your sample is about 1,400 rows/second, so that is great. If you start to see degradation, these are the things to look at.
using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.SqlServer.Server; // SqlDataRecord, SqlMetaData

// DocFTSinX is the row class (Word1, Word2, Delta) defined elsewhere.
public class DocFTSinXsCollection : List<DocFTSinX>, IEnumerable<SqlDataRecord>
{
    // used by TVP for fast insert
    private int sID;
    private IEnumerable<DocFTSinX> docFTSinXs;

    IEnumerator<SqlDataRecord> IEnumerable<SqlDataRecord>.GetEnumerator()
    {
        //todo fix the order in 3 to sID, wordID1, wordID2
        // Column order and types must match the table type on the server.
        var sdr = new SqlDataRecord(
            new SqlMetaData("wordID1", System.Data.SqlDbType.Int),
            new SqlMetaData("wordID2", System.Data.SqlDbType.Int),
            new SqlMetaData("sID", System.Data.SqlDbType.Int),
            new SqlMetaData("Delta", System.Data.SqlDbType.Int));

        // Sort in clustered index order; the same record is refilled for each row.
        foreach (DocFTSinX oh in docFTSinXs.OrderBy(x => x.Word1).ThenBy(x => x.Word2))
        {
            sdr.SetInt32(0, oh.Word1);
            sdr.SetInt32(1, oh.Word2);
            sdr.SetInt32(2, sID);
            sdr.SetInt32(3, (Int32)oh.Delta);
            yield return sdr;
        }
    }

    public DocFTSinXsCollection(int SID, IEnumerable<DocFTSinX> DocFTSinXs)
    {
        sID = SID;
        docFTSinXs = DocFTSinXs;
        //Debug.WriteLine("DocFTSinXsCollection DocFTSinXs " + DocFTSinXs.Count().ToString());
    }
}
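The answer does not show how the collection is handed to the stored procedure. A minimal sketch of that step follows, assuming the System.Data.SqlClient provider. The procedure name dbo.InsertDocFTSinX, the parameter name @rows, and the table type dbo.DocFTSinXType are hypothetical placeholders, not names from the original post.

```csharp
using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient;
using Microsoft.SqlServer.Server;

public static class TvpSketch
{
    // Hypothetical names: dbo.InsertDocFTSinX (procedure), @rows (parameter),
    // dbo.DocFTSinXType (user-defined table type). Replace with your own.
    public static SqlCommand BuildTvpCommand(SqlConnection connection, IEnumerable<SqlDataRecord> rows)
    {
        var command = new SqlCommand("dbo.InsertDocFTSinX", connection)
        {
            CommandType = CommandType.StoredProcedure,
            CommandTimeout = 1800 // seconds, the value mentioned in the question
        };

        // A table-valued parameter is passed as SqlDbType.Structured.
        SqlParameter parameter = command.Parameters.Add("@rows", SqlDbType.Structured);
        parameter.TypeName = "dbo.DocFTSinXType";
        parameter.Value = rows; // rows are enumerated when the command executes
        return command;
    }
}
```

With an open connection, the caller would pass a DocFTSinXsCollection as rows and then call ExecuteNonQuery() on the returned command.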
Other tools to consider are the SqlBulkCopy .NET class and Drapper.
The OP asked how to send the data in batches.
while (true)
{
    // if there are no more rows, break;
    // fill a list or DataTable with the next 100000 rows
    // send the list or DataTable to the db
}
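The loop above can be sketched as a small helper that cuts any row sequence into fixed-size lists. The names BatchSketch and InBatches are illustrative, not from the original answer, and the actual database call is left to the caller.

```csharp
using System;
using System.Collections.Generic;

public static class BatchSketch
{
    // Yields lists of at most batchSize items, reading the source lazily,
    // so only one batch is held in memory at a time.
    public static IEnumerable<List<T>> InBatches<T>(IEnumerable<T> source, int batchSize)
    {
        if (batchSize <= 0) throw new ArgumentOutOfRangeException(nameof(batchSize));

        var batch = new List<T>(batchSize);
        foreach (T item in source)
        {
            batch.Add(item);
            if (batch.Count == batchSize)
            {
                yield return batch;
                batch = new List<T>(batchSize); // start a fresh list for the next batch
            }
        }

        if (batch.Count > 0) yield return batch; // final partial batch
    }
}
```

For a CSV file, the source could be File.ReadLines(path) with a batch size of 100000. Each batch would then be parsed and sent as one table-valued parameter call.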