Data inserts take longer the more iterations are executed


Question

I have an empty database which will contain a migrated form of an old legacy database.

I read all of the old data into DataTables, which works fine.

There is one master table which contains links to almost every other table, so the migration loops through it. For every record that needs to go into the master table there are about 7 groups of tables; each group contains only tables that rely on each other to work. For example, the Orders table is in the same group as the OrderLine table, as one relies on the other.

As each of these 7 groups can be processed without any information from another group, I start the migration process with a different thread for each group.

Each method simply runs through the relevant records from the legacy data table and sanitises them and inserts them into the new database.

I have a data access class that keeps a SqlCeConnection object open for the lifetime of the class.

Every insert and read operation hits these two methods:

/// <summary>
/// Executes a single INSERT, UPDATE, DELETE or other Sql Command that modifies the schema or data of the database
/// </summary>
/// <param name="sql">The command to execute</param>
/// <param name="parameters">Any parameters in the command</param>
public void ExecuteCommand(string sql, SqlServerCeParameter[] parameters)
{
    //print debug statements if necessary
    if (_outputSqlStatementsToFile == true) PrintSqlDebuggingInformation(sql, parameters);

    //create the command that will execute the Sql
    using (var command = new SqlCeCommand(sql, _connection))
    {
        //add any parameters
        if (parameters != null) command.Parameters.AddRange(parameters.Select(p => p.ParameterBehind).ToArray());

        //open the connection 
        if (_connection.State == ConnectionState.Closed)
        {
            _connection.Open();
        }

        //execute the command
        command.ExecuteNonQuery();

    }
}

/// <summary>
/// Executes a query that returns a single value, for example a COUNT(*) query
/// </summary>
/// <typeparam name="T">The type of the value returned by the query, for example COUNT(*) would be an Integer</typeparam>
/// <param name="sql">The query to execute</param>
/// <param name="parameters">Any parameters in the query</param>
/// <returns>A single value cast to type T</returns>
public T ExecuteQuery<T>(string sql, SqlServerCeParameter[] parameters)
{
    //print debug statements if necessary
    if (_outputSqlStatementsToFile == true) PrintSqlDebuggingInformation(sql, parameters);

    //the result
    T result;

    //create the command that will execute the Sql
    using (var command = new SqlCeCommand(sql, _connection))
    {
        //add any parameters
        if (parameters != null) command.Parameters.AddRange(parameters.Select(p => p.ParameterBehind).ToArray());

        //open the connection 
        if (_connection.State == ConnectionState.Closed)
        {
            _connection.Open();
        }

        //execute the command
        var sqlResult = command.ExecuteScalar();

        //cast the result to the type given to the method
        result = (T)sqlResult;
    }
    //return the result
    return result;
}

Each time an iteration completes, one entire master record and everything associated with it has been fully migrated.

I have a stopwatch covering the entire body of the iteration, so I can measure the average number of milliseconds each iteration takes.
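A minimal sketch of this kind of timing loop follows. The class name `IterationTimer`, the method `AveragePerBucket`, and the `iterate` callback are hypothetical stand-ins for the real per-record migration code, which is not shown in the question. Reporting an average per bucket of iterations, rather than one overall average, makes a steady upward trend easier to tell apart from record-to-record noise.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

public static class IterationTimer
{
    // Runs `iterate` once per index and returns the average milliseconds
    // per iteration for each consecutive bucket of `bucketSize` iterations.
    // The final bucket may be shorter than `bucketSize`.
    public static List<double> AveragePerBucket(int iterations, int bucketSize, Action<int> iterate)
    {
        var averages = new List<double>();
        var stopwatch = new Stopwatch();
        int inBucket = 0;

        for (int i = 0; i < iterations; i++)
        {
            stopwatch.Start();
            iterate(i);              // hypothetical: migrate one master record
            stopwatch.Stop();
            inBucket++;

            // Close the bucket when it is full or when the loop is ending.
            if (inBucket == bucketSize || i == iterations - 1)
            {
                averages.Add(stopwatch.Elapsed.TotalMilliseconds / inBucket);
                stopwatch.Reset();
                inBucket = 0;
            }
        }

        return averages;
    }
}
```

Calling `IterationTimer.AveragePerBucket(32000, 1000, MigrateRecord)` (with `MigrateRecord` as an assumed method name) would yield 32 averages; a sequence that climbs from bucket to bucket points to a cumulative cause rather than a few slow records.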

At the beginning of the 32000+ rows, each iteration takes in the region of 180 to 220 milliseconds, but as time goes on this figure steadily increases until it is well over 2 seconds per iteration.

Each record is slightly different, with some by nature taking longer to complete, but I am pretty sure that there should not be this constant increase. I expected it to fluctuate widely in the early part of the migration, then settle down to a relatively consistent figure.

I am wondering if it is something to do with the SQLServerCe connection, perhaps the more you use it without closing it the slower it gets?

  1. C#
  2. Visual Studio 2012
  3. SqlServerCe 4.0

Recommended answer

You should consider taking a look at the clustered index on the target table. It should be small (ideally an integer), ascending, and unique. If you are using a business key or a GUID for your clustered index, then you run the risk of page splits, which would have the effect of slowing the load over time.

You may also consider dropping any foreign key constraints or indexes, then re-adding them upon completion.
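As a rough sketch of the drop-and-re-add idea, the helper below runs a set of statements before the load and another set afterwards. Every table, column, index, and constraint name here (`Orders`, `OrderLine`, `OrderId`, `IX_OrderLine_OrderId`, `FK_OrderLine_Orders`) is a hypothetical placeholder, not the asker's actual schema, and the class `IndexToggle` is invented for illustration. It is written against `IDbConnection` so that it works with any ADO.NET connection, and it assumes the connection is already open.

```csharp
using System;
using System.Data;

public static class IndexToggle
{
    // Hypothetical schema objects; replace with the real names.
    public static readonly string[] BeforeLoad =
    {
        "ALTER TABLE OrderLine DROP CONSTRAINT FK_OrderLine_Orders",
        "DROP INDEX OrderLine.IX_OrderLine_OrderId"
    };

    public static readonly string[] AfterLoad =
    {
        "CREATE INDEX IX_OrderLine_OrderId ON OrderLine (OrderId)",
        "ALTER TABLE OrderLine ADD CONSTRAINT FK_OrderLine_Orders " +
        "FOREIGN KEY (OrderId) REFERENCES Orders (OrderId)"
    };

    // Drops the constraint and index, runs the load, then restores them.
    // The restore runs even if the load throws.
    public static void RunWithoutIndexes(IDbConnection connection, Action bulkLoad)
    {
        ExecuteAll(connection, BeforeLoad);
        try
        {
            bulkLoad();
        }
        finally
        {
            ExecuteAll(connection, AfterLoad);
        }
    }

    private static void ExecuteAll(IDbConnection connection, string[] statements)
    {
        foreach (var sql in statements)
        {
            using (var command = connection.CreateCommand())
            {
                command.CommandText = sql;
                command.ExecuteNonQuery();
            }
        }
    }
}
```

Re-adding the foreign key after the load will fail if the migrated data violates it, which doubles as a consistency check on the migration.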

This seems to have something to do with indexes. An easy test to determine this is to truncate the tables every 10K iterations or so. If you no longer see the slowdown, then it is likely due to the IO of inserting individual records.
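The test described above could be sketched as follows. The class `TruncateProbe`, its methods, and the `migrateOne` callback are hypothetical names introduced for illustration. The sketch empties tables with `DELETE FROM` rather than `TRUNCATE TABLE`, on the assumption that SQL Server Compact does not accept `TRUNCATE TABLE`; that assumption should be checked against the product documentation. Table names are concatenated into the SQL text, so they must come from trusted code, never from user input.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public static class TruncateProbe
{
    // True when the tables should be emptied after `completed` iterations.
    public static bool IsResetPoint(int completed, int resetEvery)
    {
        return resetEvery > 0 && completed > 0 && completed % resetEvery == 0;
    }

    // Empties each table. Child tables must be listed before their parents
    // so that foreign key constraints are not violated.
    public static void ClearTables(IDbConnection connection, IEnumerable<string> tablesChildFirst)
    {
        foreach (var table in tablesChildFirst)
        {
            using (var command = connection.CreateCommand())
            {
                command.CommandText = "DELETE FROM " + table;
                command.ExecuteNonQuery();
            }
        }
    }

    // Runs the migration, emptying the target tables every `resetEvery`
    // iterations. This is a diagnostic run only: the migrated data is discarded.
    public static void Run(IDbConnection connection, int iterations, int resetEvery,
                           IEnumerable<string> tablesChildFirst, Action<int> migrateOne)
    {
        for (int i = 0; i < iterations; i++)
        {
            migrateOne(i);  // hypothetical: migrate one master record
            if (IsResetPoint(i + 1, resetEvery))
            {
                ClearTables(connection, tablesChildFirst);
            }
        }
    }
}
```

If per-iteration times drop back to their starting level after each reset, the slowdown tracks table size rather than connection age.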
