什么是快?结构数组或数据表 [英] What's faster? Struct array or DataTable

查看:106
本文介绍了什么是快?结构数组或数据表的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我使用LinqToSQL处理来自SQL Server的数据转存到iSeries服务器进行进一步的处理。 上这里更多细节。

I am using LinqToSQL to process data from SQL Server to dump it into an iSeries server for further processing. More details on that here.

我的问题是,它是采取约1.25分钟,处理那些350行数据。我仍然在试图破译来自SQL Server事件探查的结果,但也有疑问正在运行的一吨。在这里,更多的是一些细节上我在做什么:

My problem is that it is taking about 1.25 minutes to process those 350 rows of data. I am still trying to decipher the results from the SQL Server Profiler, but there are a TON of queries being run. Here is a bit more detail on what I am doing:

using (CarteGraphDataDataContext db = new CarteGraphDataDataContext())
{
    var vehicles = from a in db.EquipmentMainGenerals
                   join b in db.EquipmentMainConditions on a.wdEquipmentMainGeneralOID equals b.wdEquipmentMainGeneralOID
                   where b.Retired == null
                   orderby a.VehicleId
                   select a;

    et = new EquipmentTable[vehicles.Count()];

    foreach (var vehicle in vehicles)
    {
       // Move data to the array

       // Rates
       GetVehcileRates(vehicle.wdEquipmentMainGeneralOID);

       // Build the costs accumulators
       GetPartsAndOilCosts(vehicle.VehicleId);
       GetAccidentAndOutRepairCosts(vehicle.wdEquipmentMainGeneralOID);

       // Last Month's Accumulators
       et[i].lastMonthActualGasOil = GetFuel(vehicle.wdEquipmentMainGeneralOID) + Convert.ToDecimal(oilCost);
       et[i].lastMonthActualParts = Convert.ToDecimal(partsCost);
       et[i].lastMonthActualLabor = GetLabor(vehicle.VehicleId);
       et[i].lastMonthActualOutRepairs = Convert.ToDecimal(outRepairCosts);
       et[i].lastMonthActualAccidentCosts = Convert.ToDecimal(accidentCosts);

       // Move more data to the array

       i++;
   }
}



get方法都类似于:

The Get methods all look similar to:

private void GetPartsAndOilCosts(string vehicleKey)
{
   oilCost = 0;
   partsCost = 0;

   using (CarteGraphDataDataContext db = new CarteGraphDataDataContext())
   {
      try
      {
         var costs = from a in db.WorkOrders
                     join b in db.MaterialLogs on a.WorkOrderId equals b.WorkOrder
                     join c in db.Materials on b.wdMaterialMainGeneralOID equals c.wdMaterialMainGeneralOID
                     where (monthBeginDate.Date <= a.WOClosedDate && a.WOClosedDate <= monthEndDate.Date) && a.EquipmentID == vehicleKey
                     group b by c.Fuel into d
                     select new
                            {
                                isFuel = d.Key,
                                totalCost = d.Sum(b => b.Cost)
                            };

          foreach (var cost in costs)
          {
             if (cost.isFuel == 1)
             {
                oilCost = (double)cost.totalCost * (1 + OVERHEAD_RATE);
             }
             else
             {
                partsCost = (double)cost.totalCost * (1 + OVERHEAD_RATE);
             }
          }
       }
       catch (InvalidOperationException e)
       {
          oilCost = 0;
          partsCost = 0;
       }
    }

    return;
 }



在这里我的想法是查询数量削减到数据库应加快的处理。如果LINQ做一个SELECT对每条记录,也许我需要每个记录第一次加载到内存中。

My thinking here is cutting down the number of queries to the DB should speed up the processing. If LINQ does a SELECT for every record, maybe I need to load every record into memory first.

我仍然认为自己与C#和面向对象的一般(我在iSeries大多RPG编程)初学者。所以我猜我做一些愚蠢的事。你能帮我解决我的愚蠢(至少在这个问题)

I still consider myself a beginner with C# and OOP in general (I do mostly RPG programming on the iSeries). So I am guessing I am doing something stupid. Can you help me fix my stupidity (at least with this problem)?

更新:想我会回来,并更新你我已经发现。它看起来像数据库设计不当。无论LINQ的背景是产生这是非常低效的代码。我不是说的LINQ是坏,它只是坏该数据库。我转换为迅速扔在一起.XSD设置和处理时间从1.25分钟过去了15秒。有一次,我做了正确的重新设计,我只能猜测,我会刮胡子的几秒钟内关闭这一点。谢谢大家对您的意见。我会尝试再次LINQ改天更好的数据库。

Update: Thought I would come back and update you on what I have discovered. It appears like the database was poorly designed. Whatever LINQ was generating in the background it was highly inefficient code. I am not saying the LINQ is bad, it just was bad for this database. I converted to a quickly thrown together .XSD setup and the processing time went from 1.25 minutes to 15 seconds. Once I do a proper redesign, I can only guess I'll shave a few more seconds off of that. Thank you all for you comments. I'll try LINQ again some other day on a better database.

推荐答案

有我在你的代码中发现的几件事情

There are a few things that I spot in your code:


  1. 您为每个项目数据库的多次查询在变种车的查询,您可能要重写查询,以便少需要数据库查询。

  2. 当你不需要查询实体的所有属性,还是需要一个实体的子实体,它的性能更好地使用匿名类型的选择。 LINQ to SQL的将分析这一点,从数据库中检索的数据量。这样的选择可能是这样的:选择新的{a.VehicleId,a.Name}

  3. 在查询中的 GetPartsAndOilCosts 可以通过将计算 cost.totalCost *(1 + OVERHEAD_RATE)中的LINQ查询。这样,查询可以在数据库完全执行,应该让它更快。

  4. 您正在做一个计数() VAR车查询,但你只能用它来确定数组的大小。虽然LINQ to SQL中会做出非常有效的 SELECT COUNT(*)它查询,它需要一个额外的往返到数据库。除此之外(取决于您的隔离级别)开始迭代查询可以增加一个项目的时间。在这种情况下,您的数组太小和 ArrayIndexOutOfBoundsException异常将被抛出。你可以简单地使用 .ToArray()上查询或创建一个列表< EquipmentTable> 并调用 .ToArray()上。这通常是足够快,尤其是当你只有在此集合仅380项目,它当然会比有一个额外的往返到数据库中(计数)更快。

  5. 正如你可能已经想到,数据库查询量是实际的问题。结构数组或数据表之间的切换将不执行什么不同。使用

  6. 您优化掉尽可能多的查询,你可以,开始分析离开了查询(使用SQL事件探查器)后和优化这些查询索引优化向导。它会提出一些新的指标对你来说,这可大大加快速度。

  1. You query the database multiple times for each item in the 'var vehicles' query, you might want to rewrite that query so that less database queries are needed.
  2. When you don't need all the properties of the queried entity, or need sub entities of that entity, it's better for performance to use an anonymous type in your select. LINQ to SQL will analyze this and retrieve less data from your database. Such a select might look like this: select new { a.VehicleId, a.Name }
  3. The query in the GetPartsAndOilCosts can be optimized by putting the calculation cost.totalCost * (1 + OVERHEAD_RATE) in the LINQ query. This way the query can be executed in the database completely, which should make it much faster.
  4. You are doing a Count() on the var vehicles query, but you only use it for determining the size of the array. While LINQ to SQL will make a very efficient SELECT count(*) query of it, it takes an extra round trip to the database. Besides that (depending on your isolation level) the time you start iterating the query an item could be added. In that case your array is too small and an ArrayIndexOutOfBoundsException will be thrown. You can simply use .ToArray() on the query or create a List<EquipmentTable> and call .ToArray() on that. This will normally be fast enough especially when you only have only 380 items in this collection and it will certainly be faster than having an extra roundtrip to the database (the count).
  5. As you probably already expect, the amount of database queries are the actual problem. Switching between struct array or DataTable will not perform much different.
  6. After you optimized away as much queries that you could, start analyzing the queries left (using SQL profiler) and optimize these queries using the Index tuning wizard. It will propose some new indexes for you, that could speed things up considerably.

对于点#1一点点额外的解释。你在做什么这里是一个有点像这样:


A little extra explanation for point #1. What you're doing here is a bit like this:

var query = from x in A select something;

foreach (var row in query)
{
    var query2 = from y in data where y.Value = row.Value select something;

    foreach (var row2 in query2)
    {
        // do some computation.
    }
}



你应该尽量做到的是去掉 QUERY2 子查询,因为它是在顶部查询的每一行执行。所以,你可能最终的东西是这样的:

What you should try to accomplish is to remove the query2 subquery, because it is executing on each row of the top query. So you could end up with something like this:

var query =
    from x in A
    from y in B
    where x.Value == y.Value
    select something;

foreach (var row in query)
{
}

当然,这个例子是简单化,在现实生活中它会变得相当复杂(如你已经注意到)。你的情况也是因为你有多个那些'子查询'的。它可以带你一段时间才能得到这个权利,尤其是你缺乏LINQ的知识,SQL的(如你所说你自己)。

Of course this example is simplistic and in real life it gets get pretty complicated (as you’ve already noticed). In your case also because you've got multiple of those 'sub queries'. It can take you some time to get this right, especially with your lack of knowledge of LINQ to SQL (as you said yourself).

如果您不能图出来,你可以随时要求再次在这里#1,但是请记住剥夺你的问题,以尽可能小的东西,因为它不好玩了别人的烂摊子阅读(我们没有得到支付此):-)
祝你好运。

If you can't figure it out, you can always ask again here at Stackoverflow, but please remember to strip your problem to the smallest possible thing, because it's no fun to read over someone's mess (we're not getting paid for this) :-) Good luck.

这篇关于什么是快?结构数组或数据表的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆