How do I synchronise tasks transforming string[][] to DataRow


Question

Hi community,

I am new to parallel coding and have some trouble understanding what I have read about it.

Let's assume the following scenario:

I have 5 tables in a SQL Server database, and I receive data from source X which needs to be filtered, sorted and validated. The resulting string[][] needs to be transformed to DataRow[] and uploaded to the SQL Server tables.


My sequential solution works, but to be honest - I have an eight-core processor...

To transform my problem I thought I might use the TPL with a for loop (iterating 0 to 4 = 1 task per table). Based on the iteration index I would perform a specific LINQ query on my array and then simply take a new DataRow with the respective table schema and populate its fields. I have set the tables' ID columns to increment automatically. My sequential solution therefore does not provide a value for the ID column of my DataRow (it will be done by SQL Server anyway).

Problem:
The string[][] exists outside of the TPL for loop and therefore needs synchronisation - is that correct?

Is it further correct that all variables that I create within the TPL for loop are thread-safe, and that I therefore do not need to synchronise them? Meaning, creating the DataRow within the for loop should be fine in terms of exceptions?

Last question regarding the TPL for loop: will it automatically wait for all tasks before the main thread continues, or do I have to call Task.WaitAll()? In that case, wouldn't it be better to create individual tasks, add them to an array and do Task.WaitAll(arrayOfTasks)?

I am using ADO.NET DataTables, so I want to wait for all the updates/changes to the local tables and then simply update the entire database at once.


I am happy with pseudo-solutions, but I am more interested in understanding the concept correctly. Am I approaching this problem correctly, or should I create a normal for loop and create Tasks within it?

As always - thanks for your help and time.

-DK

Recommended answer

You have a string[][]:
string[][] myArray = new string[][] {};



You can either use one method for all tables, or define a separate method for each table.

Using one method for all tables, you define a list of tables:

List<string> tables = new List<string> { "table1", "table2", "table3", "table4", "table5" };



Next, you use PLINQ to execute your method for each table in parallel:

tables.AsParallel().ForAll(table => DoSomethingWithTheArray(myArray, table));

...

private static void DoSomethingWithTheArray(string[][] myArray, string table)
{
    switch (table)
    {
        case "table1":
            // Perform transformation of array to table 1
            break;
        // etc.
    }
}
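
On the questions raised above: concurrent reads of an array that nothing modifies do not need a lock, and both ForAll and Parallel.For return only after every iteration has finished, so no extra Task.WaitAll() is needed. Below is a minimal sketch of the Parallel.For variant described in the question. The sample data and the CountRowsForTable helper are made up for illustration and stand in for the real per-table LINQ query.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

public static class ParallelForSketch
{
    // Hypothetical stand-in for the per-table LINQ query: counts the rows
    // whose first field names the given table. It only reads the array.
    public static int CountRowsForTable(string[][] rows, int tableIndex)
    {
        string tableName = "table" + (tableIndex + 1);
        return rows.Count(row => row.Length > 0 && row[0] == tableName);
    }

    public static int[] CountAll(string[][] rows, int tableCount)
    {
        int[] counts = new int[tableCount];

        // Parallel.For returns only after every iteration has completed.
        // Each iteration reads the shared array and writes to its own slot
        // of counts, so no lock is required.
        Parallel.For(0, tableCount, i =>
        {
            counts[i] = CountRowsForTable(rows, i);
        });

        return counts;
    }

    public static void Main()
    {
        string[][] rows =
        {
            new[] { "table1", "a" },
            new[] { "table2", "b" },
            new[] { "table1", "c" },
        };

        int[] counts = CountAll(rows, 5);
        Console.WriteLine(string.Join(",", counts)); // prints 2,1,0,0,0
    }
}
```

Synchronisation only becomes necessary once several iterations write to the same object, for example by adding rows to one shared collection.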



If you want to structure your code a bit better, and define a separate method for each table, you change the list of tables to be a list of methods:

List<Action<string[][]>> transformations = new List<Action<string[][]>>
{
    TransformArrayForTableOne,
    // etc...
};



Then, change your PLINQ query to run all the methods in parallel:

transformations.AsParallel().ForAll(transformation => transformation(myArray));

...

private static void TransformArrayForTableOne(string[][] obj)
{
    // Do something with the array specific for this table
}

private static void TransformArrayForTableTwo(string[][] obj)
{
    // Do something with the array specific for this table
}

// etc.
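
If explicit tasks are preferred over PLINQ, the same list of methods can be started with Task.Run and collected with Task.WaitAll, which blocks the calling thread until every task has finished. The sketch below reuses the method list from above. The two method bodies and the Completed collection are placeholders, added only so that the example runs.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class WaitAllSketch
{
    // Thread-safe collection, because several tasks add to it concurrently.
    public static readonly ConcurrentBag<string> Completed = new ConcurrentBag<string>();

    // Placeholder bodies: each one just records that it ran.
    private static void TransformArrayForTableOne(string[][] obj)
    {
        Completed.Add("table1:" + obj.Length);
    }

    private static void TransformArrayForTableTwo(string[][] obj)
    {
        Completed.Add("table2:" + obj.Length);
    }

    public static void RunAll(string[][] myArray)
    {
        var transformations = new List<Action<string[][]>>
        {
            TransformArrayForTableOne,
            TransformArrayForTableTwo,
        };

        // Start one task per method.
        Task[] tasks = transformations
            .Select(transformation => Task.Run(() => transformation(myArray)))
            .ToArray();

        // Blocks until all tasks are done; exceptions thrown inside the
        // tasks are rethrown here wrapped in an AggregateException.
        Task.WaitAll(tasks);
    }

    public static void Main()
    {
        RunAll(new[] { new[] { "a", "b" } });
        Console.WriteLine(Completed.Count); // prints 2
    }
}
```

For a fixed, small number of independent methods the two approaches behave much the same, so the choice is largely a matter of readability.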



In each of the methods, you would perform the transformation to the specific DataRows. It's up to you how you want to commit these to the database.
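
One point to watch when building the rows: DataTable is documented as safe for concurrent reads only, so write operations on a shared table have to be synchronised. A simple way to avoid locking is to let each task fill its own DataTable and commit the tables afterwards on a single thread. The sketch below assumes a hypothetical two-column schema (Name, Value), uses the first field of each input row to pick the target table, and leaves out both the ID column and the actual database commit.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

public static class DataRowSketch
{
    // Hypothetical schema; a real one would mirror the SQL Server table.
    // The auto-increment ID column is left to SQL Server, as in the question.
    public static DataTable CreateSchema(string tableName)
    {
        var table = new DataTable(tableName);
        table.Columns.Add("Name", typeof(string));
        table.Columns.Add("Value", typeof(string));
        return table;
    }

    // Builds a table owned by the calling task; the shared array is only read.
    public static DataTable Transform(string[][] rows, string tableName)
    {
        DataTable table = CreateSchema(tableName);
        foreach (string[] fields in rows.Where(r => r.Length >= 3 && r[0] == tableName))
        {
            DataRow row = table.NewRow();
            row["Name"] = fields[1];
            row["Value"] = fields[2];
            table.Rows.Add(row);
        }
        return table;
    }

    public static void Main()
    {
        string[][] myArray =
        {
            new[] { "table1", "a", "1" },
            new[] { "table2", "b", "2" },
            new[] { "table1", "c", "3" },
        };
        var tables = new List<string> { "table1", "table2" };

        // One DataTable per parallel task; ToList() waits for all of them.
        List<DataTable> results = tables.AsParallel()
            .Select(name => Transform(myArray, name))
            .ToList();

        // Back on one thread: this is where the tables would be committed.
        foreach (DataTable result in results.OrderBy(x => x.TableName))
        {
            Console.WriteLine(result.TableName + ": " + result.Rows.Count);
        }
        // prints "table1: 2" then "table2: 1"
    }
}
```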

