Task Parallel Library mixed with async/await

Question

In a web application, we provide paginated search panels for various database tables in our application. We currently allow users to select individual rows, and via a UI, execute some operation in each selected instance.

For example, a panel of document records offers an ability to delete documents. A user may check 15 checkboxes representing 15 document identifiers, and choose Options > Delete. This works just fine.

I wish to offer the users an option to execute some operation for all rows matching the query used to display the data in the panel.

We may have 5,000 documents matching some search criteria, and wish to allow a user to delete all 5,000. (I understand this example is a bit contrived; let's ignore the 'wisdom' of allowing users to delete documents in bulk!)

Execution of a method for thousands of rows is a long-running operation, so I will queue the operation instead. Consider this an equivalent of Gmail's ability to apply a filter to all email conversations matching some search criteria.

I need to execute a query that will return an unknown number of rows, and for each row, insert a row into a queue (in the code below, the queue is represented by ImportFileQueue).

I coded it as follows:

using (var reader = await source.InvokeDataReaderAsync(operation, parameters))
{
    Parallel.ForEach<IDictionary<string, object>>(reader.Enumerate(), async properties =>
    {
        try
        {
            var instance = new ImportFileQueueObject(User)
            {
                // application tier calculation here; cannot do in SQL
            };
            await instance.SaveAsync();
        }
        catch (System.Exception ex)
        {
            // omitted for brevity
        }
    });
}

When running this in a unit test that wraps the call in a transaction, I receive the error System.Data.SqlClient.SqlException: Transaction context in use by another session.

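This is easily resolved by either:

  • changing the database call from asynchronous to synchronous, or
  • removing the Parallel.ForEach and iterating through the reader serially.

I opted for the former: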

using (var reader = await source.InvokeDataReaderAsync(operation, parameters))
{
    Parallel.ForEach<IDictionary<string, object>>(reader.Enumerate(), properties =>
    {
        try
        {
            var instance = new ImportFileQueueObject(User)
            {
                // Omitted for brevity
            };
            instance.Save();
        }
        catch (System.Exception ex)
        {
            // omitted for brevity
        }
    });
}

My thought process is, in typical use cases:

  • the outer reader will often have thousands of rows
  • the instance.Save() call is "lightweight", inserting a single row into the database

Two questions:

  1. Is there a reasonable way to use async/await inside the Parallel.ForEach, where the inner code is using SqlConnection (avoiding the TransactionContext error)?
  2. If not, given my expected typical use case, is my choice to leverage TPL and forfeit async/await for the single-row saves reasonable?

The suggested answer in What is the reason of "Transaction context in use by another session" says:

Avoid multi-threaded data operations if it's possible (no matter loading or saving). E.g. save SELECT/UPDATE/ etc... requests in a single queue and serve them with a single-thread worker;

However, I'm trying to minimize total execution time, and figured Parallel.ForEach was more likely to reduce it.

Recommended answer

It's almost always a bad idea to open a transaction and then wait for I/O while holding it open. You'll get much better performance (and fewer deadlocks) by buffering the data first. If there's more total data than you can easily buffer in memory, buffer it into chunks of a thousand or so rows at a time. Put each of those in a separate transaction if possible.
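
A minimal sketch of the buffering step described above, assuming a hypothetical helper named InChunks (it is not part of the original code). It reads a lazy sequence, such as the one returned by reader.Enumerate(), into in-memory lists of a fixed size, so that no transaction needs to be open while rows are still being read.

```csharp
using System;
using System.Collections.Generic;

public static class Batching
{
    // Hypothetical helper: splits a lazily read sequence into lists of at
    // most `size` items. Each list is fully buffered before it is yielded.
    public static IEnumerable<List<T>> InChunks<T>(IEnumerable<T> source, int size)
    {
        if (size <= 0) throw new ArgumentOutOfRangeException(nameof(size));

        var chunk = new List<T>(size);
        foreach (var item in source)
        {
            chunk.Add(item);
            if (chunk.Count == size)
            {
                yield return chunk;
                chunk = new List<T>(size);
            }
        }

        // Yield the final, possibly smaller, chunk.
        if (chunk.Count > 0) yield return chunk;
    }
}
```

Each buffered chunk could then be saved inside its own transaction, for example one System.Transactions.TransactionScope per chunk, with a chunk size of around 1000 as the answer suggests. On .NET 6 and later, the built-in Enumerable.Chunk offers similar behavior.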

Whenever you open a transaction, any locks taken remain held until it is committed (and locks get taken whether you want them or not when you're inserting data). Those locks cause other updates, or reads without WITH(NOLOCK), to sit and wait until the transaction is committed. In a high-performance system, if you're doing I/O while those locks are held, it's pretty much guaranteed to cause problems as other callers start an operation and then sit and wait while this operation does I/O outside the transaction.
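
On question 1, which the answer above does not address directly: the body parameter of Parallel.ForEach is an Action<T>, so an async lambda passed to it compiles to an async void delegate, and the loop can return while saves are still pending. One common alternative, sketched here under assumptions, is to throttle the asynchronous work with a SemaphoreSlim and await it with Task.WhenAll. The names AsyncThrottle and ForEachAsync are hypothetical, and the sketch assumes each save opens its own connection and is not enlisted in a single shared transaction, since concurrent use of one transaction is what produced the original error.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class AsyncThrottle
{
    // Hypothetical helper: runs `body` for every item with at most
    // `maxConcurrency` invocations in flight, and completes only after
    // all of them have finished.
    public static async Task ForEachAsync<T>(
        IEnumerable<T> items, int maxConcurrency, Func<T, Task> body)
    {
        var gate = new SemaphoreSlim(maxConcurrency);
        var running = new List<Task>();

        foreach (var item in items)
        {
            // Wait for a free slot before starting the next item.
            await gate.WaitAsync();
            running.Add(RunOneAsync(item, body, gate));
        }

        // Exceptions thrown by any body surface here.
        await Task.WhenAll(running);
    }

    private static async Task RunOneAsync<T>(
        T item, Func<T, Task> body, SemaphoreSlim gate)
    {
        try { await body(item); }
        finally { gate.Release(); } // Free the slot even if the body fails.
    }
}
```

A call might look like `await AsyncThrottle.ForEachAsync(rows, 8, row => SaveRowAsync(row));`, where rows, the limit of 8, and SaveRowAsync are placeholders. On .NET 6 and later, Parallel.ForEachAsync covers the same need.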
