How does SQLDataReader handle really large queries?

Question

Actually I'm not sure the title accurately describes the question, but I hope it is close enough.

I have some code that performs a SELECT from a database table that I know will result in about 1.5 million rows being selected. The data in each row isn't large - maybe 20 bytes per row. But that's still 30MB of data. Each row contains a customer number, and I need to do something with each customer.

My code looks like this:

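using System.Data.SqlClient;

// Both the connection and the command are disposed when the block exits.
using (SqlConnection conn = new SqlConnection(connString))
using (SqlCommand command = new SqlCommand("SELECT ... my select goes here", conn))
{
    conn.Open();
    using (SqlDataReader reader = command.ExecuteReader())
    {
        // Read() advances to the next row and returns false after the last one.
        while (reader.Read())
        {
            // ... process the customer number here
        }
    }
}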

So I just iterate over all the customers returned by the SELECT.

My question is, does that result in multiple reads of the database, or just one? I assume the network buffers aren't big enough to hold 30MB of data, so what does .NET do here? Is the result of the SELECT squirreled away somewhere for the SQLDataReader to nibble off a row every time Read() advances the pointer? Or does it go back to the database?

The reason I'm asking is that the "... process the customer number here" part of the code can take some time, so for 1.5 million customers that code (the while loop above) will take many hours to complete. While that's happening, do I need to worry about other people blocking behind me on the database, or am I safe in the knowledge that I've done my one SELECT from the database and I'm not going back again?

Recommended answer

The select will be executed as a "single, monolithic transaction". The balance of the output is held by SQL Server and passed out to the network as the protocol determines there is buffer available to receive it. SQL Server will not go back into the data tables each time you call Read(), though. The state of the data at the point the original SELECT passed over it is what will be returned to your application. If you have (NOLOCK) specified you will have no further impact on the data. Other people can read and write it; you will not see their changes. You have not finished with SQL Server, however, until the last row is in your app server's buffers, hours later. There will be network traffic at each "I have room for more now, please", but not noticeably more than if the whole 30MB had come across all at once.
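
Since SQL Server is not finished with the query until the last row has reached the application, one option (not part of the original answer) is to drain the reader into memory first and do the slow per-customer work afterwards. At roughly 30MB the whole result should fit comfortably in memory. A minimal sketch, assuming the customer number is an int in column 0; the ReaderDrain class and ReadAllCustomerNumbers method are hypothetical names:

```csharp
using System.Collections.Generic;
using System.Data;

public static class ReaderDrain
{
    // Reads column 0 of every remaining row into a list, so the reader
    // (and the connection behind it) can be closed before slow work starts.
    // Assumption: column 0 holds the customer number as a 32-bit integer.
    public static List<int> ReadAllCustomerNumbers(IDataReader reader)
    {
        var customerNumbers = new List<int>();
        while (reader.Read())
        {
            customerNumbers.Add(reader.GetInt32(0));
        }
        return customerNumbers;
    }
}
```

In the question's code this would be called inside the using block that owns the SqlDataReader, with the processing loop moved to after the connection has been disposed. The method takes IDataReader, which SqlDataReader implements, so it can also be exercised against an in-memory DataTable.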

With large result sets and long-running processes you are better off writing your application to process data in batches, even if the infrastructure can support the full query output. It takes fewer resources to answer each batched query. In the case of failure you need only process the remaining rows; you do not have to start again from the beginning. Your application will end up doing fractionally more work overall, but each chunk will be less disruptive to the environment.
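
The batching advice above could be sketched roughly as follows. This is one possible shape, not the answerer's code: it assumes customer numbers are unique positive ints that can be ordered, and it keeps the data access behind a delegate so the loop stays independent of the database. BatchProcessor, ProcessInBatches, fetchBatch and processCustomer are all hypothetical names, as are the table and column in the commented query.

```csharp
using System;
using System.Collections.Generic;

public static class BatchProcessor
{
    // Repeatedly asks fetchBatch for up to batchSize customer numbers greater
    // than the last one processed, and stops when a batch comes back empty.
    //
    // fetchBatch(after, batchSize) is expected to return rows in ascending
    // order, for example by running a query along these lines
    // (hypothetical table and column names):
    //   SELECT TOP (@batchSize) CustomerNumber FROM dbo.Customers
    //   WHERE CustomerNumber > @after ORDER BY CustomerNumber
    public static int ProcessInBatches(
        Func<int, int, IReadOnlyList<int>> fetchBatch,
        Action<int> processCustomer,
        int batchSize)
    {
        int lastCustomerNumber = 0; // assumption: real customer numbers are > 0
        int processed = 0;

        while (true)
        {
            IReadOnlyList<int> batch = fetchBatch(lastCustomerNumber, batchSize);
            if (batch.Count == 0)
            {
                break; // nothing left beyond lastCustomerNumber
            }

            foreach (int customerNumber in batch)
            {
                processCustomer(customerNumber);
                // Remember progress so the next query resumes after this row.
                lastCustomerNumber = customerNumber;
                processed++;
            }
        }

        return processed;
    }
}
```

Each call to fetchBatch opens and closes its own short-lived query, so no reader stays open while a customer is being processed. Persisting lastCustomerNumber somewhere durable would let a failed run resume from where it stopped, which is the restart benefit described above.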
