SqlDataReader 不返回所有记录(第三次尝试) [英] SqlDataReader does not return all records (3rd attempt)

查看:24
本文介绍了SqlDataReader 不返回所有记录(第三次尝试)的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我之前曾两次尝试为这个问题找到解决方案,但不幸的是,这些答案并没有提供永久修复,所以我在这里,再试一次.

I've tried to find solution for this problem twice before, but unfortunately those answers haven't provided permanent fix, so here I am, giving it another try.

我有一个 SQL Server 存储过程,它返回 150 万个整数 ID 的列表.我从 ASP.NET/VB.NET 代码调用这个 SP 并执行一个 SqlDataReader:

I have an SQL Server stored procedure that returns list of 1.5 million integer IDs. I am calling this SP from ASP.NET/VB.NET code and executing a SqlDataReader:

m_dbSel.CommandType = CommandType.StoredProcedure
m_dbSel.CommandText = CstSearch.SQL.SP_RS_SEARCH_EX
oResult = m_dbSel.ExecuteReader(CommandBehavior.CloseConnection)

然后我将该读取器传递给类构造函数以构建通用列表(整数).代码非常基础:

Then I am passing that reader to a class constructor to build Generic List(Of Integer). The code is very basic:

Public Sub New(i_oDataReader As Data.SqlClient.SqlDataReader)

    m_aFullIDList = New Generic.List(Of Integer)

    While i_oDataReader.Read
        m_aFullIDList.Add(i_oDataReader.GetInt32(0))
    End While

    m_iTotalNumberOfRecords = m_aFullIDList.Count

End Sub

问题是 - 这不会读取所有 150 万条记录,数量不一致,最终计数可能是 500K 或 100 万等(最常见的是 524289 条记录的神奇"数量被退回).我在执行命令时尝试使用 CommandBehavior.SequentialAccess 设置,但结果也不一致.

The problem is - this doesn't read all 1.5 million of records, the number is inconsistent, final count could be 500K or 1 million etc. (Most often "magic" number of 524289 records is returned). I've tried using CommandBehavior.SequentialAccess setting when executing command, but the results turned out to be inconsistent as well.

当我在 SSMS 中运行 SP 时,它几乎立即返回一定数量的记录并显示它们,但然后继续运行几秒钟,直到所有 150 万条记录都完成 - 是否与此有关?

When I am running SP in SSMS, it returns certain number of records almost right away and displays them, but then continues to run for a few seconds more until all 1.5 million records are done - does it have anything to do with this?

更新


一段时间后,我发现在非常罕见的情况下,上面的循环代码确实会抛出异常:

UPDATE


After a while I found that on very-very rare occasions the loop code above does throw an exception:

System.NullReferenceException: 未将对象引用设置为实例的一个对象.在System.Data.SqlClient.SqlDataReader.ReadColumnHeader(Int32 i)

System.NullReferenceException: Object reference not set to an instance of an object. at System.Data.SqlClient.SqlDataReader.ReadColumnHeader(Int32 i)

所以确实发生了一些内部故障.看起来如果我更换

So some internal glitch does happen. Also it looks like if I replace

While i_oDataReader.Read
  m_aFullIDList.Add(i_oDataReader.GetInt32(0))
End While

处理整数

While i_oDataReader.Read
   m_aFullIDList.Add(i_oDataReader(0))
End While

处理对象 - 代码似乎运行时没有出现故障并返回所有记录.

that deals in Objects - the code seems to run without a glitch and returns all records.

去图.

推荐答案

基本上,正如我们在评论中指出的那样(*),问题不在于存储过程 SqlDataRead,或 SQL 等.相反,您的 List.Add 失败了,因为它无法为 2^(n+1) 个项目分配额外的内存来扩展 List 并将您现有的 2^n 个项目复制到其中.大多数情况下,您的 n=19(即 524289 项),但有时可能更高.

Basically, as we've flogged out in the comments(*), the problem isn't with SqlDataRead, the stored procedure, or SQL at al. Rather, your List.Add is failing because it cannot allocate the additional memory for 2^(n+1) items to extend the List and copy your existing 2^n items into. Most of the time your n=19 (so 524289 items), but sometimes it could be higher.

您可以为此做三件事:

  1. 预分配:正如您所发现的,通过预分配,您应该能够在任何地方增加 1.5 到 3 倍的项目.如果您提前知道您将拥有多少项目,这将最有效,因此我建议您提前执行 SELECT COUNT(*).. 或添加 COUNT(*) OVER(PARTITION BY 1) 列并从返回的第一行中挑选出来以预先分配列表.这种方法的问题在于您仍然非常接近您的极限,并且在不久的将来很容易耗尽内存......

  1. Pre-Allocate: As you've discovered, by pre-allocating you should be able to gwet anywhere from 1.5 to 3 times as many items. This works best if you know ahead of time how many items you'll have, so I'd recommend either excuting a SELECT COUNT(*).. ahead of time, or adding a COUNT(*) OVER(PARTITION BY 1) column and picking it out of the first row returned to pre-allocate the List. The problem with this approach is that you're still pretty close to your limit and could easily run out of memory in the near future...

重新配置:现在您最多只能为此获得 2^22 字节的内存,而理论上您应该能够获得 2^29-2^30 左右的内存.这意味着您机器上的某些东西阻止您将可写虚拟内存限制扩展到那么高.可能的原因包括页面文件的大小和来自其他进程的竞争(但还有其他可能性).解决这个问题,您应该有足够的空间.

Re-Configure: Right now you are only getting at most 2^22 bytes of memory for this, when in theory you shoud be able to get around 2^29-2^30. That means that something on your machine is preventing you from extending your writeable Virtual Memory limit that high. Likely causes include the size of your pagefile and competition from other processes (but there are other possibilities). Fix that and you should have more than enough headroom for this.

流媒体:您真的需要同时在内存中存储 150 万个项目吗?如果不是,并且您可以即时确定哪些不需要(或提取您确实需要的信息),那么您可以使用与 SqlDataReader 相同的方式通过流式处理来解决此问题.只需阅读一行,使用它,然后丢失它并继续下一行.

Streaming: Do you really need all 1.5 million items in memory at the same time? If not and you can determine which you don't need (or extract the info that you do need) on the fly, then you can solve this problem the same way that SqlDataReader does, with streaming. Just read a row, use it, then lose it and go on to the next row.

希望这会有所帮助.

(* -- 显然,感谢@granadaCoder 和@MartinSmith)

(* -- Thanks, obviously, to @granadaCoder and @MartinSmith)

如果您真的认为问题仅在于 List 数据结构(并且不是您只是内存不足),那么还有一些其他方法可以解决 List 结构的分配问题行为.一种方法是实现一个替代 List 类(如 IList(of Integer)).

If you really think that the problem rests solely with the List data structure (and not that you are just running out of memory), then there are some other ways to work around the List structure's allocation behavior. One way would be to implement an alternative List class (as IList(of Integer)).

通过接口,它看起来与 List 相同,但在内部它会有不同的分配方案,通过将数据存储在嵌套的 List(of List(of Integer)) 中.每 1000 个项目,它会创建一个新的 List(of Integer),将其添加到父嵌套列表中,然后使用它添加接下来的 1000 个项目.

Through the interface it would appear the same as List but internally it would have a different allocation scheme, by storing the data in a nested List(of List(of Integer)). Every 1000 items, it would create a new List(of Integer), add it to the parent nested list and then use it to add in the next 1000 items.

我之前不建议这样做的原因是因为,像预分配一样,这可能会让你更接近你的内存限制,但是,如果这是问题,你最终还是会用完(只是与预分配一样),因为此限制太接近您需要的实际项目数(150 万).

The reason that I didn't suggest this before is because, like pre-allocation, this may allow you to get closer to your memory limit, but, if that's the problem, you are still going to run out eventually (just as with pre-allocating) because this limit is too close to the actual number of items that you need (1.5 million).

这篇关于SqlDataReader 不返回所有记录(第三次尝试)的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
相关文章
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆