VBA Excel large data manipulation taking forever
Question
I have two Excel files.
The first Excel file contains a Person Name column and a Total Days Present column, e.g.:

PersonName    TotalDays
xyz
abcd
The other Excel file contains Person Name, Date and Status (Present/Absent):

PersonName    Date        Status
xyz           1/1/2011    Present
xyz           1/1/2011    Present
I need to group rows with the same date and status into one, count them, and write the counts back into the first Excel file.
I have around 100 rows in the first file, whereas the second file, which I need to check against, has 20,000 rows. To make this faster, I loaded all the rows from the second file into an array and read from that array when computing each entry, which works correctly.
The issue is that it takes so much memory that in Windows many applications pop up automatically and the system almost hangs.
Is there any alternative way to implement this without the memory issue, and with fast processing? I came across Scripting.Dictionary but am not sure whether it will use less memory.
EDIT: I tried both ReDim Preserve and a static array of size 20,000; the same problem happens in both cases.
EDIT: Here is the code:
lblStatus.Caption = "Loading to memory"
Dim ArrAuditData() As AData
Dim TotalLookUpCount As Integer
For J = 1 To 50000
    If lookUpRange.Cells(J, cmbChoice.ListIndex) = "Fail" Then
        ReDim Preserve ArrAuditData(J) As AData
        ArrAuditData(TotalLookUpCount).AuditType = lookUpRange.Cells(J, cmdAudit2.ListIndex)
        ArrAuditData(TotalLookUpCount).TransTime = lookUpRange.Cells(J, cmbChoice.ListIndex - 1)
        ArrAuditData(TotalLookUpCount).AuditValue = lookUpRange.Cells(J, cmbChoice.ListIndex)
        ArrAuditData(TotalLookUpCount).Slno = lookUpRange.Cells(J, 0)
        TotalLookUpCount = TotalLookUpCount + 1
    ElseIf lookUpRange.Cells(J, cmbChoice.ListIndex) = "" And J > 4 Then
        Exit For
    End If
    DoEvents
Next
Answer
An array of 20,000 elements containing 4 Variants each will take up less than 2 MB of RAM. I don't think memory has anything to do with your problem -- unless you happen to be using an old computer with 2 MB of RAM or something like that.
A more likely reason why your code is so heavy is that you are looping through cells. There is significant overhead to each communication between VBA and Excel sheet data, and this adds up when you refer to many cells one at a time. In your case, your loop makes up to 200,000 separate cell references.
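The difference is easy to see in a minimal sketch (sheet and range names here are hypothetical): a single assignment pulls the whole range into a two-dimensional, 1-based Variant array with one round trip to Excel, after which every access is plain in-memory array indexing:

```vba
Dim v As Variant
' One round trip to Excel for the entire block of cells...
v = Worksheets("Sheet2").Range("A1:C20000").Value
' ...then all further access stays inside VBA.
Debug.Print v(1, 1), v(1, 3)   ' row 1: first and third columns (indices are 1-based)
```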
Instead, you should load all your data at once into a Variant array, and then loop through that array, as shown below. This is significantly faster (even though it uses more memory, not less; but again, I don't think memory is your issue).
lblStatus.Caption = "Loading to memory"
Dim ArrAuditData() As AData
Dim varTemp As Variant
Dim TotalLookUpCount As Integer
Dim J As Long

' Load everything into a Variant array in one shot.
' Note: an array read from a Range this way is always 1-based.
varTemp = lookUpRange.Value

' Size the array once, to the largest conceivable number of elements.
' Zero-based, because TotalLookUpCount starts counting from 0.
ReDim ArrAuditData(0 To UBound(varTemp, 1) - 1) As AData

For J = 1 To UBound(varTemp, 1)
    If varTemp(J, cmbChoice.ListIndex) = "Fail" Then
        ArrAuditData(TotalLookUpCount).AuditType = varTemp(J, cmdAudit2.ListIndex)
        ArrAuditData(TotalLookUpCount).TransTime = varTemp(J, cmbChoice.ListIndex - 1)
        ArrAuditData(TotalLookUpCount).AuditValue = varTemp(J, cmbChoice.ListIndex)
        ArrAuditData(TotalLookUpCount).Slno = varTemp(J, 1) ' first column; a Variant array has no column 0
        TotalLookUpCount = TotalLookUpCount + 1
    ElseIf varTemp(J, cmbChoice.ListIndex) = "" And J > 4 Then
        Exit For
    End If
    DoEvents
Next

' Trim off the unused elements, once, after the loop.
ReDim Preserve ArrAuditData(0 To TotalLookUpCount) As AData
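As for the Scripting.Dictionary the question mentions: it is not primarily a memory saver, but it is a natural fit for the group-and-count step itself, since each PersonName/Date pair can be tallied in a single pass over the array. A minimal sketch under assumed column positions (name in column 1, date in column 2, status in column 3; sheet name and layout are hypothetical):

```vba
Dim dict As Object, data As Variant, r As Long, key As String
Set dict = CreateObject("Scripting.Dictionary")

data = Worksheets("Attendance").Range("A1:C20000").Value   ' one read

For r = 1 To UBound(data, 1)
    If data(r, 3) = "Present" Then
        key = data(r, 1) & "|" & data(r, 2)   ' PersonName + Date as one key
        dict(key) = dict(key) + 1             ' duplicate rows collapse into a count
    End If
Next r
' dict now maps each person/date pair to its "Present" count,
' ready to be written back to the first workbook.
```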
For further reading, have a look at this old but still relevant article: http://www.avdf.com/apr98/art_ot003.html
If you still think RAM is the issue, then please show us the AData type declaration.
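For reference, the AData type is not shown in the question; a user-defined type consistent with the fields used above might look like this (the field types are guesses):

```vba
Public Type AData
    Slno As Variant        ' serial number column
    AuditType As String
    TransTime As Variant   ' likely a Date
    AuditValue As String
End Type
```

At four fields per element, 20,000 such records would still amount to only a few megabytes.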
EDIT: Also, never ReDim Preserve inside a loop like that! ReDim Preserve is a very expensive operation and rarely needs to be done more than once on any given array. Doing it 20,000 times will slow down your code. Here I take it out of the loop, and just use it once at the end to trim off the unused elements. (Notice how I initially ReDim'ed the array to fit the largest conceivable number of elements.)
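Stripped of the surrounding details, the pattern this edit describes is: size the array once to the worst case, fill it, then trim once after the loop (a standalone sketch with made-up data and filter):

```vba
Dim arr() As Long, n As Long, i As Long
ReDim arr(1 To 20000)          ' one allocation, sized for the worst case
For i = 1 To 20000
    If i Mod 7 = 0 Then        ' stand-in for the real filter condition
        n = n + 1
        arr(n) = i             ' no ReDim Preserve inside the loop
    End If
Next i
ReDim Preserve arr(1 To n)     ' one trim, after the loop
```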