VBA Excel大数据处理永无止境 [英] VBA Excel large data manipulation taking forever
问题描述
我有两个Excel文件.
I have two excel files.
第一个excel文件包含人员姓名"和存在的总天数"列 例
First excel file contains the Person Name and Total Days Present column Ex.
PersonName TotalDays
xyz
abcd
另一个Excel文件包含人员姓名",日期"和状态"(当前/不存在).
Another excel file contains Person Name, Date and Status (present/absent).
PersonName Date Status
xyz 1/1/2011 Present
xyz 1/1/2011 Present
我需要将相似的日期状态归为一个,并计数它们以在第一个excel文件中进行更新.
I need to group the similar dates status as one, and count them to update in first excel file.
我在第一个文件中大约有100行,在我需要检查的第二个文件中有20,000行.因此,为了使其更快,我将第二个文件中的所有行加载到Array中,并读取它们以计算每个条目的正确运行情况.
I have around 100 rows in first file where as 20,000 rows in second file from where I need to check. So to make it faster, I loaded all the rows from second file in an Array and reading them to calculate with each entry which works correctly.
问题是,它占用大量内存,因此在Windows中,如此多的应用程序会自动打开,并且系统几乎会挂起.
The issue is, it take large memory so in Windows so many application automatically opens up and system almost hangs.
是否有其他替代方案可以实现此目的而不会出现内存问题和快速处理的问题.我遇到了Scripting.Dictionary,但不确定是否会占用更少的内存.
Is there any alternate to implement this without memory issue and fast processing. I came across Scripting.Dictionary but not sure whether it will take less memory.
编辑 我尝试使用redim保留和20,000大小的静态数组,在两种情况下都会发生相同的问题.
EDIT I tried using redim preserve and static array with 20,000 size, in both case same problem happens.
编辑
lblStatus.Caption = "Loading to memory"
Dim ArrAuditData() As AData
Dim TotalLookUpCount As Integer
For J = 1 To 50000
If lookUpRange.Cells(J, cmbChoice.ListIndex) = "Fail" Then
ReDim Preserve ArrAuditData(J) As AData
ArrAuditData(TotalLookUpCount).AuditType = lookUpRange.Cells(J, cmdAudit2.ListIndex)
ArrAuditData(TotalLookUpCount).TransTime = lookUpRange.Cells(J, cmbChoice.ListIndex - 1)
ArrAuditData(TotalLookUpCount).AuditValue = lookUpRange.Cells(J, cmbChoice.ListIndex)
ArrAuditData(TotalLookUpCount).Slno = lookUpRange.Cells(J, 0)
TotalLookUpCount = TotalLookUpCount + 1
ElseIf lookUpRange.Cells(J, cmbChoice.ListIndex) = "" And J > 4 Then Exit For
End If
DoEvents
Next
推荐答案
由20,000个元素组成的数组,每个元素包含4个Variant,将占用不到2 MB的RAM.我认为内存与您的问题无关-除非您碰巧正在使用具有2 MB RAM或类似内存的旧计算机.
An array of 20,000 elements containing 4 Variants each will take up less than 2 MB of RAM. I don't think memory has anything to do with your problem -- unless you happen to be using an old computer with 2 MB of RAM or something like that.
代码如此繁重的一个更可能的原因是,您在单元格中循环. VBA和Excel工作表数据之间的每次通信都存在大量开销,并且当您一次引用多个单元格时,这会加起来.在您的情况下,您的循环最多可以进行200,000个单独的单元格引用.
A more likely reason why your code is so heavy is that you are looping through cells. There is significant overhead to each communication between VBA and Excel sheet data, and this adds up when you refer to many cells one at a time. In your case, your loop does up to 200,000 separate cell references.
相反,您应该一次将所有数据加载到Variant
数组中,然后遍历该数组,如下所示.这显着提高了速度(即使它使用的是更多内存,而不是更少;不过,我认为内存不是您的问题).
Instead, you should load all your data at once into a Variant
array, and then loop through that array, as shown below. This is significantly faster (even though this uses more memory, not less; but again, I don't think memory is your issue).
lblStatus.Caption = "Loading to memory"
Dim ArrAuditData() As AData
Dim varTemp As Variant
Dim TotalLookUpCount As Integer
' Load everything into a Variant array.
varTemp = lookUpRange
ReDim ArrAuditData(1 To UBound(varTemp, 1)) As AData
For J = 1 To UBound(varTemp, 1)
If varTemp(J, cmbChoice.ListIndex) = "Fail" Then
ArrAuditData(TotalLookUpCount).AuditType = varTemp(J, cmdAudit2.ListIndex)
ArrAuditData(TotalLookUpCount).TransTime = varTemp(J, cmbChoice.ListIndex - 1)
ArrAuditData(TotalLookUpCount).AuditValue = varTemp(J, cmbChoice.ListIndex)
ArrAuditData(TotalLookUpCount).Slno = varTemp(J, 0)
TotalLookUpCount = TotalLookUpCount + 1
ElseIf varTemp(J, cmbChoice.ListIndex) = "" And J > 4 Then
Exit For
End If
DoEvents
Next
ReDim Preserve ArrAuditData(TotalLookUpCount) As AData
为进一步阅读,请看这篇古老但仍有意义的文章: http://www .avdf.com/apr98/art_ot003.html
For further reading, have a look at this old but still relevant article: http://www.avdf.com/apr98/art_ot003.html
如果您仍然认为RAM是问题所在,请向我们展示AData
类型声明.
If you still think RAM is the issue, then please show us the AData
type declaration.
编辑:而且,永远不要在这样的循环内ReDim Preserve
! ReDim Preserve
是一个非常昂贵的操作,很少需要在任何给定阵列上执行多次.这样做20,000次会减慢您的代码速度.在这里,我将其从循环中取出,并在末尾仅使用一次以修剪未使用的元素. (请注意,我最初是如何ReDim
修改数组以适合最大可能数量的元素的.)
EDIT: Also, never ReDim Preserve
inside a loop like that! ReDim Preserve
is a very expensive operation and rarely needs to be done more than once on any given array. Doing it 20,000 times will slow down your code. Here I take it out of the loop, and just use it once at the end to trim off the unused elements. (Notice how I initially ReDim
'ed the array to fit the largest conceivable number of elements.)
这篇关于VBA Excel大数据处理永无止境的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!