使用VBA过滤功能时,性能注意事项 [英] Performance Considerations when using VBA Filter Function

查看:252
本文介绍了使用VBA过滤功能时,性能注意事项的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我无法弄清楚如何过滤器功能如此之快。我已经使用过滤器的各种数据,不管数据类型,过滤器涂掉任何替代方法,我使用。我经常使用的二进制搜索算法和书面由斯蒂芬·布伦(在QuickArraySort算法专业Excel的发展)。二进制搜索是快如闪电(以最快的速度过滤功能,因为该数组排序)和快速排序算法是已知最快的排序算法之一。

我已经写了一些测试code以下比较发现速度在一个非常大的数组(大小= 200万)的随机元素。我故意填充一个未有序的方式排列(应当指出的是,我曾尝试过各种非有序分配方法,其结果是相似的,无论分配方法)。

 子SearchTest()

昏暗我长,strMyArray()作为字符串,lngSize长,strTest作为字符串
昏暗TimeBinarySearch长,TimeFilterSearch只要
昏暗lngResultBinary长,lngResultFilter只要

昏暗StartHour长,StartMinute长,StartSecond只要
昏暗StartMiliSecond长,开始时间长

昏暗EndHour长,EndMinute长,EndSecond只要
昏暗EndMiliSecond长,结束时间长

    lngSize = 2000000

    strTest = CStr的(1735674 * 987)

    REDIM strMyArray(lngSize)

    对于i = 1到UBound函数(strMyArray)
        如果我mod2 = 0然后
            strMyArray(ⅰ)= CStr的((ⅰ -  1)* 987)
        其他
            strMyArray(ⅰ)= CStr的(第(i + 1)* 987)
        结束如果
    接下来我

''过滤器测试
******************* ******************
    StartHour =小时(NOW())* 60 * 60 * 1000
    StartMinute =分(NOW())* 60 * 1000
    StartSecond =二(NOW())* 1000
    StartMiliSecond =格式(NOW(),MS)

    开始时间= StartHour + StartMinute + StartSecond + StartMiliSecond

    lngResultFilter = CLng函数(筛选(strMyArray,strTest)(0))

    EndHour =小时(NOW())* 60 * 60 * 1000
    EndMinute =分(NOW())* 60 * 1000
    EndSecond =二(NOW())* 1000
    EndMiliSecond =格式(NOW(),MS)

    结束时间= EndHour + EndMinute + EndSecond + EndMiliSecond

    TimeFilterSearch =结束时间 - 开始时间
******************* ******************

''二进制测试
******************* ******************
    StartHour =小时(NOW())* 60 * 60 * 1000
    StartMinute =分(NOW())* 60 * 1000
    StartSecond =二(NOW())* 1000
    StartMiliSecond =格式(NOW(),MS)

    开始时间= StartHour + StartMinute + StartSecond + StartMiliSecond

    QuickSortString1D strMyArray

    lngResultBinary = strMyArray(CLng函数(BinarySearchString(strTest,strMyArray)))

    EndHour =小时(NOW())* 60 * 60 * 1000
    EndMinute =分(NOW())* 60 * 1000
    EndSecond =二(NOW())* 1000
    EndMiliSecond =格式(NOW(),MS)

    结束时间= EndHour + EndMinute + EndSecond + EndMiliSecond

    TimeBinarySearch =结束时间 - 开始时间
******************* ******************

    MSGBOX lngResultFilter和放大器; VBCR和放大器; VBCR和放大器; lngResultBinary

    MSGBOX TimeFilterSearch和放大器; VBCR和放大器; VBCR和放大器; TimeBinarySearch

结束小组
 

这两种方法返回同样的结果,但是筛选方法的返回时间是0毫秒的快速排序/二分查找方法的返回时间是近20秒。这是一个巨大的差别!正如前面提到的,如果数组排序的二进制搜索方法返回一个0毫秒以及(由于大部分都知道,二进制搜索要求数组排序开始)

那么,怎样才能筛选功能去翻2000000未排序的条目并立即找到正确的结果呢?通过每一个条目它不能简单地循环,并与filtervalue比较它(这是迄今为止最慢的方法),但基于关的这些preliminary测试,它不能被利用二进制搜索任一,因为它会有到所述阵列的第一排序。即使是已经编写了一个真棒排序算法,我觉得很难相信,它可以排序的大小大于一百万瞬间的数组。

顺便说一句,下面是快速排序算法和二进制搜索算法。

 子QuickSortString1D(为ByRef saArray()作为字符串_
                可选BYVAL bSortAscending由于布尔=真,_
                可选BYVAL lLow1为Variant,_
                可选BYVAL lHigh1为Variant)

    维变量
    昏暗lLow2只要
    昏暗lHigh2只要
    昏暗SKEY作为字符串
    昏暗sSwap作为字符串

        对错误转到ErrorExit
        如果没有提供,排序整个数组
        如果ISMISSING(lLow1)然后lLow1 = LBOUND(saArray)
        如果ISMISSING(lHigh1)然后lHigh1 = UBound函数(saArray)

        设置新的极端旧极端
        lLow2 = lLow1
        lHigh2 = lHigh1

        获取新的极端中间数组项的值
        SKEY = saArray((lLow1 + lHigh1)\ 2)

        '循环的极端之间的阵列中的所有项
        做,而lLow2< lHigh2

            如果bSortAscending然后
                查找大于中点项中的第一项
                做,而saArray(lLow2)< SKEY而lLow2< lHigh1
                    lLow2 = lLow2 + 1
                循环

                查找小于中点项的最后一个项目
                做,而saArray(lHigh2)> SKEY而lHigh2> lLow1
                    lHigh2 = lHigh2  -  1
                循环
            其他
                查找小于中点项中的第一项
                做,而saArray(lLow2)> SKEY而lLow2< lHigh1
                    lLow2 = lLow2 + 1
                循环

                查找大于中点项目的最后一项
                做,而saArray(lHigh2)< SKEY而lHigh2> lLow1
                    lHigh2 = lHigh2  -  1
                循环

            结束如果

            如果这两个项目都是以错误的顺序,调剂行
            如果lLow2< lHigh2然后
                sSwap = saArray(lLow2)
                saArray(lLow2)= saArray(lHigh2)
                saArray(lHigh2)= sSwap
            结束如果

            如果指针不在一起,前进到下一个项目
            如果lLow2< = lHigh2然后
                lLow2 = lLow2 + 1
                lHigh2 = lHigh2  -  1
            结束如果
        循环

        '递归到极端的下半部排序
        如果lHigh2> lLow1然后
            QuickSortString1D saArray,bSortAscending,lLow1,lHigh2
        结束如果

        '递归到极端的上半部分进行排序
        如果lLow2< lHigh1然后
            QuickSortString1D saArray,bSortAscending,lLow2,lHigh1
        结束如果

    ErrorExit:

    结束小组

    ******************* **********
    评语:使用二进制搜索算法来快速定位
    字符串排序数组中的字符串
    
    参数:sLookFor的字符串数组中搜索
    saArray字符串数组,升序排列
    lMethod无论vbBinaryCompare或vbTextCompare
    默认为vbTextCompare
    lNotFound要返回的值,如果文本没有
    发现。默认为-1
    
    返回:长的位置位于阵列中,
    或者lNotFound如果未找到
    
    日期开发行动
    --------------------------------
    04年6月2日斯蒂芬·布伦创建
    
    功能BinarySearchString(为ByRef sLookFor作为字符串_
                为ByRef saArray()作为字符串_
                可选BYVAL lMethod作为VbCompareMethod = vbTextCompare,_
                可选BYVAL lNotFound只要= -1)只要

    昏暗lLow只要
    昏暗LMID只要
    昏暗lHigh只要
    昏暗LCOMP只要

        对错误转到ErrorExit

        '假设我们没有发现它
        BinarySearchString = lNotFound

        获取起始位置
        lLow = LBOUND(saArray)
        lHigh = UBound函数(saArray)

        做
            查找数组的中点
            LMID =(lLow + lHigh)\ 2

            中点元素进行比较,以被搜索的字符串的
            LCOMP = STRCOMP(saArray(LMID),sLookFor,lMethod)

            如果LCOMP = 0则
                我们发现它,所以返回的位置和退出
                BinarySearchString = LMID
                退出待办事项
            elseif的LCOMP = 1,则
                中点产品比我们大 - 扔掉的上半部分
                lHigh = LMID  -  1
            其他
                中点产品比我们小 - 扔掉下半区
                lLow = LMID + 1
            结束如果

            继续,直到我们的三分球越过
        循环直到lLow> lHigh

    ErrorExit:

    端功能
 

编辑:看来,我应该首先做了一些野蛮的力量测试。通过简单地遍历数组以线性方式约翰·科尔曼提出的过滤功能执行,返回时间相同的情况下上述0毫秒。请看下图:

 子Test3的()

昏暗我长,strMyArray()作为字符串,lngSize长,strTest作为字符串
昏暗lngResultBrute长,TimeBruteSearch只要

    lngSize = 2000000
    strTest = CStr的(936740 * 97)
    REDIM strMyArray(lngSize)

    对于i = 1到UBound函数(strMyArray)
        如果我mod2 = 0然后
            strMyArray(I)= CStr的((I  -  1)* 97)
        其他
            strMyArray(ⅰ)= CStr的(第(i + 1)* 97)
        结束如果
    接下来我

    开始时间=计时器

    蛮力搜索
    对于i = 1到UBound函数(strMyArray)
        如果strMyArray(ⅰ)= strTest然后
            lngResultBrute = CLng函数(strTest)
            退出对于
        结束如果
    接下来我

    结束时间=计时器

    TimeBruteSearch =结束时间 - 开始时间
    MSGBOX TimeBruteSearch

结束小组
 

解决方案

过滤器并使用线性搜索 - 它只是执行它减轻快,因为它是在高度实施优化的C / C ++ code。看到这一点,运行以下code:

 功能RandString(正长)作为字符串
    返回一个随机字符串B-Z
    昏暗我只要
    昏暗的参考译文字符串
    对于i = 1到n
        S = S&放大器; CHR(66 + INT(25 *的Rnd()))
    接下来我
    RandString = S
端功能

子测试()
    昏暗的时间(1〜20)作为双
    昏暗我长,N当
    暗淡了()作为字符串
    昏暗的开端,双
    昏暗的参考译文字符串
    随机
    S = RandString(99)
    REDIM A(1 200万)
    对于i = 1 200万
        A(I)= S + RandString(1)
    接下来我
    S = S&放大器; 一个
    对于i = 20至1步骤-1
        N = I * 100000
        REDIM preserve A(1到N)
        启动=计时器
        Debug.Print UBound函数(过滤器(A,S))应为-1
        次(I)=定时器 - 启动
    接下来我
    对于i = 1到20
        细胞(ⅰ,1)= I
        细胞(1,2)=倍(ⅰ)
    接下来我
结束小组
 

此code创建的2000000随机串长度100,其每一个的不同之处的最后一个位置目标字符串的数组。然后将其送至子阵列,其尺寸分别为10的倍数为过滤器,定时所花费的时间。输出看起来是这样的:

输入图像的描述在这里

清晰的线性趋势并不完全证明,但强有力的证据表明VBA的过滤器正在执行一个简单的线性搜索。

I can't figure out how the Filter function works so fast. I have used Filter on all sorts of data and regardless of data-type, Filter obliterates any alternative method I employ. I regularly use the Binary search algorithm and the QuickArraySort algorithm written by Stephen Bullen (found in Professional Excel Development). The Binary Search is lightning fast (as fast as the Filter function, given that the array is sorted) and the Quick Sort algorithm is one of the fastest sorting algorithms known.

I have written some test code below comparing speeds of finding a random element in a very large array (size = 2,000,000). I intentionally populate the array in an un-ordered fashion (it should be noted that I have tried various un-ordered assigning methods, and the results are similar regardless of assigning method).

Sub SearchTest()

Dim i As Long, strMyArray() As String, lngSize As Long, strTest As String
Dim TimeBinarySearch As Long, TimeFilterSearch As Long
Dim lngResultBinary As Long, lngResultFilter As Long

Dim StartHour As Long, StartMinute As Long, StartSecond As Long
Dim StartMiliSecond As Long, StartTime As Long

Dim EndHour As Long, EndMinute  As Long, EndSecond As Long
Dim EndMiliSecond As Long, EndTime As Long

    lngSize = 2000000

    strTest = CStr(1735674 * 987)

    ReDim strMyArray(lngSize)

    For i = 1 To UBound(strMyArray)
        If i Mod 2 = 0 Then
            strMyArray(i) = CStr((i - 1) * 987)
        Else
            strMyArray(i) = CStr((i + 1) * 987)
        End If
    Next i

''Filter Test
'*******************************************************************
    StartHour = Hour(Now()) * 60 * 60 * 1000
    StartMinute = Minute(Now()) * 60 * 1000
    StartSecond = Second(Now()) * 1000
    StartMiliSecond = Format(Now(), "ms")

    StartTime = StartHour + StartMinute + StartSecond + StartMiliSecond

    lngResultFilter = CLng(Filter(strMyArray, strTest)(0))

    EndHour = Hour(Now()) * 60 * 60 * 1000
    EndMinute = Minute(Now()) * 60 * 1000
    EndSecond = Second(Now()) * 1000
    EndMiliSecond = Format(Now(), "ms")

    EndTime = EndHour + EndMinute + EndSecond + EndMiliSecond

    TimeFilterSearch = EndTime - StartTime
'*******************************************************************

''Binary Test
'*******************************************************************
    StartHour = Hour(Now()) * 60 * 60 * 1000
    StartMinute = Minute(Now()) * 60 * 1000
    StartSecond = Second(Now()) * 1000
    StartMiliSecond = Format(Now(), "ms")

    StartTime = StartHour + StartMinute + StartSecond + StartMiliSecond

    QuickSortString1D strMyArray

    lngResultBinary = strMyArray(CLng(BinarySearchString(strTest, strMyArray)))

    EndHour = Hour(Now()) * 60 * 60 * 1000
    EndMinute = Minute(Now()) * 60 * 1000
    EndSecond = Second(Now()) * 1000
    EndMiliSecond = Format(Now(), "ms")

    EndTime = EndHour + EndMinute + EndSecond + EndMiliSecond

    TimeBinarySearch = EndTime - StartTime
'*******************************************************************

    MsgBox lngResultFilter & vbCr & vbCr & lngResultBinary 

    MsgBox TimeFilterSearch & vbCr & vbCr & TimeBinarySearch

End Sub

Both methods return the same result, however the Filter method's return time is 0 ms and the QuickSort/BinarySearch method's return time is nearly 20 seconds. That is a huge difference!! As mentioned earlier, if the array is sorted the binary search method returns 0 ms as well (As most know, binary search requires that the array is sorted to begin with)

So, how can the Filter function look through 2,000,000 un-sorted entries and find the correct result immediately? It can't simply loop through every entry and compare it with the filtervalue (this is by far the slowest method), but based off of these preliminary test, it can't be utilizing a binary search either, as it would have to sort the array first. Even if there was an awesome sorting algorithm that was already compiled, I find it hard to believe that it could sort an array of size greater than a million instantaneously.

By the way, below is the QuickSort algorithm and the Binary Search algorithm.

    Sub QuickSortString1D(ByRef saArray() As String, _
                Optional ByVal bSortAscending As Boolean = True, _
                Optional ByVal lLow1 As Variant, _
                Optional ByVal lHigh1 As Variant)

    'Dimension variables
    Dim lLow2 As Long
    Dim lHigh2 As Long
    Dim sKey As String
    Dim sSwap As String

        On Error GoTo ErrorExit
        'If not provided, sort the entire array
        If IsMissing(lLow1) Then lLow1 = LBound(saArray)
        If IsMissing(lHigh1) Then lHigh1 = UBound(saArray)

        'Set new extremes to old extremes
        lLow2 = lLow1
        lHigh2 = lHigh1

        'Get value of array item in middle of new extremes
        sKey = saArray((lLow1 + lHigh1) \ 2)

        'Loop for all the items in the array between the extremes
        Do While lLow2 < lHigh2

            If bSortAscending Then
                'Find the first item that is greater than the mid-point item
                Do While saArray(lLow2) < sKey And lLow2 < lHigh1
                    lLow2 = lLow2 + 1
                Loop

                'Find the last item that is less than the mid-point item
                Do While saArray(lHigh2) > sKey And lHigh2 > lLow1
                    lHigh2 = lHigh2 - 1
                Loop
            Else
                'Find the first item that is less than the mid-point item
                Do While saArray(lLow2) > sKey And lLow2 < lHigh1
                    lLow2 = lLow2 + 1
                Loop

                'Find the last item that is greater than the mid-point item
                Do While saArray(lHigh2) < sKey And lHigh2 > lLow1
                    lHigh2 = lHigh2 - 1
                Loop

            End If

            'If the two items are in the wrong order, swap the rows
            If lLow2 < lHigh2 Then
                sSwap = saArray(lLow2)
                saArray(lLow2) = saArray(lHigh2)
                saArray(lHigh2) = sSwap
            End If

            'If the pointers are not together, advance to the next item
            If lLow2 <= lHigh2 Then
                lLow2 = lLow2 + 1
                lHigh2 = lHigh2 - 1
            End If
        Loop

        'Recurse to sort the lower half of the extremes
        If lHigh2 > lLow1 Then
            QuickSortString1D saArray, bSortAscending, lLow1, lHigh2
        End If

        'Recurse to sort the upper half of the extremes
        If lLow2 < lHigh1 Then
            QuickSortString1D saArray, bSortAscending, lLow2, lHigh1
        End If

    ErrorExit:

    End Sub

    '***********************************************************
    ' Comments: Uses a binary search algorithm to quickly locate
    ' a string within a sorted array of strings
    '
    ' Arguments: sLookFor The string to search for in the array
    ' saArray An array of strings, sorted ascending
    ' lMethod Either vbBinaryCompare or vbTextCompare
    ' Defaults to vbTextCompare
    ' lNotFound The value to return if the text isn’t
    ' found. Defaults to -1
    '
    ' Returns: Long The located position in the array,
    ' or lNotFound if not found
    '
    ' Date Developer Action
    ' ———————————————————————————————-
    ' 02 Jun 04 Stephen Bullen Created
    '
    Function BinarySearchString(ByRef sLookFor As String, _
                ByRef saArray() As String, _
                Optional ByVal lMethod As VbCompareMethod = vbTextCompare, _
                Optional ByVal lNotFound As Long = -1) As Long

    Dim lLow As Long
    Dim lMid As Long
    Dim lHigh As Long
    Dim lComp As Long

        On Error GoTo ErrorExit

        'Assume we didn’t find it
        BinarySearchString = lNotFound

        'Get the starting positions
        lLow = LBound(saArray)
        lHigh = UBound(saArray)

        Do
            'Find the midpoint of the array
            lMid = (lLow + lHigh) \ 2

            'Compare the mid-point element to the string being searched for
            lComp = StrComp(saArray(lMid), sLookFor, lMethod)

            If lComp = 0 Then
                'We found it, so return the location and quit
                BinarySearchString = lMid
                Exit Do
            ElseIf lComp = 1 Then
                'The midpoint item is bigger than us - throw away the top half
                lHigh = lMid - 1
            Else
                'The midpoint item is smaller than us - throw away the bottom half
                lLow = lMid + 1
            End If

            'Continue until our pointers cross
        Loop Until lLow > lHigh

    ErrorExit:

    End Function

Edit: It seems I should have done some "brute" force tests first. By simply looping through the array in a linear fashion as John Coleman suggests the Filter function performs, the return time for the same scenario above is 0 ms. See below:

Sub Test3()

Dim i As Long, strMyArray() As String, lngSize As Long, strTest As String
Dim lngResultBrute As Long, TimeBruteSearch As Long

    lngSize = 2000000
    strTest = CStr(936740 * 97)
    ReDim strMyArray(lngSize)

    For i = 1 To UBound(strMyArray)
        If i Mod 2 = 0 Then
            strMyArray(i) = CStr((i - 1) * 97)
        Else
            strMyArray(i) = CStr((i + 1) * 97)
        End If
    Next i

    StartTime = Timer

    ' Brute force search
    For i = 1 To UBound(strMyArray)
        If strMyArray(i) = strTest Then
            lngResultBrute = CLng(strTest)
            Exit For
        End If
    Next i

    EndTime = Timer

    TimeBruteSearch = EndTime - StartTime
    MsgBox TimeBruteSearch

End Sub

解决方案

Filter does use a linear search -- it just executes it lightening quick because it is implemented in highly optimized C/C++ code. To see this, run the following code:

Function RandString(n As Long) As String
    'returns a random string in B-Z
    Dim i As Long
    Dim s As String
    For i = 1 To n
        s = s & Chr(66 + Int(25 * Rnd()))
    Next i
    RandString = s
End Function

Sub test()
    Dim times(1 To 20) As Double
    Dim i As Long, n As Long
    Dim A() As String
    Dim start As Double
    Dim s As String
    Randomize
    s = RandString(99)
    ReDim A(1 To 2000000)
    For i = 1 To 2000000
        A(i) = s + RandString(1)
    Next i
    s = s & "A"
    For i = 20 To 1 Step -1
        n = i * 100000
        ReDim Preserve A(1 To n)
        start = Timer
        Debug.Print UBound(Filter(A, s)) 'should be -1
        times(i) = Timer - start
    Next i
    For i = 1 To 20
        Cells(i, 1) = i
        Cells(i, 2) = times(i)
    Next i
End Sub

This code creates an array of 2,000,000 random strings of length 100, each of which differs from the target string in the last position. Then it feeds subarrays whose sizes are multiples of 100,000 into Filter, timing the time it takes. The output looks like this:

The clear linear trend doesn't exactly prove but is strong evidence that VBA's Filter is executing a straightforward linear search.

这篇关于使用VBA过滤功能时,性能注意事项的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆