在 VBA 中获取列的所有唯一值的更快方法? [英] Quicker way to get all unique values of a column in VBA?

查看:49
本文介绍了在 VBA 中获取列的所有唯一值的更快方法?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

有没有更快的方法来做到这一点?

Is there a faster way to do this?

Set data = ws.UsedRange

Set unique = CreateObject("Scripting.Dictionary")

On Error Resume Next
For x = 1 To data.Rows.Count
    unique.Add data(x, some_column_number).Value, 1
Next x
On Error GoTo 0

此时 unique.keys 得到了我需要的东西,但是对于具有数万条记录的文件,循环本身似乎非常慢(而这根本不是问题尤其是像 Python 或 C++ 这样的语言).

At this point unique.keys gets what I need, but the loop itself seems to be very slow for files that have tens of thousands of records (whereas this wouldn't be a problem at all in a language like Python or C++ especially).

推荐答案

加载数组中的值会更快:

Loading the values in an array would be much faster:

Dim data(), dict As Object, r As Long
Set dict = CreateObject("Scripting.Dictionary")

data = ActiveSheet.UsedRange.Columns(1).Value

For r = 1 To UBound(data)
    dict(data(r, some_column_number)) = Empty
Next

data = WorksheetFunction.Transpose(dict.keys())

您还应该考虑对 Scripting.Dictionary 进行早期绑定:

You should also consider early binding for the Scripting.Dictionary:

Dim dict As New Scripting.Dictionary  ' requires `Microsoft Scripting Runtime` '

请注意,使用字典比 Range.AdvancedFilter 在大型数据集上.

Note that using a dictionary is way faster than Range.AdvancedFilter on large data sets.

作为奖励,这里有一个类似于 Range.RemoveDuplicates 从二维数组中删除重复项:

As a bonus, here's a procedure similare to Range.RemoveDuplicates to remove duplicates from a 2D array:

Public Sub RemoveDuplicates(data, ParamArray columns())
    Dim ret(), indexes(), ids(), r As Long, c As Long
    Dim dict As New Scripting.Dictionary  ' requires `Microsoft Scripting Runtime` '

    If VarType(data) And vbArray Then Else Err.Raise 5, , "Argument data is not an array"

    ReDim ids(LBound(columns) To UBound(columns))

    For r = LBound(data) To UBound(data)         ' each row '
        For c = LBound(columns) To UBound(columns)   ' each column '
            ids(c) = data(r, columns(c))                ' build id for the row
        Next
        dict(Join$(ids, ChrW(-1))) = r  ' associate the row index to the id '
    Next

    indexes = dict.Items()
    ReDim ret(LBound(data) To LBound(data) + dict.Count - 1, LBound(data, 2) To UBound(data, 2))

    For c = LBound(ret, 2) To UBound(ret, 2)  ' each column '
        For r = LBound(ret) To UBound(ret)      ' each row / unique id '
            ret(r, c) = data(indexes(r - 1), c)   ' copy the value at index '
        Next
    Next

    data = ret
End Sub

这篇关于在 VBA 中获取列的所有唯一值的更快方法?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆