一维数组的唯一值,无需迭代 [英] Unique values from 1D-array, without iteration

查看:93
本文介绍了一维数组的唯一值,无需迭代的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

冒着成为主题的风险,我决定分享一些代码,

At risk of being of topic, I decided to share some code, Q&A-style. If the general opinion is such that this would be off-topic I'll be happy to delete if need be.

背景

是否可以从任何1D数组或Range对象转换为1D数组中检索所有唯一值,而不必遍历其元素?就我而言,普遍的共识是必须迭代不同的元素,最好的方法是使用字典或集合来存储唯一值.

Can we retrieve all unique values from any 1D-array, or Range object turned into 1D-array, without having to iterate over its elements? As far as I'm concerned the general consensus is that one has to iterate over the different elements, where the best way to do it would either be a dictionary or collection to store unique values in. Here is what I've found works very well for this purpose.

问题

那么如何从一维数组中检索唯一元素,例如:

So how would one go about retrieving unique elements from a 1D-array, for example:

Dim arr As Variant: arr = Array("A", "A", "C", "D", "A", "E", "G")

结果数组将是:

{"A", "C", "D", "E", "G"}

推荐答案

真正需要的所有代码只有几行:

Really all code needed are just a few lines:

Sub test()

Dim arr As Variant: arr = Array("A", "A", "C", "D", "A", "E", "G")
With Application
    uniques = .Index(arr, 1, Filter(.IfError(.Match(.Transpose(.Evaluate("ROW(1:" & UBound(.Match(arr, arr, 0)) & ")")), .Match(arr, arr, 0), 0), "|"), "|", False))
End With

End Sub

上面的代码将返回一维数组,并返回原始数组中的所有唯一元素:

The above will return a 1D-array, returning all unique elements in our original array:

说明:

检索所有这些值的行看起来很密集,所以让我们将其分成几部分:

The line that retrieves all these values looks intense, so let's break it into pieces:

Application.Match能够处理其参数内的数组.所以基本上我们在看:.Match({"A","A","C","D","A","E","G"},{"A","A","C","D","A","E","G"},0).返回的数组将是:{1,1,3,4,1,6,7},这实际上是找到每个值的第一位置.这一结果将成为我们进一步发展的基础.

Application.Match has the ability to work with arrays within it's parameters. So basically we are looking at: .Match({"A","A","C","D","A","E","G"},{"A","A","C","D","A","E","G"},0). The returned array would then be: {1,1,3,4,1,6,7}, and what this really is are the first positions where each value is found at. This result will be the basis on where we build on further.

我们可以在代码中看到第三个.Match,我们需要基本上告诉以下内容:.Match({1,2,3,4,5,6,7},{1,1,3,4,1,6,7},0).第一个参数是上述高亮代码检索的内容.

We can see a third .Match in our code which we need to basically tell the following: .Match({1,2,3,4,5,6,7},{1,1,3,4,1,6,7},0). The first parameter is what is retrieved by the above higlighted code.

.Evaluate("ROW(1:" & UBound(.Match(arr, arr, 0)) & ")")将从1-7返回值的数组,而Application.Transpose则将其返回为一维数组.

Where .Evaluate("ROW(1:" & UBound(.Match(arr, arr, 0)) & ")") will return an array of values from 1-7, the Application.Transpose will return it such that it's a 1D-array.

最后一步将返回一个包含错误的数组,但是由于我们使用的是Application而不是WorksheetFunction,因此代码不会中断.结果数组看起来像{1,Error 2042,3,4,Error 2042,6,7}.现在,最重要的是摆脱Error值.

The last step will return an array holding errors, however the code won't break since we are using Application instead of WorksheetFunction. The resulting array will look like {1,Error 2042,3,4,Error 2042,6,7}. Now the whole point is to get rid of the Error values.

这样做的方法是通过Application.IfError,它将评估数组并将所有错误值更改为给定字符串值.在我们的例子中,我使用了管道符号.由用户决定一个足够独特的符号,该符号将不会出现在原始数组的任何元素中.所以经过评估.我们当前的数组将如下所示:{1,|,3,4,|,6,7}.

The way to do so is through Application.IfError, which will evaluate the array and change all error values into a give string value. In our case I used the pipe symbol. It's up to the user to decide on a symbol unique enough it won't appear in any of the elements in the original array. So after evaluation. Our current array will look like: {1,|,3,4,|,6,7}.

现在,我们检索了一个带有管道符号的数组,我们希望将它们删除!一种快速的方法是使用Filter功能. Filter返回一个包含或不包含符合我们条件的元素的数组(取决于第三个参数中的TRUEFALSE).

Now we retrieved an array with pipe symbols we would want them out! A quick way to do so is with Filter function. Filter returns an array with or without the elements that fit our criteria (depending on the TRUE or FALSE in it's third paramter).

所以基本上我们想返回一个像这样的数组:Filter(<array>, "|", False).生成的一维数组现在看起来像:{1,3,4,6,7}.

So basically we want to return an array like so: Filter(<array>, "|", False). The resulting 1D-array now looks like: {1,3,4,6,7}.

在这一点上,我们已经有了它.我们只需要从原始数组中切出正确的值即可.为此,我们可以使用Application.Index.我们只想告诉.Index我们感兴趣的行.为此,我们可以加载先前找到的1D数组.因此代码如下所示:.Index(arr1, <array>, 1)这将导致一维数组:{"A","C","D","E","G"}

We kind of have it at this point. We just need to slice out the correct values from our original array. To do so we can use Application.Index. We just want to tell .Index which rows we are interested in. And to do so we can load our previously found 1D-array. So the code will look like: .Index(arr1, <array>, 1) which will result in a 1D-array: {"A","C","D","E","G"}

结论:

那里有.一行(不止一个操作)可以从另一个1D数组迭代中检索唯一值的1D数组.此代码已准备好在用arr声明的任何一维数组上使用.

There you have it. One line (with more than just a single operation) to retrieve a 1D-array of unique values from another 1D-array without iteration. This code is ready to be used on any 1D-array declared with arr.

有用吗?我不确定100%,但是我终于达到了我在项目中尝试的目标.生成的数组可立即用于您需要在其中使用唯一值的任何任务.

Usefull? I'm not 100% sure, but I finally reached what I was trying in my project. The resulting array can be used immediately in whichever task you need to use unique values in.

比较:字典与应用程序.方法:

Range(A1:A50000)中的随机项目进行比较,性能确实受到了打击.因此,在1000项步骤中,迭代词典与非迭代Application.Methods方法之间的时间比较.低于1000个项目的结果,每个10000个项目的标记(以秒为单位):

Doing a comparison on random items in the Range(A1:A50000), the performance really takes a hit. Hereby a time-comparison between the iterative Dictionary against the non-iterative Application.Methods approach in 1000 items steps. Below the result of a 1000 items and each 10000 items mark (in seconds):

| Items     | Dictionary    | Methods       |
|-------    |------------   |-------------  |
| 1000      | 0,02          | 0,03          |
| 10000     | 0             | 0,88          |
| 20000     | 0,02          | 3,31          |
| 30000     | 0,02          | 7,3           |
| 40000     | 0,02          | 12,84         |
| 50000     | 0,03          | 20,2          |

使用的Dictionary方法:

Sub Test()

Dim arr As Variant: arr = Application.Transpose(Range("A1:A50000"))
Dim dict As Object: Set dict = CreateObject("Scripting.Dictionary")

Dim x As Long

For x = LBound(arr) To UBound(arr)
    dict(arr(x)) = 1
Next x

Dim uniques As Variant: uniques = dict.Keys

End Sub

结论:与更常见的Dictionary做法相比,此方法最多可处理1000个项目.在更大的范围内,迭代(通过内存)将始终胜过方法方法!

Conclusion: Up to a 1000 items this method would be about equal in processing time compared to a more common Dictionary practice. On anything larger, iteration (through memory) will always beat the methods approach!

我敢肯定,@ ScottCraner的显示的新动态数组功能会大大限制处理时间.

I'm sure processing time would be more limited with the new Dynamic Array functions as shown by @ScottCraner.

这篇关于一维数组的唯一值,无需迭代的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆