Remove non-numeric characters from numeric fields without loop

Question

I have read some topics explaining how to do this, but the approach they describe would be incredibly slow. The explanation is here: https://www.extendoffice.com/documents/excel/651-excel-remove-non-numeric-characters.html

It involves iterating through each cell in a range, then iterating through the characters in each cell and removing any that do not match [0-9].

Any suggestions for doing this more efficiently?

One idea that comes to mind is loading the cell contents into an array, iterating through it, and splitting each entry into its own array of characters to iterate through.

Recommended answer

For the VBA side of things (note the loops), I decided to satisfy my own curiosity about the performance of a few different methods. All of them pull the range into an array and work on it in place. Any of them will be far faster than the code in the linked article, simply because of the overhead of reading and writing individual cell values.

For the first method, I optimized the code from the linked article "a bit":

Private Sub MidMethod(values() As Variant)
    Dim r As Long, c As Long, i As Long
    Dim temp As String, output As String

    For r = LBound(values, 1) To UBound(values, 1)
         For c = LBound(values, 2) To UBound(values, 2)
            output = vbNullString
            For i = 1 To Len(values(r, c))
                temp = Mid$(values(r, c), i, 1)
                If temp Like "[0-9]" Then
                    output = output & temp
                End If
            Next
            values(r, c) = output
         Next
    Next
End Sub

For the second method, I used RegExp.Replace:

Private Sub RegexMethod(values() As Variant)
    Dim r As Long, c As Long

    ' Early binding: requires a project reference to
    ' "Microsoft VBScript Regular Expressions 5.5".
    With New RegExp
        .Pattern = "[^0-9]"
        .MultiLine = True
        .Global = True
        For r = LBound(values, 1) To UBound(values, 1)
             For c = LBound(values, 2) To UBound(values, 2)
                values(r, c) = .Replace(values(r, c), vbNullString)
             Next
        Next
    End With
End Sub
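
The `With New RegExp` form above is early bound, so it only compiles when the project references "Microsoft VBScript Regular Expressions 5.5". If adding that reference is not an option, a late-bound variant along the following lines might work. This is a sketch rather than part of the benchmarked code: the name `RegexMethodLateBound` is hypothetical, and late binding may be somewhat slower than the timings reported for `RegexMethod`.

```vba
' Hypothetical late-bound variant of RegexMethod (not benchmarked in the original answer).
Private Sub RegexMethodLateBound(values() As Variant)
    Dim r As Long, c As Long
    Dim re As Object

    ' CreateObject avoids the need for a project reference.
    Set re = CreateObject("VBScript.RegExp")
    re.Pattern = "[^0-9]"   ' match anything that is not a digit
    re.Global = True        ' replace every match, not just the first

    For r = LBound(values, 1) To UBound(values, 1)
        For c = LBound(values, 2) To UBound(values, 2)
            ' CStr makes the conversion to text explicit for the late-bound call.
            values(r, c) = re.Replace(CStr(values(r, c)), vbNullString)
        Next
    Next
End Sub
```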

Finally, for the last method, I used a Byte array:

Private Sub ByteArrayMethod(values() As Variant)
    Dim r As Long, c As Long, i As Long
    Dim chars() As Byte

    For r = LBound(values, 1) To UBound(values, 1)
         For c = LBound(values, 2) To UBound(values, 2)
            chars = values(r, c)
            values(r, c) = vbNullString
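            ' VBA strings are UTF-16, two bytes per character: step by 2 and test only
            ' the low byte. This assumes the high byte is 0 (characters below U+0100).
            For i = LBound(chars) To UBound(chars) Step 2
                If chars(i) > 47 And chars(i) < 58 Then    ' ASCII "0" to "9"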
                    values(r, c) = values(r, c) & Chr$(chars(i))
                End If
            Next
         Next
    Next
End Sub

Then I used this code to benchmark them against 1000 cells (A1:J100), each containing a random mix of 25 letters and numbers, running each method 5000 times:

Private Sub Benchmark()
    Dim data() As Variant, start As Double, i As Long

    start = Timer
    For i = 1 To 5000
        data = ActiveSheet.Range("A1:J100").Value
        MidMethod data
    Next
    Debug.Print "Mid: " & Timer - start

    start = Timer
    For i = 1 To 5000
        data = ActiveSheet.Range("A1:J100").Value
        RegexMethod data
    Next
    Debug.Print "Regex: " & Timer - start

    start = Timer
    For i = 1 To 5000
        data = ActiveSheet.Range("A1:J100").Value
        ByteArrayMethod data
    Next
    Debug.Print "Byte(): " & Timer - start

End Sub

The results weren't terribly surprising: the Regex method is by far the fastest, although I wouldn't call any of them "fast". Times are in seconds, as reported by Timer:

Mid: 24.3359375 
Regex: 8.31640625 
Byte(): 22.5625

Note that I have no idea how this compares to @SiddharthRout's cool formula method, because I can't run it through my testing harness. The www.extendoffice.com code would also probably still be running, so I didn't test it.
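
The benchmark only reads the range and discards the cleaned array, so for actual use the result still has to be written back to the sheet. A minimal sketch of such a driver follows. The name `CleanRangeInPlace` and the hard-coded range are assumptions for illustration, and the routine is assumed to live in the same module as the methods above, since they are declared `Private`.

```vba
' Hypothetical driver: one read from the sheet, in-memory cleanup, one write back.
Public Sub CleanRangeInPlace()
    Dim target As Range
    Dim data() As Variant

    ' Assumed range; it must span more than one cell so that .Value returns a 2-D array.
    Set target = ActiveSheet.Range("A1:J100")

    data = target.Value     ' single read of all cell values
    RegexMethod data        ' strips non-digits in place; the other methods work the same way
    target.Value = data     ' single write of the cleaned values
End Sub
```

Excel may convert all-digit strings back to numbers when they are written, which can drop leading zeros. Setting `target.NumberFormat = "@"` before the write is one way to keep the results as text.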
