优化复制过滤数据 [英] Optimize copying filtered data

查看:98
本文介绍了优化复制过滤数据的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一张约有10万行和40列的表.

I have a table with around 100k rows and 40 columns.

我需要将一些行复制到另一个基于工作簿的工作簿中,该工作簿具有与列值匹配的字符串.

I need to copy some of the rows to another workbook based an array with strings that match column values.

cond_list = ["value1", "value2", "value3" ...]

此条件可以匹配5k行或更多.

This condition can match 5k rows or more.

我尝试了一个简单的解决方案,以使用自动筛选"并复制可见的单元格:

I tried a simple solution to use AutoFilter and copy visible cells:

' Filter source data
src_wks.ListObjects("Table1").Range.AutoFilter _
  Field:=src_wks.ListObjects("Table1").ListColumns("Column1").Index, _
  Criteria1:=cond_list, Operator:=xlFilterValues
        
' Copy and paste
src_wks.UsedRange.SpecialCells(xlCellTypeVisible).Copy
dst_wks.Range("A1").PasteSpecial Paste:=xlPasteValues

过滤需要花费一秒钟的时间,但是此行的执行要花费10分钟以上.我必须像运行20次这样的代码,所以它是不可接受的.

Filtering takes a fraction of a second, but then execution of this line takes more than 10 minutes. I have to run this code like 20 times so it is unacceptable.

src_wks.UsedRange.SpecialCells(xlCellTypeVisible).Copy

我试图按照以下注释修改代码: https://stackoverflow.com/a/22789329/7214068

I tried to modify the code following this comment: https://stackoverflow.com/a/22789329/7214068

我尝试先复制整个数据,然后删除隐藏的行:

I tried to copy whole data first and then remove hidden rows:

' Copy and Paste whole table
dst_wks.UsedRange.Offset(1, 0).Value = ""
addr = src_wks.UsedRange.Address
dst_wks.Range(addr).Value = src_wks.UsedRange.Value

' Filter data
dst_wks.ListObjects("Table1").Range.AutoFilter _
  Field:=dst_wks.ListObjects("Table1").ListColumns("Column1").Index, _
  Criteria1:=cond_list, Operator:=xlFilterValues

' Remove rest
Application.DisplayAlerts = False ' Suppress "delete row?" promt
Dim i, numRows As Long
numRows = dst_wks.UsedRange.Rows.Count
For i = numRows To 1 Step -1
    If (dst_wks.Range("A" & i).EntireRow.Hidden = True) Then
        dst_wks.Range("A" & i).Delete
    End If
Next i
Application.DisplayAlerts = True

复制整个数据不到两秒钟.但随后它再次挂起循环,并花费了超过10分钟的时间.

Copying whole data takes less than two seconds. But then it again hangs on for loop and takes more than 10 minutes.

推荐答案

另一种方法(执行此操作有多种方法)可以是使用SQL语句从所讨论的工作表中查询数据,然后将其复制到一张新纸.如果选择数据的条件变得更加复杂,则可能更可取.

An alternate approach (there are several ways to do this) could be to use a SQL statement to query the data from the sheet in question, then copy it to a new sheet. This might be preferable if the conditions for selecting data become more complex.

我在Sheet1上设置了这样的数据:

I have my data setup like this on Sheet1:

代码

Option Explicit
Private Const adCmdText As Long = 1
Private Const adStateOpen As Long = 1

Public Sub DisplayView(Conditions As String)
    Dim dbField       As Variant
    Dim fieldCounter  As Long
    Dim dbConnection  As Object
    Dim dbRecordset   As Object
    Dim dbCommand     As Object
    Dim OutputSheet   As Excel.Worksheet

    Set dbConnection = CreateObject("ADODB.Connection")
    Set dbRecordset = CreateObject("ADODB.Recordset")
    Set dbCommand = CreateObject("ADODB.Command")

    Set OutputSheet = ThisWorkbook.Worksheets("Sheet2")

    'Do a quick check to determine the correct connection string
    'if one of these don't work, have a look here --> https://www.connectionstrings.com/excel/
    If Left$(ThisWorkbook.FullName, 4) = "xlsm" Then
        dbConnection.connectionstring = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" & _
        ThisWorkbook.FullName & ";Extended Properties='Excel 12.0 Macro;HDR=YES';"
    Else
        dbConnection.connectionstring = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" & _
        ThisWorkbook.FullName & ";Extended Properties='Excel 12.0;HDR=YES';"
    End If

    'Open the connection and query
    dbConnection.Open
    With dbCommand
        .ActiveConnection = dbConnection
        .CommandType = adCmdText
        .CommandText = "Select * from [Sheet1$] where Column1 in (" & Conditions & ")" 'Update Sheet where applicable
        Set dbRecordset = .Execute
    End With

    'Clear the Output Sheet
    OutputSheet.Cells.Clear

    'Add Headers to output
    For Each dbField In dbRecordset.Fields
        fieldCounter = fieldCounter + 1
        OutputSheet.Cells(1, fieldCounter).Value2 = dbField.Name
    Next

    'Dump the found records
    OutputSheet.Range("A2").CopyFromRecordset dbRecordset
    If dbConnection.State = adStateOpen Then dbConnection.Close
End Sub

'Run from here
Public Sub ExampleRunner()
    Dim t As Double
    t = Timer
    DisplayView "'value1','value2','value3'" 'Send it a quoted csv of values you are looking for
    Debug.Print "Getting data took: " & Timer - t & " seconds"
End Sub

这在我的计算机上花费了大约4-5秒的时间才能从100,000个数据集的总大小中撤回几千条记录.

This is taking about 4-5 seconds on my machine to pull back a few thousand records from a total data set size of 100,000.

这篇关于优化复制过滤数据的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆