需要更好的优化代码? [英] Need a better optimized code?

查看:121
本文介绍了需要更好的优化代码?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

需要一个很优化的代码。我有一个项目,我已经成功地使其与vba一起工作(大多数由stackoverflow程序员帮助谢谢)
但今天我得到了一个反馈。它在记录中删除2个更多的唯一条目但我不知道为什么它删除他们。



我已经应用了算法

我使用了我在Google上找到的COUNTIF函数

  =countif(A $ 1:A2,A3)= 0A3是活动单元格,检查A2,A1为副本

如果有复制在A列和True如果它是一个唯一的。我已经了解了Countif是
它检查所有上述列值从该单元格我的意思是让我们拿A4。所以它检查A2,A1,A3的副本。类似地,A10检查A1到A9,并抛出TRue或False.Well它正在工作但我不知道发生了什么错误代码对某些条目不起作用。它甚至有时为Unique条目显示False。



当我有更多的数据量时,它花费更多的时间来应用这些公式。我试图使它更清洁和更优化的方式。人们告诉我,它不是ac或一些其他语言,使其优化,但我需要的代码,使我的代码更优化



我需要这些条款的代码,任何人都可以帮助我,因为我的countif失败了。这样做很无奈。



1)我有一列,我应该检查重复在该列中删除该行,如果它是重复的



2)我有35000个老条目列,我有新的条目2000每周这些附加。我需要从总计37000中检查这2000个条目(正如我们所提到的那样,我们得到35000 + 2000),这些删除操作只需要在新添加的2000个条目上执行,但它应该检查整个列的重复项



让我清楚地解释一下,我添加了2000个条目,所以只有这些条目要从35000个条目和本身(2000个条目)中检查重复项,并删除它是一个重复的,不能在35000条目旧数据上执行重复操作。



我找到了一些代码,但是即使删除了35000个条目的副本。我已经设置了范围,但即使它不工作。
任何人都可以帮助我最好的代码,花费更少的时间?请谢谢你



使用示例代码更新我的问题我有

  ABFGHIY 
PTY 39868.5 4 2 540 3 PTY39868.5425403
GTY 34446.1234 2 1 230 1 GTY34446.1234212301
PTY 3945.678 2 2 PTY3945.67822
GTY 34446.1234 2 1 230 1 GTY34446.1234212301
让我们说这些是旧35000条目

上述示例的解释。



以上是35000个条目。我必须检查A,B,F,G,H,I列的副本,如果它们是相同的,我必须删除行,我不应该打扰其他列c,d等等,所以我做的是我使用一个未使用的列Y,并使用这些

  = A2& B2& F2& G2& H2和I2与相应列

现在检查Y列是否存在复制,并删除整行。因为2003只支持我所知的一列。



请注意,即使35000个条目可能有重复的内容,但我不应该删除它们。例如,您可以看到我的示例代码中的第二行和第二行是重复的,但是我不应该删除
,因为它是旧数据。

 code> ABFGHIY 
PTY 39868.5 4 2 540 3 PTY39868.5425403'old
GTY 34446.1234 2 1 230 1 GTY34446.1234212301'old
PTY 3945.678 2 2 PTY3945.67822'old
GTY 34446.1234 2 1 230 1 GTY34446.1234212301'旧
PTY 3945.678 1 1 230 2 PTY3945.678112302'新
PTY 39868.5 4 2 540 3 PTY39868.5425403'新
PTY 3945.678 1 1 230 2 PTY3945.678112302'new

现在注意,新条目PTY(从上一个2)原始记录的副本(PTY起初)所以我hava t o删除它。最后一个新条目与新条目本身重复,所以我应该删除它。在上面的代码中,我只能删除最后2行,这是原始记录的重复,也是从它删除。但是不应该删除那个笨蛋的GTY,而是在原始记录中。



我想我现在就清楚了一下。将它们连接成一个单元格。是更好的方法吗?作为conactenatin为40000条目只需要2秒我认为这并不重要,但任何更多的算法对这些是非常aprreciated



我听到counif对待45.00和45.00000不同的是那个权利可能是这个问题吗?因为我的数据中有小数点。我想我应该做

  = I2& H2& G2& F2& A2& B2 

哪个更好连接?这个还是我之前发过的?

解决方案

现在我们有更多的信息是一个解决方案。它应该几乎立即执行。



代码通过使用连接公式填充列y来实现。然后将列y中的所有列添加到字典中,并使用字典将每行作为列z中的重复。然后,它删除第35000行后的所有重复项。最后,它清除列y和列z以删除冗余数据。

  Sub RemoveDuplicates()
Dim vData As Variant,vArray As Variant
Dim lRow As Long

'//获取列A(不包括标题)和偏移量的使用范围列y
使用ActiveSheet.Range(A2,Cells(Rows.Count,A)。End(xlUp))。Offset(,24)
'//将连接公式添加到表格列(y)
.FormulaR1C1 == RC [-24]& RC [-23]& RC [-19]& RC [-18]& RC [-17]& RC [-16]
'//将公式结果添加到数组
vData = .Resize(,1).value
结束

'//将数组重新定义为正确的大小
ReDim vArray(1到UBound(vData,1),0)

'//使用后期绑定创建一个字典对象
使用CreateObject (Scripting.Dictionary)
'//循环遍历每行在数组
对于lRow = 1到UBound(vData,1)
'//检查数组中是否存在
如果不是.exists(vData(lRow,1))Then
'//值不存在不重复的标记。
vArray(lRow,0)=x
'//将值添加到字典
.Add vData(lRow,1),Nothing
结束如果
下一步lRow
结束

'//关闭屏幕更新以加快代码并防止屏幕闪烁
Application.ScreenUpdating = False

With ActiveSheet
'//使用数组
.Range(Z2)填充列z。调整大小(UBound(vArray,1))= vArray
'//使用错误处理作为speciallcell引发错误当没有存在。
On Error Resume Next
'//删除列z
.Range中的所有空白单元格(Y35001,.Cells(Rows.Count,Y)。End(xlUp)) .Offset(,1).SpecialCells(xlCellTypeBlanks).EntireRow.Delete
'//删除错误处理
错误GoTo 0
'//清除列y和z
。列(25).Resize(,2).ClearContents
结束

'//重新打开屏幕更新。
Application.ScreenUpdating = True
End Sub

注意:您可以更改所有



注意:它假定您有标题,并且仅剩下第1行。



我尽可能地使用了你的列和测试数据。这是我使用的测试填充:

  Sub TestFill()

对于i = 1到37000
With Range(A& i)
.value = Choose(Int(2 * Rnd + 1),PTY,GTY)
.Offset(,1) .value = Round((40000 *(Rnd + 1)),Choose(Int(4 * Rnd + 1),1,2,3,4))
.Offset(,5).value = Int 4 * Rnd + 1)
.Offset(,6).value = Int(2 * Rnd + 1)
.Offset(,7).value = Choose(Int(2 * Rnd + 1) ,230,540)
.Offset(,8).value = Int(3 * Rnd + 1)
结束
下一个i

End Sub


Need a much Optimized code.Well I Got a Project and I have Succefully made it work with the vba (Mostly helped by the stackoverflow programmers Thanks for that) But Today I got a Feedback. Its deleting 2 more unique entries in the record But I dont know why its deleting Them.

The Algorithm I have applied

I have Used the COUNTIF function Which I found on google

    ="countif(A$1:A2,A3)=0" A3 is the active cell, Checks A2,A1 for dupes

It Throws False if there is a duplicate in The A column and True If it is a unique.What I have understood about Countif is that It checks all the above columns values from that cell I mean let us take A4. SO it checks A2,A1,A3 for the duplicate. Similarly A10 checks for A1 to A9 and throws either TRue or False.Well It was working But I dont know what went wrong The code is not working for some entries.Its even showing False for the Unique entries sometimes.

And its taking more time to applye these formula as I have more amount of data. Im trying to make it cleaner and more Optimizing Way.People told me its not a c or some other Language to make it optimize but Im need of code that makes my code more optimized

I need code for these condtions can anyone help me as my countif failed.Im little helpless in doing so.

1)I have a column and I should check for duplicates in that column and delete that row if it is a duplicate

2) I have 35000 Old entries in the column and I have new entries 2000 everyweek these are appended. I need to check these 2000 entries from the total 37000 ( as we appened we get 35000+2000) and these delete operation need to be performed only on the newly appended 2000 entries but it should check the duplicates for entire column

Let me explain you clearly I have 2000 entries newly added,so Only these entries are to be checked for the duplicates from the 35000 entries and also from itself (2000 entries) and delete it if it is a duplicate and no duplicating operation should be performed on the 35000 entries old data.

I have found some codes but they are deleting even the duplicates of the 35000 entries. I have set the range but even though its not working. Can anyone help me with the best code that takes less time?please thank you

Updating my question with the sample code I have

   A       B            F       G        H       I              Y          
  PTY   39868.5         4       2       540      3      PTY39868.5425403 
  GTY   34446.1234      2       1       230      1      GTY34446.1234212301
  PTY   3945.678                2                2       PTY3945.67822
  GTY   34446.1234      2       1       230      1      GTY34446.1234212301
                  let us say these are old 35000 entries

Explaination to the above example.

The above are the 35000 entries. I have to check A,B,F,G,H,I columns for the dupes, if they are same I have to delete the row, I should not bother about the other columns c,d etc. so what I did is I have used one unused column Y and concatenated these 6 columns values into 1 at Y column using these

  = A2 & B2 & F2 & G2 & H2 &I2 with the respective columns

Now checking the Y column for dupes and delete the entire row. as 2003 supports only for one column as far to my knowledge.

Notice that even the 35000 entries may have duplicates in it but I should not delete them. Example you can see the 2 and last row in my example code are dupes but I should not delete as it is the old data.

   A       B            F       G        H       I              Y          
  PTY   39868.5         4       2       540      3      PTY39868.5425403     'old 
  GTY   34446.1234      2       1       230      1      GTY34446.1234212301   'old
  PTY   3945.678                2                2       PTY3945.67822        'old
  GTY   34446.1234      2       1       230      1      GTY34446.1234212301    'old
  PTY    3945.678       1       1       230      2      PTY3945.678112302      'new
  PTY    39868.5        4       2       540      3      PTY39868.5425403       'new 
  PTY    3945.678       1       1       230      2      PTY3945.678112302      'new

Now note that New entry PTY (from last 2nd) is a duplicate of the original record(PTY at first) So I hava to delete it.And the last new entry is a duplicate of the new entry itself so I should delete it even that . SO in the above code I have to delete only the last 2 rows which are dupes of original record and also from it . But should not delete the GTY which is the dupe but which is in orginal record.

I think I gave a clear view now. Is concatenating them into one cell . Is it better way to approach? as conactenatin for 40000 entries taking just 2 seconds i think that doesnt matter but any more algorithms to these is much aprreciated

I heard counif treats 45.00 and 45.00000 as different is that right may be that was the problem with it? since I have decimal points in my data. I think I should do

    = I2 & H2 & G2 & F2 & A2 & B2

which is better to concatenate? is this or the other i posted before?

解决方案

Okay so now we have some more info here is a solution. It should execute almost instantly.

The code works by filling column y with your concatenate formula. It then adds all of column y to a dictionary and using the dictionary marks each row as a duplicate in column z. It then removes all the duplicates found after row 35000. Then finally it clears both column y and column z to remove the redundant data.

Sub RemoveDuplicates()
    Dim vData As Variant, vArray As Variant
    Dim lRow As Long

    '// Get used range of column A (excluding header) and offset to get column y 
    With ActiveSheet.Range("A2", Cells(Rows.Count, "A").End(xlUp)).Offset(, 24)
        '// Adds the concatenate formula to the sheet column (y)
        .FormulaR1C1 = "=RC[-24]&RC[-23]&RC[-19]&RC[-18]&RC[-17]&RC[-16]"
        '// Adds the formula results to an array
        vData = .Resize(, 1).value
    End With

    '// Re dimension the array to the correct size 
    ReDim vArray(1 To UBound(vData, 1), 0)

    '// Create a dictionary object using late binding
    With CreateObject("Scripting.Dictionary")
        '// Loop through each row in the array
        For lRow = 1 To UBound(vData, 1)
            '// Check if value exists in the array
            If Not .exists(vData(lRow, 1)) Then
                '// Value does not exist mark as non duplicate.
                vArray(lRow, 0) = "x"
                '//  Add value to dictionary
                .Add vData(lRow, 1), Nothing
            End If
        Next lRow
    End With

    '// Turn off screen updating to speed up code and prevent screen flicker
    Application.ScreenUpdating = False    

    With ActiveSheet
        '// Populate column z with the array
        .Range("Z2").Resize(UBound(vArray, 1)) = vArray
        '// Use error handling as speciallcells throws an error when none exist.
        On Error Resume Next
        '// Delete all blank cells in column z
        .Range("Y35001", .Cells(Rows.Count, "Y").End(xlUp)).Offset(, 1).SpecialCells(xlCellTypeBlanks).EntireRow.Delete
        '// Remove error handling
        On Error GoTo 0
        '// Clear columns y and z
        .Columns(25).Resize(, 2).ClearContents
    End With

   '// Turn screen updating back on.
   Application.ScreenUpdating = True
End Sub

NOTE: you can change all references "activesheet" to your sheet codename if you want.

NOTE2: it assumes you have headers and has left row 1 alone.

I have used your columns and test data as best I can. Here is the test fill I used:

Sub TestFill()

    For i = 1 To 37000
        With Range("A" & i)
            .value = Choose(Int(2 * Rnd + 1), "PTY", "GTY")
            .Offset(, 1).value = Round((40000 * (Rnd + 1)), Choose(Int(4 * Rnd + 1), 1, 2, 3, 4))
            .Offset(, 5).value = Int(4 * Rnd + 1)
            .Offset(, 6).value = Int(2 * Rnd + 1)
            .Offset(, 7).value = Choose(Int(2 * Rnd + 1), "230", "540")
            .Offset(, 8).value = Int(3 * Rnd + 1)
        End With
    Next i

End Sub

这篇关于需要更好的优化代码?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆